This is a tiny model used for testing language-model training from scratch for Indic languages. It starts with Assamese, since the dataset was small (and was trimmed further to stay within the limits of the Google Colab free tier). The final goal is to repeat this for other Indic languages and to use the BART architecture to extend IndicBART.
The model uses the RoBERTa architecture with a byte-level byte-pair encoding (BPE) tokenizer.
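To illustrate what byte-level BPE does, here is a minimal pure-Python sketch of the idea. This is illustrative only: the real tokenizer for such a model would be trained with Hugging Face's `tokenizers` library, and the toy two-word Assamese corpus and the number of merges below are arbitrary assumptions, not the model's actual training setup.

```python
# Minimal sketch of byte-level BPE, the scheme this tokenizer is based on.
# Toy corpus and merge count are illustrative assumptions only.
from collections import Counter

def most_frequent_pair(seqs):
    """Count adjacent token pairs across all sequences; return the commonest."""
    pairs = Counter()
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(seq, pair, new_id):
    """Replace every occurrence of `pair` in `seq` with the new token id."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

# Byte-level start: every UTF-8 byte is a base token, so no text is ever
# out-of-vocabulary, even for Indic scripts like Assamese.
corpus = ["অসম", "অসমীয়া"]  # toy Assamese corpus (hypothetical)
seqs = [list(t.encode("utf-8")) for t in corpus]

merges = {}
next_id = 256  # ids 0-255 are reserved for the raw bytes themselves
for _ in range(3):  # learn a few merge rules
    pair = most_frequent_pair(seqs)
    if pair is None:
        break
    merges[pair] = next_id
    seqs = [merge_pair(s, pair, next_id) for s in seqs]
    next_id += 1

print(len(merges), "merge rules learned")
```

A trained tokenizer simply applies many such learned merges in order, turning raw UTF-8 bytes into progressively larger subword units.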