XLM-R and EXLMR Model

Model Overview

XLM-R (Cross-lingual Language Model - RoBERTa) is a multilingual masked language model pretrained on text in 100 languages. EXLMR (Extended XLM-RoBERTa) extends it to improve performance on low-resource languages spoken in Ethiopia, including Amharic, Tigrinya, and Afaan Oromo.

Model Details

  • Base Model: XLM-R
  • Extended Version: EXLMR
  • Languages Supported: Amharic, Tigrinya, Afaan Oromo, and more
  • Training Data: Trained on a large multilingual corpus

Usage

EXLMR addresses tokenization issues in XLM-R, such as out-of-vocabulary (OOV) tokens and over-tokenization, which are especially pronounced for low-resource languages. You can use the model with the transformers library for various NLP tasks; fine-tuning on a task-specific dataset adapts it to that task and improves its performance.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Define the model checkpoint
checkpoint = "Hailay/EXLMR"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# A classification head that is not stored in the checkpoint is randomly
# initialized, so fine-tune the model before relying on its predictions.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

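For a classification task, the head usually has to be sized to the label set when the model is loaded. The following is a minimal sketch, assuming a hypothetical three-class sentiment task; the label names and the helpers `build_label_maps` and `load_classifier` are illustrative and not part of the EXLMR release.

```python
def build_label_maps(labels):
    """Return (id2label, label2id) dictionaries for a list of class names."""
    id2label = dict(enumerate(labels))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_classifier(checkpoint, labels):
    """Load the tokenizer and a classification model sized for `labels`."""
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),  # size of the classification head
        id2label=id2label,       # stored in the model config
        label2id=label2id,
    )
    return tokenizer, model


# Hypothetical label set, for illustration only.
LABELS = ["negative", "neutral", "positive"]
id2label, label2id = build_label_maps(LABELS)
```

Calling `load_classifier("Hailay/EXLMR", LABELS)` would then return a tokenizer and a model ready for fine-tuning on data labeled with these three classes.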

EXLMR is designed to support underrepresented languages, particularly those spoken in Ethiopia such as Amharic, Tigrinya, and Afaan Oromo. Like XLM-RoBERTa, it can be fine-tuned on several languages at once, which makes it suitable for cross-lingual tasks such as machine translation, multilingual text classification, and question answering.

EXLMR-base follows the same architecture as RoBERTa-base: 12 layers, a hidden size of 768, and 12 attention heads. XLM-R base has approximately 270M parameters; EXLMR's larger vocabulary enlarges the embedding matrix, and the published checkpoint has 301M parameters.
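
The roughly 270M figure corresponds to the original XLM-R vocabulary, and a quick estimate shows how the parameter count moves with vocabulary size. The sketch below counts the weights of a RoBERTa-style encoder using the vocabulary sizes listed in this card. It assumes the remaining settings match the `xlm-roberta-base` configuration (feed-forward size 3072, 514 position embeddings, one token type, a pooler layer); whether EXLMR keeps all of these is an assumption, and `encoder_param_count` is a hypothetical helper.

```python
def encoder_param_count(vocab_size, hidden=768, layers=12, intermediate=3072,
                        max_positions=514, type_vocab=1, with_pooler=True):
    """Estimate the parameter count of a RoBERTa-style encoder."""
    layer_norm = 2 * hidden  # scale and bias
    # Word, position, and token-type embeddings plus one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (weights and biases).
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers of the feed-forward block.
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden)
    per_layer = attention + feed_forward + 2 * layer_norm
    pooler = hidden * hidden + hidden if with_pooler else 0
    return embeddings + layers * per_layer + pooler


xlmr = encoder_param_count(250_002)   # XLM-R vocabulary size
exlmr = encoder_param_count(280_147)  # EXLMR vocabulary size
print(f"XLM-R base: {xlmr:,}")        # XLM-R base: 278,043,648
print(f"EXLMR base: {exlmr:,}")       # EXLMR base: 301,195,008
print(f"Difference: {exlmr - xlmr:,}")  # Difference: 23,151,360
```

Under these assumptions the 30,145 added tokens account for about 23M extra embedding weights. The number of attention heads does not enter the count, because the heads split the same projection matrices.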

|Model|Vocabulary Size|
|---|---|
|XLM-RoBERTa|250002|
|EXLMR|280147|
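
A larger vocabulary is meant to reduce over-tokenization, and one common way to check this is to measure fertility, the average number of subword tokens per word. The sketch below is one possible way to compare the two tokenizers, not the evaluation used by the EXLMR authors; `fertility` and `compare_fertility` are hypothetical helpers, and `compare_fertility` assumes both checkpoints can be downloaded from the Hub.

```python
def fertility(tokenize, text):
    """Average number of subword tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    return len(tokenize(text)) / len(words)


def compare_fertility(text, checkpoints=("xlm-roberta-base", "Hailay/EXLMR")):
    """Download each tokenizer and report its fertility on `text`."""
    # Imported lazily; loading the tokenizers needs network access.
    from transformers import AutoTokenizer

    return {
        name: fertility(AutoTokenizer.from_pretrained(name).tokenize, text)
        for name in checkpoints
    }
```

Passing a sentence in Amharic, Tigrinya, or Afaan Oromo to `compare_fertility` returns one value per checkpoint; a value closer to 1.0 means words are split into fewer pieces.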

Model size: 301M parameters (Safetensors, tensor type F32)