---
library_name: transformers
tags:
- NLP
- Machine Translation
- Moroccan Arabic
- Darija
- Modern Standard Arabic
- MSA
- AraT5
pipeline_tag: translation
widget:
- text: "آه، يالاه رجعت من شهر العسل ديالي في شفشاون"
  example_title: "Example 1"
- text: "واش ممكن تعاونني؟ محتاج لمساعدة ديالك"
  example_title: "Example 2"
---

# Model Card for AraT5 - Moroccan Arabic to Modern Standard Arabic Translation

## Model Details

### Model Description

This model card presents a 🤗 transformers model designed for translating Moroccan Arabic (Darija) into Modern Standard Arabic (MSA). The model is fine-tuned from AraT5 base 1024.

- **Developed by:** Said ET-TOUSY
- **Model type:** Fine-tuned language translation model
- **Language(s) (NLP):** Moroccan Arabic (Darija), Modern Standard Arabic (MSA)
- **Finetuned from model:** AraT5 base 1024


### Direct Use

This model is intended to be used directly for translating text from Moroccan Arabic (Darija) to Modern Standard Arabic (MSA). It can be deployed in various applications requiring translation services.

### Downstream Use

The model can also be fine-tuned for specific downstream tasks related to Moroccan Arabic and Modern Standard Arabic. This could include domain-specific translations or integration into larger NLP systems.
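
As a starting point for such fine-tuning, the sketch below shows one way to tokenize parallel Darija/MSA pairs into model inputs and labels. It is an illustration under assumptions, not the procedure used to train this model: the column names `darija` and `msa`, the function name `preprocess_pairs`, and the `max_length` of 128 are all hypothetical.

```python
def preprocess_pairs(batch, tokenizer, max_length=128):
    """Tokenize a batch of source/target sentence pairs for seq2seq training.

    `batch` is a dict of lists. The keys "darija" and "msa" are hypothetical
    column names; rename them to match your dataset.
    """
    return tokenizer(
        batch["darija"],            # source sentences (Moroccan Arabic)
        text_target=batch["msa"],   # target sentences, returned under "labels"
        max_length=max_length,      # assumed cap, tune for your data
        truncation=True,
    )
```

With the 🤗 `datasets` library, a function like this is typically applied with `dataset.map(lambda b: preprocess_pairs(b, tokenizer), batched=True)` before the result is passed to a sequence-to-sequence trainer.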

### Out-of-Scope Use

The model is designed for translation from Moroccan Arabic (Darija) into Modern Standard Arabic (MSA). It may not perform well on other language pairs or on tasks unrelated to translation.

## Bias, Risks, and Limitations

The model's performance may be influenced by biases present in the training data, such as the representation of certain dialectal variations or cultural nuances. Additionally, the model's accuracy may vary depending on the complexity of the text being translated and the presence of out-of-vocabulary words.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. Careful evaluation of translated outputs, especially in sensitive or critical applications, is recommended. Furthermore, continuous monitoring and updating of the model with new data can help mitigate biases and improve performance over time.
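
One lightweight way to support such review is to compare model outputs against reference translations and route the least similar ones to a human. The sketch below uses character-level similarity from Python's standard `difflib` module. This is a rough screening heuristic rather than a translation quality metric, and the function names and the 0.5 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def char_similarity(hypothesis: str, reference: str) -> float:
    """Return a similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, hypothesis, reference).ratio()


def flag_for_review(pairs, threshold=0.5):
    """Collect (hypothesis, reference, score) triples scoring below `threshold`.

    `pairs` is an iterable of (model_output, reference_translation) tuples.
    The default threshold is an arbitrary assumption, not a validated cutoff.
    """
    flagged = []
    for hypothesis, reference in pairs:
        score = char_similarity(hypothesis, reference)
        if score < threshold:
            flagged.append((hypothesis, reference, score))
    return flagged
```

A low score only indicates that an output differs from its reference on the surface. A valid paraphrase can score low, so flagged items still need human judgment.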

## How to Get Started with the Model

To get started with the model, follow the steps below:

1. Install the `transformers` library, along with PyTorch, which the example below relies on for `return_tensors="pt"`.
2. Load the pre-trained AraT5_Darija_to_MSA model fine-tuned for Moroccan Arabic to Modern Standard Arabic translation.
3. Use the model to translate text from Moroccan Arabic to Modern Standard Arabic.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_name = "Saaidtaoussi/AraT5_Darija_to_MSA"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Translate a Darija sentence into MSA
input_text = "آه، يالاه رجعت من شهر العسل ديالي في شفشاون"
inputs = tokenizer(input_text, return_tensors="pt", padding=True)
translated = model.generate(**inputs)
output_text = tokenizer.decode(translated[0], skip_special_tokens=True)
print(output_text)
```
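
To translate several sentences at once, the single-sentence example above can be wrapped in a small batching helper. The following is a minimal sketch, assuming `tokenizer` and `model` have been loaded as shown above. The helper names `chunked` and `translate_batch`, the batch size of 8, and the `max_new_tokens` value of 128 are illustrative assumptions, not settings recommended by the model author.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def translate_batch(texts, tokenizer, model, batch_size=8, max_new_tokens=128):
    """Translate a list of Darija sentences into MSA, batch by batch.

    `batch_size` and `max_new_tokens` are assumed defaults; tune them for
    your hardware and sentence lengths.
    """
    translations = []
    for batch in chunked(texts, batch_size):
        # Pad to the longest sentence in the batch; truncate overlong inputs
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
        translations.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return translations
```

Calling `translate_batch(["واش ممكن تعاونني؟ محتاج لمساعدة ديالك", "آه، يالاه رجعت من شهر العسل ديالي في شفشاون"], tokenizer, model)` returns a list of translations in the same order as the inputs.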