|
--- |
|
license: apache-2.0 |
|
language: |
|
- ar |
|
pipeline_tag: text-generation |
|
tags: |
|
- arabic |
|
- text-generation |
|
widget: |
|
- text: "أعلنت وزارة الحج في المملكة العربية السعودية" |
|
example_title: "مثال ١" |
|
- text: "يبدو اليوم جميلا، سأقوم بتحضير" |
|
example_title: "مثال ٢" |
|
- text: "إن التقنيات الحديثة" |
|
example_title: "مثال ٣" |
|
--- |
|
# ArabianGPT Model Overview |
|
|
|
## Disclaimer for the Use of Large Language Models (LLMs) for Text Generation |
|
|
|
<p style="color: red;">We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.1B, and users engage with and apply the model's outputs at their own risk.</p> |
|
|
|
> **Important Note:** Currently, we offer only a raw pre-trained model. Our team is actively working on releasing instruction-based LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been made available for community exploration. We also have models fine-tuned for specific tasks such as summarization and sentiment analysis, but they are still in development. |
|
|
|
|
|
## Introduction |
|
ArabianGPT-0.1B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling. |
|
It's a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focusing on enhancing natural language modeling and generation in Arabic. |
|
This model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language. |
|
|
|
## Key Features |
|
- **Architecture**: GPT-2 |
|
- **Model Size**: 134 million parameters |
|
- **Layers**: 12 |
|
- **Model Attention Layers (MAL)**: 12 |
|
- **Context Window Size**: 768 tokens |
|
|
|
## Training |
|
- **Dataset**: Scraped Arabic newspaper articles |
|
- **Data Size**: 15.5 GB |
|
- **Words**: 237.8 million |
|
- **Tokenizer**: Aranizer 64K |
|
- **Tokens**: Over 1.75 billion |
|
- **Hardware**: 2 NVIDIA A100 GPUs |
|
- **Training Scale**: 7.5 million examples |
|
- **Training Duration**: 3 days |
|
- **Performance**: Final loss of 3.97 |
|
|
|
|
|
## Role in ArabianLLM Initiatives |
|
ArabianGPT-0.1B (Base Model) is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects. |
|
|
|
## Usage |
|
The model is suitable for Arabic text generation tasks. Example usage with the Transformers pipeline: |
|
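```python |
|
from transformers import pipeline |
|
# Load the text-generation pipeline for the base model. |
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-01B", max_new_tokens=512) |
|
# Use a non-empty Arabic prompt (taken from the widget examples in this card). |
text = "إن التقنيات الحديثة" |
|
# The pipeline returns a list of dicts; print the generated continuation. |
result = pipe(text) |
print(result[0]["generated_text"]) |
|
``` |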
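
The example requests up to 512 new tokens, while the context window listed under Key Features is 768 tokens. As a minimal sketch, assuming the prompt and the generated tokens share that window (as is usual for GPT-2 style models), the hypothetical helper `clamp_new_tokens` below shows how a generation budget could be capped for a given prompt length. It is not part of the model or of Transformers.

```python
# Context window size listed under Key Features.
CONTEXT_WINDOW = 768


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many new tokens fit after a prompt of the given length.

    Hypothetical helper: assumes prompt and generated tokens share one window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    # Space left in the window once the prompt is accounted for.
    remaining = max(context_window - prompt_tokens, 0)
    return min(requested, remaining)


if __name__ == "__main__":
    # A 300-token prompt leaves room for 468 of the 512 requested tokens.
    print(clamp_new_tokens(prompt_tokens=300, requested=512))
```

The resulting value could then be passed as `max_new_tokens` when calling the pipeline, with the prompt length taken from the tokenizer's output.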
|
|
|
## Limitations and Ethical Considerations |
|
|
|
- The model may misread context or produce low-quality text in some scenarios. |
|
- Use the model ethically: do not use it to spread misinformation or harmful content. |
|
|
|
## Acknowledgments |
|
|
|
Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab. |
|
|
|
## Contact Information |
|
|
|
For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa). |
|
|
|
|
|