metadata
license: apache-2.0
language:
  - ar
pipeline_tag: text-generation
tags:
  - arabic
  - text-generation
widget:
  - text: أعلنت وزارة الحج في المملكة العربية السعودية
    example_title: مثال ١
  - text: يبدو اليوم جميلا، سأقوم بتحضير
    example_title: مثال ٢
  - text: إن التقنيات الحديثة
    example_title: مثال ٣

ArabianGPT Model Overview

Disclaimer for the Use of Large Language Models (LLMs) for Text Generation

We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.1B, and users engage with and apply the model's outputs at their own risk.

Important Note: We currently offer only a raw pre-trained model. Our team is working on releasing instruction-tuned LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been released for community exploration. Models fine-tuned for specific tasks, such as summarization and sentiment analysis, exist but are still in development.

Introduction

ArabianGPT-0.1B, developed under the ArabianLLM initiatives, is a GPT-2 model built for Arabic language modeling. It was developed at the Robotics and Internet of Things Lab at Prince Sultan University, with a focus on improving natural language modeling and generation in Arabic. The model is designed to address the linguistic complexities and nuances of the Arabic language.

Key Features

  • Architecture: GPT-2
  • Model Size: 134 million parameters
  • Layers: 12
  • Model Attention Layers (MAL): 12
  • Context Window Size: 768 tokens
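
The listed parameter count can be sanity-checked against the other features. The sketch below counts the weights of a standard GPT-2 stack. The hidden size of 768 and the vocabulary size of 64,000 are assumptions (the first borrowed from GPT-2 small, the second read off the "Aranizer 64K" tokenizer name); neither is stated in this card.

```python
def gpt2_param_count(vocab_size, n_ctx, d_model, n_layer):
    """Count parameters of a standard GPT-2 stack with tied input/output embeddings."""
    # Token embedding table plus learned position embedding table
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Fused QKV projection and attention output projection (weights + biases)
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Two-layer feed-forward block with a 4x expansion (weights + biases)
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a bias vector
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + mlp + layer_norms
    final_norm = 2 * d_model
    return embeddings + n_layer * per_layer + final_norm

# Assumed values: d_model=768 and vocab_size=64_000 are not given in this card.
total = gpt2_param_count(vocab_size=64_000, n_ctx=768, d_model=768, n_layer=12)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the count comes to about 134.8 million, in line with the 134 million listed above. The number of attention heads does not change the total, since the heads only partition the same projection matrices.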

Training

  • Dataset: Scraped Arabic newspaper articles
  • Data Size: 15.5 GB
  • Words: 237.8 million
  • Tokenizer: Aranizer 64K
  • Tokens: Over 1.75 billion
  • Hardware: 2 NVIDIA A100 GPUs
  • Training Scale: 7.5 million examples
  • Training Duration: 3 days
  • Performance: Final loss of 3.97
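
A couple of derived figures can help put these numbers in context. The sketch below is rough arithmetic only, and it rests on two assumptions not stated in this card: that the final loss is the mean cross-entropy in nats per token, and that each training example holds at most one full context window of 768 tokens.

```python
import math

final_loss = 3.97        # "Performance" entry above
n_examples = 7_500_000   # "Training Scale" entry above
context_window = 768     # tokens, from Key Features
corpus_tokens = 1.75e9   # "Over 1.75 billion", so this is a lower bound

# Assumption: the loss is mean cross-entropy in nats per token,
# in which case perplexity is simply exp(loss).
perplexity = math.exp(final_loss)

# Assumption: each example holds at most one full context window,
# so this is an upper bound on the tokens processed during training.
max_tokens_seen = n_examples * context_window
max_passes = max_tokens_seen / corpus_tokens

print(f"perplexity ~ {perplexity:.1f}")
print(f"tokens seen <= {max_tokens_seen / 1e9:.2f}B, at most ~{max_passes:.1f} passes over the corpus")
```

If the assumptions hold, the final loss corresponds to a perplexity of about 53, and training covered the corpus a little over three times at most.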

Role in ArabianLLM Initiatives

ArabianGPT-0.1B is the base model of the ArabianLLM initiatives. It is intended to advance Arabic language processing by addressing challenges unique to Arabic morphology and dialects.

Usage

The model is suitable for Arabic text generation tasks. Example usage with the Transformers pipeline:

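from transformers import pipeline

# Load the pre-trained model into a text-generation pipeline
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-01B", max_new_tokens=512)

# Arabic prompt for the model to continue (taken from the widget examples)
text = 'إن التقنيات الحديثة'
result = pipe(text)
print(result[0]["generated_text"])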
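
Because the context window is 768 tokens, a request for 512 new tokens leaves room for a prompt of at most 256 tokens. The helper below is a hypothetical sketch (the name `clamp_new_tokens` is illustrative and not part of Transformers) that trims the generation budget for longer prompts. The prompt length is assumed to come from the model's tokenizer.

```python
CONTEXT_WINDOW = 768  # tokens, from Key Features

def clamp_new_tokens(prompt_tokens, requested=512, context_window=CONTEXT_WINDOW):
    """Return how many new tokens fit after a prompt of the given token length."""
    if prompt_tokens >= context_window:
        raise ValueError("prompt already fills the context window")
    # Never ask for more tokens than the remaining room in the window
    return min(requested, context_window - prompt_tokens)

print(clamp_new_tokens(10))   # short prompt: the full request fits
print(clamp_new_tokens(600))  # long prompt: the budget shrinks
```

The returned value can be passed as `max_new_tokens` when calling the pipeline.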

Limitations and Ethical Considerations

  • The model may misinterpret context or generate low-quality text in some scenarios.
  • Users should apply the model ethically and avoid using it to spread misinformation or harmful content.

Acknowledgments

Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.

Contact Information

For inquiries: riotu@psu.edu.sa.
