gpt2-small-amharic-8k-128-v3

This is a smaller version of the gpt2 decoder-only transformer model, pretrained from scratch for 1.5 days on 290 million tokens of Amharic text.

  • It has 29.5 million parameters.
  • Its context size is 128 tokens.
  • It uses the same type of tokenizer as gpt2, trained from scratch on the same dataset with a vocabulary size of 8192.
  • This is a base model and hasn't undergone any supervised finetuning yet.
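
Because the context size is only 128 tokens, the prompt and the generated text together have to fit inside that window. The following is a minimal sketch of one way to respect that limit when generating locally with the `transformers` text-generation pipeline. It assumes the Hub repository id is `rasyosef/gpt2-small-amharic-8k-128-v3`; the helper names `max_new_tokens_for` and `generate` and the default of 64 requested tokens are illustrative choices, not part of the model or the library.

```python
MODEL_ID = "rasyosef/gpt2-small-amharic-8k-128-v3"  # assumed Hub repository id
CONTEXT_SIZE = 128  # context size stated in this model card


def max_new_tokens_for(prompt_len, requested=64, context_size=CONTEXT_SIZE):
    """Clamp the generation length so prompt + output fits in the context window."""
    room = context_size - prompt_len
    if room <= 0:
        raise ValueError(
            f"prompt uses {prompt_len} tokens; the context holds {context_size}"
        )
    return min(requested, room)


def generate(prompt, requested=64):
    """Hypothetical helper: sample one completion for an Amharic prompt."""
    # Imported lazily so the budget helper above works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    # Count prompt tokens with the model's own tokenizer.
    prompt_len = len(pipe.tokenizer(prompt)["input_ids"])
    out = pipe(
        prompt,
        max_new_tokens=max_new_tokens_for(prompt_len, requested),
        do_sample=True,
    )
    return out[0]["generated_text"]
```

Calling `generate(...)` with an Amharic prompt downloads the weights on first use. Since this is a base model, the output is a continuation of the prompt rather than an answer to an instruction.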

It achieves the following results on the evaluation set:

  • Loss: 3.59
  • Perplexity: 36.23
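
The two numbers are related: perplexity is the exponential of the loss, assuming the loss is the mean per-token cross-entropy measured in nats (the usual convention, but an assumption here). A quick check using the rounded loss reported above:

```python
import math

eval_loss = 3.59  # evaluation loss from this model card, rounded to two decimals

# Perplexity = exp(mean cross-entropy in nats); the natural-log base is assumed.
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")  # 36.23
```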

Demo

You can use the following demo to generate text with gpt2-small-amharic. Enter a prompt and click the Generate button to get completions for it.

https://huggingface.co/spaces/rasyosef/GPT2-Amharic
