---
language: ti
widget:
- text: "ዓቕሚ ደቀንስትዮ [MASK] ብግብሪ ተራእዩ"
---
# BERT Base for Tigrinya Language
We pretrain a BERT base-uncased model on a relatively small Tigrinya dataset (34M tokens) for 40 epochs.
This card contains a PyTorch model exported from the original model, which was trained on a TPU v3-8 with Flax.
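As a minimal usage sketch, the exported checkpoint can be loaded with the Hugging Face fill-mask pipeline; the model identifier below is a placeholder, so substitute the actual ID of this repository:

```python
from transformers import pipeline

# Load the exported PyTorch checkpoint.
# "user/tibert-base" is a placeholder model ID, not the repository's actual name.
fill_mask = pipeline("fill-mask", model="user/tibert-base")

# Predict the masked token in the widget example above.
for prediction in fill_mask("ዓቕሚ ደቀንስትዮ [MASK] ብግብሪ ተራእዩ"):
    print(prediction["token_str"], prediction["score"])
```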
## Hyperparameters
The hyperparameters of the BASE model size mentioned above are as follows:
| Model Size | L  | AH | HS  | FFN  | P    |
|------------|----|----|-----|------|------|
| BASE       | 12 | 12 | 768 | 3072 | 110M |
(L = number of layers; AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters.)
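To make the table concrete, the sketch below maps it onto a `transformers` `BertConfig`; the card does not state the tokenizer's vocabulary size, so the library default (30522, as in the original BERT) is assumed, which is what yields a parameter count near P = 110M:

```python
from transformers import BertConfig, BertForMaskedLM

# Build a config from the hyperparameter table above.
# vocab_size is left at the transformers default (30522) -- an assumption,
# since the card does not report the Tigrinya tokenizer's vocabulary size.
config = BertConfig(
    num_hidden_layers=12,    # L
    num_attention_heads=12,  # AH
    hidden_size=768,         # HS
    intermediate_size=3072,  # FFN
)

model = BertForMaskedLM(config)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")  # roughly P = 110M
```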