---
language: ti
widget:
  - text: ዓቕሚ ደቀንስትዮ [MASK] ብግብሪ ተራእዩ
---

# BERT Base for Tigrinya Language

We pretrained a BERT base-uncased model for Tigrinya on a relatively small dataset (34M tokens) for 40 epochs.

This card contains a PyTorch model exported from the original model, which was trained with Flax on a TPU v3-8.
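
The exported model can be used for masked-token prediction with the standard `transformers` fill-mask pipeline. A minimal sketch, assuming the repository id `fgaim/tibert-base` (taken from this card's path):

```python
from transformers import pipeline

# Load the fill-mask pipeline with the PyTorch checkpoint from this card.
# The repository id is assumed from the card's path.
fill_mask = pipeline("fill-mask", model="fgaim/tibert-base")

# Run the widget example from the metadata above; returns the top
# candidate tokens for the [MASK] position with their scores.
predictions = fill_mask("ዓቕሚ ደቀንስትዮ [MASK] ብግብሪ ተራእዩ")
for p in predictions:
    print(p["token_str"], p["score"])
```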

## Hyperparameters

The hyperparameters for the base model size are as follows:

| Model Size | L  | AH | HS  | FFN  | P    |
|------------|----|----|-----|------|------|
| BASE       | 12 | 12 | 768 | 3072 | 110M |

(L = number of layers; AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters.)
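
These values correspond to the standard BERT-base configuration. A sketch of how they map onto the Hugging Face `BertConfig` fields (parameter names are the library's standard ones, shown here for illustration):

```python
from transformers import BertConfig

# BERT-base configuration matching the table above.
config = BertConfig(
    num_hidden_layers=12,    # L: number of layers
    num_attention_heads=12,  # AH: number of attention heads
    hidden_size=768,         # HS: hidden size
    intermediate_size=3072,  # FFN: feedforward network dimension
)
print(config)  # ~110M parameters (P) with a BERT-base vocabulary
```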