---
library_name: transformers
license: apache-2.0
base_model: google-t5/t5-small
tags:
- translation
- generated_from_trainer
metrics:
- bleu
model-index:
- name: t5-small-finetuned-english-to-hausa
  results: []
---

# t5-small-finetuned-english-to-hausa

This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.7088
- Bleu: 71.7187
- Gen Len: 14.3652
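
A minimal usage sketch follows. The hub id is assumed from this repository's name, and the `translate English to Hausa:` task prefix is a guess at the training input format (T5 checkpoints often use one) rather than a documented requirement, so try inputs with and without it:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed hub id; replace with the actual repository path if it differs.
model_id = "Kumshe/t5-small-finetuned-english-to-hausa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Whether this fine-tune expects a T5-style task prefix is an assumption.
text = "translate English to Hausa: Good morning, how are you?"

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```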

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0008
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 3000
- num_epochs: 30
- mixed_precision_training: Native AMP
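
These settings map onto `Seq2SeqTrainingArguments` roughly as sketched below. The original training script is not published, so this is a reconstruction from the list above; `output_dir` and the per-epoch evaluation strategy are assumptions (the Adam betas and epsilon listed are the Trainer defaults):

```python
from transformers import Seq2SeqTrainingArguments

# Reconstructed from the hyperparameters above; treat as a sketch,
# not the author's original configuration.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-english-to-hausa",  # assumed
    learning_rate=8e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=3000,
    num_train_epochs=30,
    fp16=True,                   # "Native AMP" mixed precision
    eval_strategy="epoch",       # assumed: results below are logged per epoch
    predict_with_generate=True,  # needed for the Bleu / Gen Len metrics
)
```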

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 3.1612        | 1.0   | 749   | 1.7523          | 32.7424 | 15.2302 |
| 1.5573        | 2.0   | 1498  | 1.0553          | 53.4401 | 14.5568 |
| 1.0462        | 3.0   | 2247  | 0.7899          | 60.8893 | 14.71   |
| 0.8071        | 4.0   | 2996  | 0.6780          | 64.3438 | 14.4066 |
| 0.6602        | 5.0   | 3745  | 0.6089          | 66.0887 | 14.127  |
| 0.5562        | 6.0   | 4494  | 0.5741          | 66.8902 | 14.1295 |
| 0.4872        | 7.0   | 5243  | 0.5497          | 68.4261 | 14.3395 |
| 0.4299        | 8.0   | 5992  | 0.5412          | 68.9385 | 14.3446 |
| 0.3872        | 9.0   | 6741  | 0.5377          | 69.5675 | 14.2603 |
| 0.3478        | 10.0  | 7490  | 0.5356          | 70.0045 | 14.3615 |
| 0.3147        | 11.0  | 8239  | 0.5312          | 70.1895 | 14.4524 |
| 0.2848        | 12.0  | 8988  | 0.5484          | 70.8151 | 14.366  |
| 0.2584        | 13.0  | 9737  | 0.5523          | 70.6127 | 14.2939 |
| 0.2342        | 14.0  | 10486 | 0.5642          | 70.7368 | 14.3301 |
| 0.2122        | 15.0  | 11235 | 0.5775          | 70.9399 | 14.3635 |
| 0.1928        | 16.0  | 11984 | 0.5935          | 71.2577 | 14.352  |
| 0.1757        | 17.0  | 12733 | 0.5964          | 71.2056 | 14.3929 |
| 0.1608        | 18.0  | 13482 | 0.6085          | 71.0265 | 14.3877 |
| 0.1475        | 19.0  | 14231 | 0.6219          | 71.5491 | 14.3812 |
| 0.1352        | 20.0  | 14980 | 0.6285          | 71.5971 | 14.3675 |
| 0.1237        | 21.0  | 15729 | 0.6468          | 71.4863 | 14.3782 |
| 0.1142        | 22.0  | 16478 | 0.6652          | 71.5849 | 14.3734 |
| 0.1082        | 23.0  | 17227 | 0.6733          | 71.6037 | 14.3298 |
| 0.0998        | 24.0  | 17976 | 0.6852          | 71.6926 | 14.4066 |
| 0.0962        | 25.0  | 18725 | 0.6899          | 71.7003 | 14.358  |
| 0.0915        | 26.0  | 19474 | 0.6994          | 71.6191 | 14.3702 |
| 0.0882        | 27.0  | 20223 | 0.7033          | 71.5731 | 14.3537 |
| 0.0857        | 28.0  | 20972 | 0.7084          | 71.6407 | 14.3618 |
| 0.0853        | 29.0  | 21721 | 0.7086          | 71.7115 | 14.3635 |
| 0.0847        | 30.0  | 22470 | 0.7088          | 71.7187 | 14.3652 |
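
The Bleu and Gen Len columns are consistent with the `sacrebleu`-based `compute_metrics` from the standard Hugging Face translation fine-tuning recipe. The sketch below is an assumption about how they were computed, not the author's code; the hub id is assumed as in the usage example above:

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumed hub id, as in the usage example above.
tokenizer = AutoTokenizer.from_pretrained("Kumshe/t5-small-finetuned-english-to-hausa")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    """BLEU score and mean generated length, per the HF translation recipe."""
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # The Trainer pads labels with -100; swap in pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```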

### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1