metadata
tags:
  - generated_from_trainer
model-index:
  - name: t5-base-p-l-akk-en-20240922-080244
    results: []

t5-base-p-l-akk-en-20240922-080244

This model was trained from scratch on an unspecified dataset (the auto-generated card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 0.8507

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3.152142797506865e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
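
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes no warmup (the card lists none) and about 2196 optimizer steps per epoch, a figure inferred from the results table (2500 steps correspond to epoch 1.1384), not stated by the author. The function name linear_lr and the constants are illustrative only.

```python
# Sketch of a linear learning-rate decay, under the assumptions stated above.
PEAK_LR = 3.152142797506865e-05  # learning_rate from the card
STEPS_PER_EPOCH = 2196           # assumption: inferred from 2500 steps ~ 1.1384 epochs
NUM_EPOCHS = 100                 # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate at a given optimizer step.

    Ramps up linearly during warmup (if any), then decays linearly to zero.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 32500, 110000, 217500):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at the configured peak and reaches zero at step 219600, so the last logged checkpoint (step 217500) was trained at a rate close to zero.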

Training results

Training Loss Epoch Step Validation Loss
0.9444 1.1384 2500 0.8638
0.8431 2.2769 5000 0.8085
0.7912 3.4153 7500 0.7750
0.7434 4.5537 10000 0.7531
0.7171 5.6922 12500 0.7395
0.692 6.8306 15000 0.7278
0.6596 7.9690 17500 0.7165
0.6155 9.1075 20000 0.7231
0.61 10.2459 22500 0.7129
0.5886 11.3843 25000 0.7068
0.5718 12.5228 27500 0.7084
0.5519 13.6612 30000 0.7029
0.5412 14.7996 32500 0.7007
0.5241 15.9381 35000 0.7017
0.5026 17.0765 37500 0.7134
0.4733 18.2149 40000 0.7038
0.489 19.3534 42500 0.7067
0.4666 20.4918 45000 0.7083
0.4494 21.6302 47500 0.7061
0.4545 22.7687 50000 0.7092
0.4357 23.9071 52500 0.7116
0.4332 25.0455 55000 0.7189
0.4152 26.1840 57500 0.7207
0.3995 27.3224 60000 0.7196
0.3976 28.4608 62500 0.7184
0.3879 29.5993 65000 0.7210
0.3812 30.7377 67500 0.7243
0.3749 31.8761 70000 0.7241
0.3663 33.0146 72500 0.7320
0.3612 34.1530 75000 0.7344
0.3469 35.2914 77500 0.7377
0.3407 36.4299 80000 0.7388
0.3309 37.5683 82500 0.7411
0.3354 38.7067 85000 0.7354
0.3252 39.8452 87500 0.7407
0.3167 40.9836 90000 0.7435
0.3182 42.1220 92500 0.7502
0.2994 43.2605 95000 0.7547
0.3064 44.3989 97500 0.7561
0.2923 45.5373 100000 0.7529
0.2848 46.6758 102500 0.7593
0.2843 47.8142 105000 0.7600
0.279 48.9526 107500 0.7650
0.2781 50.0911 110000 0.7706
0.2629 51.2295 112500 0.7730
0.2639 52.3679 115000 0.7726
0.2624 53.5064 117500 0.7791
0.2547 54.6448 120000 0.7776
0.2567 55.7832 122500 0.7747
0.2484 56.9217 125000 0.7792
0.2454 58.0601 127500 0.7893
0.2398 59.1985 130000 0.7864
0.2313 60.3370 132500 0.7973
0.2362 61.4754 135000 0.7964
0.2359 62.6138 137500 0.7962
0.226 63.7523 140000 0.8009
0.2271 64.8907 142500 0.8027
0.2249 66.0291 145000 0.8014
0.2212 67.1676 147500 0.8077
0.2129 68.3060 150000 0.8088
0.2131 69.4444 152500 0.8108
0.2106 70.5829 155000 0.8144
0.2078 71.7213 157500 0.8163
0.2103 72.8597 160000 0.8148
0.2025 73.9982 162500 0.8215
0.2023 75.1366 165000 0.8250
0.197 76.2750 167500 0.8267
0.1945 77.4135 170000 0.8274
0.1919 78.5519 172500 0.8289
0.187 79.6903 175000 0.8308
0.1948 80.8288 177500 0.8339
0.1857 81.9672 180000 0.8346
0.191 83.1056 182500 0.8380
0.1796 84.2441 185000 0.8387
0.1862 85.3825 187500 0.8414
0.185 86.5209 190000 0.8409
0.1778 87.6594 192500 0.8434
0.1824 88.7978 195000 0.8426
0.1735 89.9362 197500 0.8443
0.1737 91.0747 200000 0.8474
0.1787 92.2131 202500 0.8462
0.1759 93.3515 205000 0.8484
0.1744 94.4900 207500 0.8487
0.1778 95.6284 210000 0.8502
0.1767 96.7668 212500 0.8507
0.175 97.9053 215000 0.8499
0.1723 99.0437 217500 0.8507
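
The table shows validation loss reaching its minimum well before the end of training while training loss keeps falling. The short sketch below locates that minimum from a handful of rows copied from the table. The helper name best_checkpoint is illustrative, and only a subset of rows is included to keep the example brief.

```python
# (step, epoch, training_loss, validation_loss) rows copied from the table above.
ROWS = [
    (2500, 1.1384, 0.9444, 0.8638),
    (17500, 7.9690, 0.6596, 0.7165),
    (30000, 13.6612, 0.5519, 0.7029),
    (32500, 14.7996, 0.5412, 0.7007),
    (35000, 15.9381, 0.5241, 0.7017),
    (37500, 17.0765, 0.5026, 0.7134),
    (110000, 50.0911, 0.2781, 0.7706),
    (217500, 99.0437, 0.1723, 0.8507),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


best = best_checkpoint(ROWS)
final = ROWS[-1]
gap = final[3] - best[3]

if __name__ == "__main__":
    print(f"lowest validation loss: {best[3]} at step {best[0]} (epoch {best[1]})")
    print(f"final validation loss:  {final[3]} at step {final[0]}")
    print(f"difference:             {gap:.4f}")
```

On the full table the lowest validation loss is 0.7007 at step 32500 (epoch 14.7996), compared with 0.8507 at the final logged step. Transformers offers options such as load_best_model_at_end and EarlyStoppingCallback for keeping the best checkpoint; the card does not say whether either was used here.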

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
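
When trying to reproduce the run, it may help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. The package names (transformers, torch, datasets, tokenizers) are the usual PyPI distribution names and are an assumption here, as is the helper name compare_versions.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card; the keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.3.1+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (wanted, installed, matches).

    installed is None when the package is not present in this environment.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily prevent loading the model; the report is only a starting point for diagnosing differences in behavior.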