---
license: mit
base_model: unicamp-dl/ptt5-base-t5-vocab
tags:
  - generated_from_trainer
datasets:
  - tiagoblima/preprocessed-du-qg-squadv1_pt
model-index:
  - name: t5_base-qg-aap-test
    results: []
---

# t5_base-qg-aap-test

This model is a fine-tuned version of unicamp-dl/ptt5-base-t5-vocab on the tiagoblima/preprocessed-du-qg-squadv1_pt dataset. It achieves the following results on the evaluation set:

- Loss: 0.0278
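
For quick experimentation, a minimal inference sketch is shown below. It assumes the checkpoint is published as `tiagoblima/t5_base-qg-aap-test` and that, as an answer-aware question-generation model, it takes an answer plus its context paragraph as input; the exact prompt template is not documented on this card and is an assumption.

```python
# Minimal inference sketch. Assumptions: the repository id below is correct
# and the model expects an answer-aware "resposta: ... contexto: ..." prompt;
# neither is confirmed by this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "tiagoblima/t5_base-qg-aap-test"  # assumed Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical answer-aware input: the target answer followed by its context.
text = "resposta: Brasília contexto: Brasília é a capital federal do Brasil."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```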

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
- lr_scheduler_type: linear
- num_epochs: 100
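
For reference, the hyperparameters above map roughly onto the following `Seq2SeqTrainingArguments` (argument names as in Transformers 4.35). This is a reconstruction for illustration, not the original training script, and the `output_dir` is a placeholder.

```python
# Sketch of the listed hyperparameters expressed as TrainingArguments
# (Transformers 4.35). Reconstructed for reference, not the original script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_base-qg-aap-test",   # illustrative output directory
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```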

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 1 | 8.5434 |
| No log | 2.0 | 2 | 7.3013 |
| No log | 3.0 | 3 | 6.1993 |
| No log | 4.0 | 4 | 5.2898 |
| No log | 5.0 | 5 | 4.5226 |
| No log | 6.0 | 6 | 3.9202 |
| No log | 7.0 | 7 | 3.4436 |
| No log | 8.0 | 8 | 3.0408 |
| No log | 9.0 | 9 | 2.7138 |
| No log | 10.0 | 10 | 2.4436 |
| No log | 11.0 | 11 | 2.2130 |
| No log | 12.0 | 12 | 2.0190 |
| No log | 13.0 | 13 | 1.8451 |
| No log | 14.0 | 14 | 1.6746 |
| No log | 15.0 | 15 | 1.5047 |
| No log | 16.0 | 16 | 1.3376 |
| No log | 17.0 | 17 | 1.1800 |
| No log | 18.0 | 18 | 1.0434 |
| No log | 19.0 | 19 | 0.9442 |
| No log | 20.0 | 20 | 0.8739 |
| No log | 21.0 | 21 | 0.8163 |
| No log | 22.0 | 22 | 0.7629 |
| No log | 23.0 | 23 | 0.7118 |
| No log | 24.0 | 24 | 0.6618 |
| No log | 25.0 | 25 | 0.6104 |
| No log | 26.0 | 26 | 0.5597 |
| No log | 27.0 | 27 | 0.5112 |
| No log | 28.0 | 28 | 0.4657 |
| No log | 29.0 | 29 | 0.4243 |
| No log | 30.0 | 30 | 0.3873 |
| No log | 31.0 | 31 | 0.3529 |
| No log | 32.0 | 32 | 0.3209 |
| No log | 33.0 | 33 | 0.2918 |
| No log | 34.0 | 34 | 0.2667 |
| No log | 35.0 | 35 | 0.2436 |
| No log | 36.0 | 36 | 0.2215 |
| No log | 37.0 | 37 | 0.2004 |
| No log | 38.0 | 38 | 0.1808 |
| No log | 39.0 | 39 | 0.1637 |
| No log | 40.0 | 40 | 0.1484 |
| No log | 41.0 | 41 | 0.1357 |
| No log | 42.0 | 42 | 0.1252 |
| No log | 43.0 | 43 | 0.1159 |
| No log | 44.0 | 44 | 0.1079 |
| No log | 45.0 | 45 | 0.0997 |
| No log | 46.0 | 46 | 0.0922 |
| No log | 47.0 | 47 | 0.0858 |
| No log | 48.0 | 48 | 0.0802 |
| No log | 49.0 | 49 | 0.0749 |
| No log | 50.0 | 50 | 0.0703 |
| No log | 51.0 | 51 | 0.0660 |
| No log | 52.0 | 52 | 0.0626 |
| No log | 53.0 | 53 | 0.0596 |
| No log | 54.0 | 54 | 0.0573 |
| No log | 55.0 | 55 | 0.0555 |
| No log | 56.0 | 56 | 0.0539 |
| No log | 57.0 | 57 | 0.0521 |
| No log | 58.0 | 58 | 0.0506 |
| No log | 59.0 | 59 | 0.0496 |
| No log | 60.0 | 60 | 0.0485 |
| No log | 61.0 | 61 | 0.0472 |
| No log | 62.0 | 62 | 0.0460 |
| No log | 63.0 | 63 | 0.0445 |
| No log | 64.0 | 64 | 0.0432 |
| No log | 65.0 | 65 | 0.0421 |
| No log | 66.0 | 66 | 0.0409 |
| No log | 67.0 | 67 | 0.0396 |
| No log | 68.0 | 68 | 0.0385 |
| No log | 69.0 | 69 | 0.0375 |
| No log | 70.0 | 70 | 0.0365 |
| No log | 71.0 | 71 | 0.0358 |
| No log | 72.0 | 72 | 0.0350 |
| No log | 73.0 | 73 | 0.0344 |
| No log | 74.0 | 74 | 0.0338 |
| No log | 75.0 | 75 | 0.0334 |
| No log | 76.0 | 76 | 0.0329 |
| No log | 77.0 | 77 | 0.0326 |
| No log | 78.0 | 78 | 0.0321 |
| No log | 79.0 | 79 | 0.0317 |
| No log | 80.0 | 80 | 0.0314 |
| No log | 81.0 | 81 | 0.0310 |
| No log | 82.0 | 82 | 0.0306 |
| No log | 83.0 | 83 | 0.0303 |
| No log | 84.0 | 84 | 0.0299 |
| No log | 85.0 | 85 | 0.0297 |
| No log | 86.0 | 86 | 0.0294 |
| No log | 87.0 | 87 | 0.0292 |
| No log | 88.0 | 88 | 0.0290 |
| No log | 89.0 | 89 | 0.0288 |
| No log | 90.0 | 90 | 0.0287 |
| No log | 91.0 | 91 | 0.0285 |
| No log | 92.0 | 92 | 0.0284 |
| No log | 93.0 | 93 | 0.0282 |
| No log | 94.0 | 94 | 0.0281 |
| No log | 95.0 | 95 | 0.0281 |
| No log | 96.0 | 96 | 0.0280 |
| No log | 97.0 | 97 | 0.0279 |
| No log | 98.0 | 98 | 0.0279 |
| No log | 99.0 | 99 | 0.0278 |
| 0.8788 | 100.0 | 100 | 0.0278 |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.0
- Datasets 2.15.0
- Tokenizers 0.15.0