---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: t5-small-finetuned-xsum
    results: []
---

t5-small-finetuned-xsum

This model is a fine-tuned version of t5-small on an unspecified dataset (the dataset name was not recorded when this card was generated). It achieves the following results on the evaluation set (a short usage sketch follows the list):

  • Loss: 1.2337
  • Rouge1: 53.8111
  • Rouge2: 48.12
  • Rougel: 53.2346
  • Rougelsum: 53.7215
  • Gen Len: 14.8824
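The checkpoint can be loaded like any other seq2seq Transformers model. The sketch below is a minimal example, not code from this repository; the repository id "BlueBeagle/t5-small-finetuned-xsum" and the "summarize: " task prefix are assumptions based on the model name and common t5-small conventions.

```python
# Minimal usage sketch; the repo id below is an assumption, not confirmed by this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "BlueBeagle/t5-small-finetuned-xsum"  # assumed checkpoint location
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are usually prompted with a task prefix such as "summarize: ".
text = "summarize: " + "Your input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# max_new_tokens roughly matches the ~15-token generations reported above.
summary_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```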

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent training-arguments configuration is sketched after the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 64
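For readers who want to reproduce a comparable run, the sketch below maps the listed hyperparameters onto Seq2SeqTrainingArguments. It is an assumed reconstruction, not the script used for this model: the output_dir name is illustrative, and dataset loading, preprocessing, and the Trainer itself are omitted.

```python
# Sketch of training arguments matching the hyperparameters listed above (assumed, not the original script).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-xsum",  # illustrative name
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    num_train_epochs=64,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",   # the results table reports one evaluation per epoch
    predict_with_generate=True,    # needed to compute ROUGE and Gen Len during eval
)
```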

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|---------|
| No log | 1.0 | 2 | 2.3643 | 7.1906 | 2.2624 | 6.694 | 6.615 | 16.5 |
| No log | 2.0 | 4 | 2.3231 | 6.7807 | 2.2624 | 6.3002 | 6.2663 | 16.3824 |
| No log | 3.0 | 6 | 2.2494 | 13.1859 | 9.2308 | 12.8335 | 12.7109 | 16.2647 |
| No log | 4.0 | 8 | 2.1885 | 12.7975 | 9.2308 | 12.8564 | 12.7446 | 16.1765 |
| No log | 5.0 | 10 | 2.1366 | 16.066 | 11.7324 | 15.8326 | 15.8873 | 16.1176 |
| No log | 6.0 | 12 | 2.0779 | 16.066 | 11.7324 | 15.8326 | 15.8873 | 16.1471 |
| No log | 7.0 | 14 | 2.0315 | 16.3696 | 12.1611 | 16.2055 | 16.2766 | 16.3235 |
| No log | 8.0 | 16 | 1.9897 | 16.3696 | 12.1611 | 16.2055 | 16.2766 | 16.1765 |
| No log | 9.0 | 18 | 1.9518 | 16.3696 | 12.1611 | 16.2055 | 16.2766 | 16.2353 |
| No log | 10.0 | 20 | 1.9328 | 16.3696 | 12.1611 | 16.2055 | 16.2766 | 16.2353 |
| No log | 11.0 | 22 | 1.8989 | 16.6256 | 12.4154 | 16.4518 | 16.3985 | 16.1176 |
| No log | 12.0 | 24 | 1.8665 | 17.121 | 12.4154 | 16.7837 | 16.8028 | 16.1176 |
| No log | 13.0 | 26 | 1.8327 | 19.2998 | 14.1 | 19.1944 | 19.058 | 15.8529 |
| No log | 14.0 | 28 | 1.7979 | 21.842 | 16.546 | 21.7303 | 21.6127 | 15.7647 |
| No log | 15.0 | 30 | 1.7690 | 21.9626 | 16.546 | 21.7948 | 21.7027 | 15.7647 |
| No log | 16.0 | 32 | 1.7449 | 21.9626 | 16.546 | 21.7948 | 21.7027 | 15.7647 |
| No log | 17.0 | 34 | 1.7193 | 22.3102 | 16.546 | 22.0997 | 22.0562 | 15.7647 |
| No log | 18.0 | 36 | 1.6982 | 24.848 | 19.412 | 24.7592 | 24.7737 | 15.6765 |
| No log | 19.0 | 38 | 1.6780 | 24.848 | 19.412 | 24.7592 | 24.7737 | 15.6176 |
| No log | 20.0 | 40 | 1.6577 | 27.0829 | 21.4796 | 27.1757 | 26.989 | 15.6765 |
| No log | 21.0 | 42 | 1.6375 | 27.1459 | 21.4796 | 27.2292 | 27.0625 | 15.6471 |
| No log | 22.0 | 44 | 1.6175 | 27.1459 | 21.4796 | 27.2292 | 27.0625 | 15.3235 |
| No log | 23.0 | 46 | 1.5991 | 27.1459 | 21.4796 | 27.2292 | 27.0625 | 15.3235 |
| No log | 24.0 | 48 | 1.5823 | 29.4051 | 23.5934 | 29.8009 | 29.3756 | 16.4412 |
| No log | 25.0 | 50 | 1.5651 | 29.4051 | 23.5934 | 29.8009 | 29.3756 | 16.4412 |
| No log | 26.0 | 52 | 1.5493 | 29.4051 | 23.5934 | 29.8009 | 29.3756 | 16.4412 |
| No log | 27.0 | 54 | 1.5326 | 29.7971 | 23.5934 | 30.2055 | 29.6775 | 16.4118 |
| No log | 28.0 | 56 | 1.5166 | 29.7971 | 23.5934 | 30.2055 | 29.6775 | 16.4118 |
| No log | 29.0 | 58 | 1.5012 | 29.2529 | 23.024 | 29.3423 | 28.8761 | 16.2353 |
| No log | 30.0 | 60 | 1.4866 | 29.3066 | 23.024 | 29.3532 | 28.9278 | 16.1765 |
| No log | 31.0 | 62 | 1.4715 | 34.2571 | 28.0288 | 34.5418 | 34.2628 | 15.6471 |
| No log | 32.0 | 64 | 1.4554 | 34.6223 | 28.0288 | 34.7957 | 34.6712 | 15.6176 |
| No log | 33.0 | 66 | 1.4396 | 34.1994 | 28.0288 | 34.3535 | 34.1686 | 15.5 |
| No log | 34.0 | 68 | 1.4239 | 34.8089 | 28.6815 | 34.9115 | 34.9247 | 15.4706 |
| No log | 35.0 | 70 | 1.4105 | 36.6837 | 30.6181 | 36.6982 | 36.5407 | 15.2647 |
| No log | 36.0 | 72 | 1.4044 | 36.6837 | 30.6181 | 36.6982 | 36.5407 | 15.2647 |
| No log | 37.0 | 74 | 1.3915 | 41.2554 | 35.9199 | 41.138 | 41.3591 | 15.1765 |
| No log | 38.0 | 76 | 1.3779 | 41.2554 | 35.9199 | 41.1075 | 41.3158 | 15.2059 |
| No log | 39.0 | 78 | 1.3646 | 41.2877 | 35.9199 | 41.12 | 41.3654 | 14.9412 |
| No log | 40.0 | 80 | 1.3518 | 41.3288 | 35.9199 | 41.2095 | 41.4213 | 14.9118 |
| No log | 41.0 | 82 | 1.3399 | 41.6286 | 36.3194 | 41.6244 | 41.9893 | 14.8824 |
| No log | 42.0 | 84 | 1.3282 | 41.6286 | 36.3194 | 41.6244 | 41.9893 | 14.8824 |
| No log | 43.0 | 86 | 1.3168 | 42.802 | 38.3443 | 43.026 | 43.1784 | 14.6176 |
| No log | 44.0 | 88 | 1.3062 | 42.802 | 38.3443 | 43.026 | 43.1784 | 14.8529 |
| No log | 45.0 | 90 | 1.2971 | 44.3698 | 39.6695 | 44.3145 | 44.7449 | 14.6471 |
| No log | 46.0 | 92 | 1.2884 | 47.2079 | 41.807 | 46.9035 | 47.806 | 15.0294 |
| No log | 47.0 | 94 | 1.2814 | 48.006 | 42.6861 | 47.6281 | 48.5116 | 14.8529 |
| No log | 48.0 | 96 | 1.2753 | 48.6509 | 42.6861 | 48.2044 | 48.9901 | 15.0 |
| No log | 49.0 | 98 | 1.2693 | 48.6509 | 42.6861 | 48.2044 | 48.9901 | 15.0 |
| No log | 50.0 | 100 | 1.2643 | 50.6717 | 45.4929 | 50.5974 | 50.8919 | 14.8529 |
| No log | 51.0 | 102 | 1.2603 | 51.3096 | 45.4929 | 51.1016 | 51.4281 | 15.0 |
| No log | 52.0 | 104 | 1.2566 | 51.3096 | 45.4929 | 51.1016 | 51.4281 | 15.0588 |
| No log | 53.0 | 106 | 1.2534 | 52.4682 | 47.1163 | 52.2336 | 52.3783 | 14.9706 |
| No log | 54.0 | 108 | 1.2498 | 52.4682 | 47.1163 | 52.2336 | 52.3783 | 14.9706 |
| No log | 55.0 | 110 | 1.2465 | 52.4682 | 47.1163 | 52.2336 | 52.3783 | 14.9706 |
| No log | 56.0 | 112 | 1.2440 | 53.5647 | 48.4619 | 53.5119 | 53.4541 | 14.7353 |
| No log | 57.0 | 114 | 1.2417 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 58.0 | 116 | 1.2398 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 59.0 | 118 | 1.2383 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 60.0 | 120 | 1.2368 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 61.0 | 122 | 1.2360 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 62.0 | 124 | 1.2350 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 63.0 | 126 | 1.2343 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
| No log | 64.0 | 128 | 1.2337 | 53.8111 | 48.12 | 53.2346 | 53.7215 | 14.8824 |
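ROUGE and Gen Len columns like those above are typically produced by a compute_metrics callback passed to a Seq2SeqTrainer. The exact code used for this run is not recorded in the card, so the sketch below is an assumed reconstruction following the common evaluate-based pattern; the t5-small tokenizer is assumed.

```python
# Assumed reconstruction of a compute_metrics callback for these columns.
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # assumed tokenizer
rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Labels use -100 for ignored positions; swap in the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # ROUGE scores come back in [0, 1]; the table reports them scaled to 100.
    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    result = {key: value * 100 for key, value in result.items()}

    # "Gen Len" is the mean generated length in non-padding tokens.
    prediction_lens = [
        np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions
    ]
    result["gen_len"] = np.mean(prediction_lens)
    return {k: round(float(v), 4) for k, v in result.items()}
```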

Framework versions

  • Transformers 4.32.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3