---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: t5-small-finetuned-xsum
    results: []
---

t5-small-finetuned-xsum

This model is a fine-tuned version of t5-small on an unspecified dataset (the dataset name was not recorded when the card was generated). It achieves the following results on the evaluation set, with a brief usage sketch after the metrics:

  • Loss: 0.3773
  • Rouge1: 76.5735
  • Rouge2: 74.0611
  • Rougel: 76.9279
  • Rougelsum: 76.7502
  • Gen Len: 12.3684
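
The sketch below shows one way to load and run this checkpoint for summarization with the transformers library. The repository id `BlueBeagle/t5-small-finetuned-xsum` and the `summarize:` task prefix are assumptions, not something documented in this card; substitute a local path or the actual repo id as needed.

```python
# Minimal usage sketch (the repo id below is an assumption; use your own path/id).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "BlueBeagle/t5-small-finetuned-xsum"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 summarization conventionally uses a "summarize: " prefix; whether this
# checkpoint was trained with it is not documented here.
text = "summarize: " + "Your input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```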

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 64
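
A hedged sketch of how these values map onto `Seq2SeqTrainingArguments` is given below. The dataset, preprocessing, and data collator are not documented in this card, so everything beyond the listed hyperparameters is a placeholder assumption rather than the actual training script.

```python
# Sketch of the reported hyperparameters expressed as Seq2SeqTrainingArguments.
# Only the argument values below come from this card; the rest of the training
# pipeline (dataset, tokenization, collator) is not documented here.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-xsum",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    num_train_epochs=64,
    lr_scheduler_type="linear",       # Adam betas/epsilon above are the defaults
    evaluation_strategy="epoch",      # assumed: matches the per-epoch rows below
    predict_with_generate=True,       # assumed: needed for ROUGE / Gen Len columns
)
```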

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| No log | 1.0 | 3 | 2.3350 | 8.9016 | 4.9624 | 8.561 | 8.521 | 17.0789 |
| No log | 2.0 | 6 | 2.2082 | 8.8982 | 4.9624 | 8.5474 | 8.4807 | 16.8158 |
| No log | 3.0 | 9 | 2.0239 | 6.3522 | 2.8822 | 5.9706 | 5.9968 | 16.3947 |
| No log | 4.0 | 12 | 1.9026 | 6.3122 | 2.8822 | 5.7577 | 5.8666 | 16.5526 |
| No log | 5.0 | 15 | 1.7655 | 6.7889 | 2.8822 | 6.229 | 6.3143 | 16.3684 |
| No log | 6.0 | 18 | 1.6270 | 6.2127 | 3.2331 | 5.5808 | 5.6566 | 15.6579 |
| No log | 7.0 | 21 | 1.5019 | 6.2816 | 3.3198 | 5.6661 | 5.7835 | 15.3158 |
| No log | 8.0 | 24 | 1.4107 | 6.2816 | 3.3198 | 5.6661 | 5.7835 | 15.3421 |
| No log | 9.0 | 27 | 1.3240 | 10.8854 | 6.6397 | 10.1852 | 10.2678 | 15.4211 |
| No log | 10.0 | 30 | 1.2590 | 15.747 | 13.6747 | 15.9617 | 16.0821 | 15.1053 |
| No log | 11.0 | 33 | 1.1988 | 18.5812 | 15.8877 | 18.637 | 18.8755 | 15.0789 |
| No log | 12.0 | 36 | 1.1513 | 14.0714 | 11.6088 | 14.2061 | 14.0938 | 14.7632 |
| No log | 13.0 | 39 | 1.1108 | 16.0599 | 13.3917 | 16.225 | 15.9864 | 14.8947 |
| No log | 14.0 | 42 | 1.0766 | 17.8985 | 15.6223 | 18.4261 | 18.2524 | 14.9474 |
| No log | 15.0 | 45 | 1.0448 | 17.8985 | 15.6223 | 18.4261 | 18.2524 | 14.9474 |
| No log | 16.0 | 48 | 1.0137 | 17.8985 | 15.6223 | 18.4261 | 18.2524 | 14.9474 |
| No log | 17.0 | 51 | 0.9843 | 17.8985 | 15.6223 | 18.4261 | 18.2524 | 14.6316 |
| No log | 18.0 | 54 | 0.9598 | 27.5385 | 24.73 | 27.5826 | 27.7639 | 14.4474 |
| No log | 19.0 | 57 | 0.9313 | 28.9525 | 25.8784 | 29.148 | 29.1478 | 14.4474 |
| No log | 20.0 | 60 | 0.9001 | 29.7391 | 26.6691 | 30.1382 | 30.1101 | 14.4737 |
| No log | 21.0 | 63 | 0.8695 | 31.6294 | 28.402 | 32.1917 | 32.0891 | 14.3684 |
| No log | 22.0 | 66 | 0.8406 | 33.9712 | 30.7072 | 34.382 | 34.3829 | 14.3158 |
| No log | 23.0 | 69 | 0.8133 | 36.0319 | 32.7218 | 36.607 | 36.4543 | 14.3421 |
| No log | 24.0 | 72 | 0.7880 | 36.0319 | 32.7218 | 36.607 | 36.4543 | 14.3421 |
| No log | 25.0 | 75 | 0.7622 | 36.3979 | 33.086 | 36.905 | 36.6806 | 14.3421 |
| No log | 26.0 | 78 | 0.7377 | 40.1654 | 37.0379 | 40.2783 | 40.2438 | 14.2632 |
| No log | 27.0 | 81 | 0.7145 | 40.7528 | 37.7009 | 41.0545 | 41.0725 | 14.1316 |
| No log | 28.0 | 84 | 0.6912 | 40.7528 | 37.7009 | 41.0545 | 41.0725 | 14.1316 |
| No log | 29.0 | 87 | 0.6674 | 42.3738 | 39.5725 | 42.4529 | 42.3105 | 13.9737 |
| No log | 30.0 | 90 | 0.6429 | 44.7342 | 41.9521 | 44.7141 | 44.7653 | 14.1842 |
| No log | 31.0 | 93 | 0.6225 | 44.7342 | 41.9521 | 44.7141 | 44.7653 | 14.1842 |
| No log | 32.0 | 96 | 0.6045 | 44.7342 | 41.9521 | 44.7141 | 44.7653 | 14.9737 |
| No log | 33.0 | 99 | 0.5874 | 44.8851 | 42.3601 | 44.6841 | 44.7448 | 14.9474 |
| No log | 34.0 | 102 | 0.5707 | 48.0171 | 44.6572 | 48.3977 | 48.1823 | 14.3947 |
| No log | 35.0 | 105 | 0.5529 | 50.0598 | 46.834 | 50.0339 | 49.9161 | 14.4474 |
| No log | 36.0 | 108 | 0.5356 | 52.9499 | 49.369 | 53.0648 | 52.7644 | 14.5 |
| No log | 37.0 | 111 | 0.5203 | 52.8057 | 49.1915 | 52.9703 | 52.6609 | 14.3947 |
| No log | 38.0 | 114 | 0.5058 | 58.1928 | 55.6897 | 58.5269 | 58.4782 | 14.2105 |
| No log | 39.0 | 117 | 0.4921 | 60.7074 | 58.7889 | 60.8191 | 60.8808 | 13.7895 |
| No log | 40.0 | 120 | 0.4800 | 61.7875 | 59.9339 | 61.8496 | 61.7658 | 13.6842 |
| No log | 41.0 | 123 | 0.4698 | 61.7875 | 59.9339 | 61.8496 | 61.7658 | 13.6842 |
| No log | 42.0 | 126 | 0.4597 | 62.4637 | 60.7042 | 62.487 | 62.5551 | 13.5789 |
| No log | 43.0 | 129 | 0.4505 | 63.0021 | 61.3266 | 62.9796 | 63.0653 | 13.3158 |
| No log | 44.0 | 132 | 0.4442 | 63.6533 | 62.0722 | 63.7451 | 63.7313 | 13.1316 |
| No log | 45.0 | 135 | 0.4388 | 63.6533 | 62.0722 | 63.7451 | 63.7313 | 13.1842 |
| No log | 46.0 | 138 | 0.4316 | 63.6533 | 62.0722 | 63.7451 | 63.7313 | 13.1842 |
| No log | 47.0 | 141 | 0.4253 | 64.5396 | 63.1662 | 64.6941 | 64.7424 | 13.1053 |
| No log | 48.0 | 144 | 0.4194 | 65.9713 | 63.1662 | 66.2938 | 66.3228 | 13.1053 |
| No log | 49.0 | 147 | 0.4134 | 69.236 | 66.507 | 69.4403 | 69.4443 | 12.7368 |
| No log | 50.0 | 150 | 0.4078 | 69.9113 | 67.1987 | 70.0511 | 70.203 | 12.6053 |
| No log | 51.0 | 153 | 0.4037 | 69.9113 | 67.1987 | 70.0511 | 70.203 | 12.6053 |
| No log | 52.0 | 156 | 0.4001 | 69.9113 | 67.1987 | 70.0511 | 70.203 | 12.6053 |
| No log | 53.0 | 159 | 0.3967 | 71.6949 | 69.5145 | 72.1298 | 71.9241 | 12.5 |
| No log | 54.0 | 162 | 0.3933 | 71.6949 | 69.5145 | 72.1298 | 71.9241 | 12.5526 |
| No log | 55.0 | 165 | 0.3901 | 71.6949 | 69.5145 | 72.1298 | 71.9241 | 12.3684 |
| No log | 56.0 | 168 | 0.3875 | 71.6949 | 69.5145 | 72.1298 | 71.9241 | 12.3684 |
| No log | 57.0 | 171 | 0.3856 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 58.0 | 174 | 0.3843 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 59.0 | 177 | 0.3828 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 60.0 | 180 | 0.3811 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 61.0 | 183 | 0.3798 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 62.0 | 186 | 0.3786 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 63.0 | 189 | 0.3777 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
| No log | 64.0 | 192 | 0.3773 | 76.5735 | 74.0611 | 76.9279 | 76.7502 | 12.3684 |
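
The ROUGE columns above are the kind of scores typically produced by the `evaluate` library's `rouge` metric inside a `compute_metrics` hook. The exact metric code for this run is not documented in the card; the sketch below is a minimal example under that assumption.

```python
# Minimal sketch of computing ROUGE with the `evaluate` library; the example
# predictions/references are placeholders, not data from this training run.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]          # decoded model outputs
references = ["the cat was sitting on the mat"]   # decoded reference labels
scores = rouge.compute(predictions=predictions, references=references)
# Keys: rouge1, rouge2, rougeL, rougeLsum; reported above scaled to percentages.
print({k: round(v * 100, 4) for k, v in scores.items()})
```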

Framework versions

  • Transformers 4.32.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3