metadata
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: t5-small-finetuned-xsum
    results: []

t5-small-finetuned-xsum

This model is a fine-tuned version of t5-small. The auto-generated card records the training dataset as None, meaning no dataset name was captured. It achieves the following results on the evaluation set:

  • Loss: 0.0614
  • Rouge1: 100.0
  • Rouge2: 91.3225
  • Rougel: 93.8251
  • Rougelsum: 100.0
  • Gen Len: 13.6957
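
To make the Rouge1 figure easier to interpret, here is a minimal sketch of a unigram-overlap F1 score. It is a simplification, not the code that produced the numbers above: the card does not say which ROUGE implementation was used, and the function name `rouge1_f1`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two lowercased, whitespace-tokenized strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token is matched at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# The same words in a different order still give a perfect unigram score.
score = rouge1_f1("the cat sat", "sat the cat")
print(round(100 * score, 1))
```

Because unigram overlap ignores word order, a Rouge1 of 100.0 next to a lower Rouge2 is consistent with predictions that contain the reference words but not always in the reference order. That reading is an inference from the metric definitions, not something the card states.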

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 256
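
The hyperparameters above, combined with the Step column of the training results (3 steps per epoch, 768 steps in total), imply a particular learning-rate curve and a rough training-set size. The sketch below works through that arithmetic. It assumes zero warmup steps (the card lists none), a single device, no gradient accumulation, and a kept final partial batch; `linear_lr` and `train_set_size_bounds` are hypothetical helper names, not part of any library.

```python
from typing import Tuple

LEARNING_RATE = 2e-5   # from the card
TRAIN_BATCH_SIZE = 64  # from the card
NUM_EPOCHS = 256       # from the card
STEPS_PER_EPOCH = 3    # read off the training results: step 3 at epoch 1.0
WARMUP_STEPS = 0       # assumption: the card does not list any warmup

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 768, matching the last logged step


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest n with ceil(n / batch_size) == steps_per_epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


print(linear_lr(0), linear_lr(TOTAL_STEPS // 2), linear_lr(TOTAL_STEPS))
print(train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
```

Under these assumptions the learning rate falls from 2e-05 to zero over 768 steps, and three batches of 64 per epoch would correspond to a training set of between 129 and 192 examples.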

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum Gen Len
No log 1.0 3 2.3373 7.4605 3.456 7.4165 7.3515 17.3478
No log 2.0 6 2.2506 8.0394 3.456 7.9941 7.9333 17.3478
No log 3.0 9 2.1686 9.8685 4.4912 9.5531 9.5659 17.2826
No log 4.0 12 1.9320 11.9877 5.571 11.6126 11.6283 17.4565
No log 5.0 15 1.7756 13.4114 6.9661 13.2697 13.142 17.3043
No log 6.0 18 1.5497 20.8289 15.6415 20.801 20.8735 17.2826
No log 7.0 21 1.3883 23.7381 18.9965 23.9623 23.8759 17.1522
No log 8.0 24 1.2540 29.9925 24.6624 30.2144 30.203 17.2174
No log 9.0 27 1.1418 32.2608 26.7882 32.3905 32.3128 16.8261
No log 10.0 30 1.0445 33.2161 26.6606 33.4993 33.5427 16.1957
No log 11.0 33 0.9713 40.9827 34.1302 41.241 41.1489 15.3696
No log 12.0 36 0.9212 38.2575 32.6764 38.6921 38.4431 15.3043
No log 13.0 39 0.8729 36.6868 31.6724 36.7527 36.4661 15.5
No log 14.0 42 0.8234 43.2153 38.3637 43.3746 43.5416 15.5652
No log 15.0 45 0.7772 47.0778 41.5839 47.345 47.5476 15.1739
No log 16.0 48 0.7361 52.0592 46.9257 52.4456 52.359 15.1087
No log 17.0 51 0.6846 56.6602 51.1202 56.7449 56.6945 15.087
No log 18.0 54 0.6385 63.4898 57.2655 63.3536 63.5142 14.7826
No log 19.0 57 0.5898 63.7253 57.5173 63.6094 63.8015 14.8043
No log 20.0 60 0.5366 64.165 57.892 63.9786 64.2446 14.8043
No log 21.0 63 0.4835 67.1158 60.4042 67.3224 67.4436 14.7826
No log 22.0 66 0.4322 72.722 66.3987 72.7989 72.8632 14.3696
No log 23.0 69 0.3795 77.3032 70.8913 77.3924 77.2499 13.7826
No log 24.0 72 0.3276 83.8189 78.7629 83.662 83.6469 13.1522
No log 25.0 75 0.2944 84.1064 78.8184 83.9576 83.9749 12.8913
No log 26.0 78 0.2606 87.7806 83.499 87.6869 87.8716 12.7826
No log 27.0 81 0.2257 89.5296 85.4444 89.4879 89.5489 12.913
No log 28.0 84 0.1899 91.3258 87.7915 91.2211 91.4052 13.0435
No log 29.0 87 0.1663 91.5209 88.0013 91.3906 91.6698 12.9565
No log 30.0 90 0.1448 91.4444 87.8821 91.221 91.4855 13.0652
No log 31.0 93 0.1303 91.6605 88.2289 91.5591 91.7967 13.0652
No log 32.0 96 0.1179 93.1229 88.0952 92.4854 93.1228 13.1522
No log 33.0 99 0.1025 92.2473 86.9145 91.9109 92.2922 12.8696
No log 34.0 102 0.0927 92.2473 86.9145 91.9109 92.2922 12.8696
No log 35.0 105 0.0858 94.4127 88.6778 93.3355 94.3427 13.2174
No log 36.0 108 0.0777 95.1449 89.4928 94.4928 95.2174 13.1739
No log 37.0 111 0.0698 95.1449 89.4928 94.4928 95.2174 13.1739
No log 38.0 114 0.0616 95.1449 89.4928 94.4928 95.2174 13.1739
No log 39.0 117 0.0496 95.1449 89.4928 94.4928 95.2174 13.1739
No log 40.0 120 0.0431 93.7681 88.0435 93.7681 93.8406 13.087
No log 41.0 123 0.0414 95.1449 90.2174 94.9275 95.2174 13.1739
No log 42.0 126 0.0393 95.1449 90.2174 94.9275 95.2174 13.1739
No log 43.0 129 0.0370 95.1449 90.2174 94.9275 95.2174 13.1739
No log 44.0 132 0.0329 96.1836 91.4596 96.1353 96.3043 13.2826
No log 45.0 135 0.0304 96.6184 92.5466 96.6184 96.7391 13.3478
No log 46.0 138 0.0294 96.6184 92.5466 96.6184 96.7391 13.3478
No log 47.0 141 0.0292 96.7391 93.1159 96.4803 96.9203 13.3913
No log 48.0 144 0.0290 96.7391 92.7536 96.0663 96.9203 13.3913
No log 49.0 147 0.0290 98.913 97.2826 98.323 98.913 13.587
No log 50.0 150 0.0299 98.913 97.2826 98.323 98.913 13.587
No log 51.0 153 0.0310 97.4638 93.8768 96.0145 97.4638 13.4565
No log 52.0 156 0.0320 98.913 95.1449 96.6097 98.913 13.587
No log 53.0 159 0.0341 97.4638 91.413 94.001 97.4638 13.4565
No log 54.0 162 0.0364 97.4638 91.0326 93.6465 97.4638 13.4565
No log 55.0 165 0.0377 97.4638 91.0326 93.6465 97.4638 13.4565
No log 56.0 168 0.0389 98.913 93.2246 95.0311 98.913 13.587
No log 57.0 171 0.0399 98.913 93.2246 95.0311 98.913 13.587
No log 58.0 174 0.0402 98.913 93.2246 95.0311 98.913 13.587
No log 59.0 177 0.0412 98.913 93.2246 95.0311 98.913 13.587
No log 60.0 180 0.0430 98.913 93.2246 95.0311 98.913 13.587
No log 61.0 183 0.0454 98.913 92.6268 94.5575 98.913 13.587
No log 62.0 186 0.0471 98.913 92.192 94.146 98.913 13.587
No log 63.0 189 0.0477 98.913 92.192 94.146 98.913 13.587
No log 64.0 192 0.0481 98.913 92.192 94.146 98.913 13.587
No log 65.0 195 0.0496 98.913 92.192 94.146 98.913 13.587
No log 66.0 198 0.0512 98.913 92.192 94.146 98.913 13.587
No log 67.0 201 0.0530 98.913 92.192 94.146 98.913 13.587
No log 68.0 204 0.0551 98.913 92.192 94.146 98.913 13.587
No log 69.0 207 0.0567 98.913 92.192 94.146 98.913 13.587
No log 70.0 210 0.0577 98.913 92.192 94.146 98.913 13.587
No log 71.0 213 0.0590 98.913 92.192 94.146 98.913 13.587
No log 72.0 216 0.0600 98.913 92.192 94.146 98.913 13.587
No log 73.0 219 0.0611 100.0 93.1159 94.5367 100.0 13.6957
No log 74.0 222 0.0615 100.0 93.1159 94.5367 100.0 13.6957
No log 75.0 225 0.0614 100.0 93.1159 94.5367 100.0 13.6957
No log 76.0 228 0.0601 100.0 93.1159 94.5367 100.0 13.6957
No log 77.0 231 0.0594 100.0 93.1159 94.5367 100.0 13.6957
No log 78.0 234 0.0595 100.0 93.1159 94.5367 100.0 13.6957
No log 79.0 237 0.0597 100.0 93.1159 94.5367 100.0 13.6957
No log 80.0 240 0.0607 100.0 93.1159 94.5367 100.0 13.6957
No log 81.0 243 0.0615 100.0 93.1159 94.5367 100.0 13.6957
No log 82.0 246 0.0620 100.0 93.1159 94.5367 100.0 13.6957
No log 83.0 249 0.0619 100.0 93.1159 94.5367 100.0 13.6957
No log 84.0 252 0.0615 100.0 93.1159 94.5367 100.0 13.6957
No log 85.0 255 0.0619 100.0 93.1159 94.5367 100.0 13.6957
No log 86.0 258 0.0620 100.0 93.1159 94.5367 100.0 13.6957
No log 87.0 261 0.0622 100.0 93.1159 94.5367 100.0 13.6957
No log 88.0 264 0.0630 100.0 93.1159 94.5367 100.0 13.6957
No log 89.0 267 0.0632 100.0 93.1159 94.5367 100.0 13.6957
No log 90.0 270 0.0631 100.0 93.1159 94.5367 100.0 13.6957
No log 91.0 273 0.0635 100.0 93.1159 94.5367 100.0 13.6957
No log 92.0 276 0.0637 100.0 90.9058 93.4886 100.0 13.6957
No log 93.0 279 0.0634 100.0 90.9058 93.4886 100.0 13.6957
No log 94.0 282 0.0635 100.0 90.9058 93.4886 100.0 13.6957
No log 95.0 285 0.0623 100.0 90.9058 93.4886 100.0 13.6957
No log 96.0 288 0.0607 100.0 90.9058 93.4886 100.0 13.6957
No log 97.0 291 0.0594 100.0 90.9058 93.4886 100.0 13.6957
No log 98.0 294 0.0595 100.0 90.9058 93.4886 100.0 13.6957
No log 99.0 297 0.0594 100.0 90.9058 93.4886 100.0 13.6957
No log 100.0 300 0.0601 100.0 90.9058 93.4886 100.0 13.6957
No log 101.0 303 0.0617 100.0 90.9058 93.4886 100.0 13.6957
No log 102.0 306 0.0630 100.0 90.9058 93.4886 100.0 13.6957
No log 103.0 309 0.0638 100.0 90.9058 93.4886 100.0 13.6957
No log 104.0 312 0.0650 100.0 90.9058 93.4886 100.0 13.6957
No log 105.0 315 0.0658 100.0 90.9058 93.4886 100.0 13.6957
No log 106.0 318 0.0655 100.0 90.9058 93.4886 100.0 13.6957
No log 107.0 321 0.0647 100.0 90.9058 93.4886 100.0 13.6957
No log 108.0 324 0.0632 100.0 90.9058 93.4886 100.0 13.6957
No log 109.0 327 0.0618 100.0 90.9058 93.4886 100.0 13.6957
No log 110.0 330 0.0615 100.0 90.9058 93.4886 100.0 13.6957
No log 111.0 333 0.0615 100.0 90.9058 93.4886 100.0 13.6957
No log 112.0 336 0.0616 100.0 90.9058 93.4886 100.0 13.6957
No log 113.0 339 0.0611 100.0 90.9058 93.4886 100.0 13.6957
No log 114.0 342 0.0618 100.0 90.9058 93.4886 100.0 13.6957
No log 115.0 345 0.0625 100.0 90.9058 93.4886 100.0 13.6957
No log 116.0 348 0.0626 100.0 90.9058 93.4886 100.0 13.6957
No log 117.0 351 0.0619 100.0 90.9058 93.4886 100.0 13.6957
No log 118.0 354 0.0611 100.0 91.3225 93.8251 100.0 13.6957
No log 119.0 357 0.0598 100.0 91.3225 93.8251 100.0 13.6957
No log 120.0 360 0.0585 100.0 91.3225 93.8251 100.0 13.6957
No log 121.0 363 0.0574 100.0 91.3225 93.8251 100.0 13.6957
No log 122.0 366 0.0572 100.0 91.3225 93.8251 100.0 13.6957
No log 123.0 369 0.0575 100.0 91.3225 93.8251 100.0 13.6957
No log 124.0 372 0.0582 100.0 91.3225 93.8251 100.0 13.6957
No log 125.0 375 0.0588 100.0 91.3225 93.8251 100.0 13.6957
No log 126.0 378 0.0597 100.0 91.3225 93.8251 100.0 13.6957
No log 127.0 381 0.0605 100.0 91.3225 93.8251 100.0 13.6957
No log 128.0 384 0.0611 100.0 91.3225 93.8251 100.0 13.6957
No log 129.0 387 0.0622 100.0 91.3225 93.8251 100.0 13.6957
No log 130.0 390 0.0634 100.0 91.3225 93.8251 100.0 13.6957
No log 131.0 393 0.0640 100.0 91.3225 93.8251 100.0 13.6957
No log 132.0 396 0.0640 100.0 91.3225 93.8251 100.0 13.6957
No log 133.0 399 0.0632 100.0 91.3225 93.8251 100.0 13.6957
No log 134.0 402 0.0622 100.0 91.3225 93.8251 100.0 13.6957
No log 135.0 405 0.0608 100.0 91.3225 93.8251 100.0 13.6957
No log 136.0 408 0.0595 100.0 91.3225 93.8251 100.0 13.6957
No log 137.0 411 0.0588 100.0 91.3225 93.8251 100.0 13.6957
No log 138.0 414 0.0585 100.0 91.3225 93.8251 100.0 13.6957
No log 139.0 417 0.0584 100.0 91.3225 93.8251 100.0 13.6957
No log 140.0 420 0.0580 100.0 91.3225 93.8251 100.0 13.6957
No log 141.0 423 0.0578 100.0 91.3225 93.8251 100.0 13.6957
No log 142.0 426 0.0583 100.0 91.3225 93.8251 100.0 13.6957
No log 143.0 429 0.0584 100.0 91.3225 93.8251 100.0 13.6957
No log 144.0 432 0.0583 100.0 91.3225 93.8251 100.0 13.6957
No log 145.0 435 0.0578 100.0 91.3225 93.8251 100.0 13.6957
No log 146.0 438 0.0579 100.0 91.3225 93.8251 100.0 13.6957
No log 147.0 441 0.0578 100.0 91.3225 93.8251 100.0 13.6957
No log 148.0 444 0.0579 100.0 91.3225 93.8251 100.0 13.6957
No log 149.0 447 0.0583 100.0 91.3225 93.8251 100.0 13.6957
No log 150.0 450 0.0589 100.0 91.3225 93.8251 100.0 13.6957
No log 151.0 453 0.0599 100.0 91.3225 93.8251 100.0 13.6957
No log 152.0 456 0.0603 100.0 91.3225 93.8251 100.0 13.6957
No log 153.0 459 0.0608 100.0 91.3225 93.8251 100.0 13.6957
No log 154.0 462 0.0611 100.0 91.3225 93.8251 100.0 13.6957
No log 155.0 465 0.0614 100.0 91.3225 93.8251 100.0 13.6957
No log 156.0 468 0.0613 100.0 91.3225 93.8251 100.0 13.6957
No log 157.0 471 0.0611 100.0 91.3225 93.8251 100.0 13.6957
No log 158.0 474 0.0608 100.0 91.3225 93.8251 100.0 13.6957
No log 159.0 477 0.0605 100.0 91.3225 93.8251 100.0 13.6957
No log 160.0 480 0.0598 100.0 91.3225 93.8251 100.0 13.6957
No log 161.0 483 0.0594 100.0 91.3225 93.8251 100.0 13.6957
No log 162.0 486 0.0593 100.0 91.3225 93.8251 100.0 13.6957
No log 163.0 489 0.0588 100.0 91.3225 93.8251 100.0 13.6957
No log 164.0 492 0.0585 100.0 91.3225 93.8251 100.0 13.6957
No log 165.0 495 0.0579 100.0 91.3225 93.8251 100.0 13.6957
No log 166.0 498 0.0570 100.0 91.3225 93.8251 100.0 13.6957
0.4684 167.0 501 0.0563 100.0 91.3225 93.8251 100.0 13.6957
0.4684 168.0 504 0.0560 100.0 91.3225 93.8251 100.0 13.6957
0.4684 169.0 507 0.0560 100.0 91.3225 93.8251 100.0 13.6957
0.4684 170.0 510 0.0562 100.0 91.3225 93.8251 100.0 13.6957
0.4684 171.0 513 0.0563 100.0 91.3225 93.8251 100.0 13.6957
0.4684 172.0 516 0.0565 100.0 91.3225 93.8251 100.0 13.6957
0.4684 173.0 519 0.0568 100.0 91.3225 93.8251 100.0 13.6957
0.4684 174.0 522 0.0576 100.0 91.3225 93.8251 100.0 13.6957
0.4684 175.0 525 0.0583 100.0 91.3225 93.8251 100.0 13.6957
0.4684 176.0 528 0.0586 100.0 91.3225 93.8251 100.0 13.6957
0.4684 177.0 531 0.0584 100.0 91.3225 93.8251 100.0 13.6957
0.4684 178.0 534 0.0579 100.0 91.3225 93.8251 100.0 13.6957
0.4684 179.0 537 0.0575 100.0 91.3225 93.8251 100.0 13.6957
0.4684 180.0 540 0.0576 100.0 91.3225 93.8251 100.0 13.6957
0.4684 181.0 543 0.0578 100.0 91.3225 93.8251 100.0 13.6957
0.4684 182.0 546 0.0576 100.0 91.3225 93.8251 100.0 13.6957
0.4684 183.0 549 0.0575 100.0 91.3225 93.8251 100.0 13.6957
0.4684 184.0 552 0.0577 100.0 91.3225 93.8251 100.0 13.6957
0.4684 185.0 555 0.0577 100.0 91.3225 93.8251 100.0 13.6957
0.4684 186.0 558 0.0576 100.0 91.3225 93.8251 100.0 13.6957
0.4684 187.0 561 0.0575 100.0 91.3225 93.8251 100.0 13.6957
0.4684 188.0 564 0.0573 100.0 91.3225 93.8251 100.0 13.6957
0.4684 189.0 567 0.0571 100.0 91.3225 93.8251 100.0 13.6957
0.4684 190.0 570 0.0570 100.0 91.3225 93.8251 100.0 13.6957
0.4684 191.0 573 0.0567 100.0 91.3225 93.8251 100.0 13.6957
0.4684 192.0 576 0.0564 100.0 91.3225 93.8251 100.0 13.6957
0.4684 193.0 579 0.0562 100.0 91.3225 93.8251 100.0 13.6957
0.4684 194.0 582 0.0562 100.0 91.9203 94.2702 100.0 13.6957
0.4684 195.0 585 0.0565 100.0 91.9203 94.2702 100.0 13.6957
0.4684 196.0 588 0.0563 100.0 91.9203 94.2702 100.0 13.6957
0.4684 197.0 591 0.0560 100.0 91.9203 94.2702 100.0 13.6957
0.4684 198.0 594 0.0558 100.0 91.9203 94.2702 100.0 13.6957
0.4684 199.0 597 0.0559 100.0 91.9203 94.2702 100.0 13.6957
0.4684 200.0 600 0.0562 100.0 91.9203 94.2702 100.0 13.6957
0.4684 201.0 603 0.0568 100.0 91.9203 94.2702 100.0 13.6957
0.4684 202.0 606 0.0572 100.0 91.9203 94.2702 100.0 13.6957
0.4684 203.0 609 0.0575 100.0 91.9203 94.2702 100.0 13.6957
0.4684 204.0 612 0.0577 100.0 91.9203 94.2702 100.0 13.6957
0.4684 205.0 615 0.0580 100.0 91.9203 94.2702 100.0 13.6957
0.4684 206.0 618 0.0580 100.0 91.9203 94.2702 100.0 13.6957
0.4684 207.0 621 0.0580 100.0 91.9203 94.2702 100.0 13.6957
0.4684 208.0 624 0.0577 100.0 91.9203 94.2702 100.0 13.6957
0.4684 209.0 627 0.0577 100.0 91.9203 94.2702 100.0 13.6957
0.4684 210.0 630 0.0576 100.0 91.9203 94.2702 100.0 13.6957
0.4684 211.0 633 0.0573 100.0 91.9203 94.2702 100.0 13.6957
0.4684 212.0 636 0.0571 100.0 91.9203 94.2702 100.0 13.6957
0.4684 213.0 639 0.0571 100.0 91.9203 94.2702 100.0 13.6957
0.4684 214.0 642 0.0573 100.0 91.9203 94.2702 100.0 13.6957
0.4684 215.0 645 0.0574 100.0 91.9203 94.2702 100.0 13.6957
0.4684 216.0 648 0.0579 100.0 91.9203 94.2702 100.0 13.6957
0.4684 217.0 651 0.0584 100.0 91.9203 94.2702 100.0 13.6957
0.4684 218.0 654 0.0588 100.0 91.3225 93.8251 100.0 13.6957
0.4684 219.0 657 0.0591 100.0 91.3225 93.8251 100.0 13.6957
0.4684 220.0 660 0.0593 100.0 91.3225 93.8251 100.0 13.6957
0.4684 221.0 663 0.0594 100.0 91.3225 93.8251 100.0 13.6957
0.4684 222.0 666 0.0595 100.0 91.3225 93.8251 100.0 13.6957
0.4684 223.0 669 0.0595 100.0 91.3225 93.8251 100.0 13.6957
0.4684 224.0 672 0.0596 100.0 91.3225 93.8251 100.0 13.6957
0.4684 225.0 675 0.0596 100.0 91.3225 93.8251 100.0 13.6957
0.4684 226.0 678 0.0596 100.0 91.3225 93.8251 100.0 13.6957
0.4684 227.0 681 0.0597 100.0 91.3225 93.8251 100.0 13.6957
0.4684 228.0 684 0.0599 100.0 91.3225 93.8251 100.0 13.6957
0.4684 229.0 687 0.0601 100.0 91.3225 93.8251 100.0 13.6957
0.4684 230.0 690 0.0605 100.0 91.3225 93.8251 100.0 13.6957
0.4684 231.0 693 0.0609 100.0 91.3225 93.8251 100.0 13.6957
0.4684 232.0 696 0.0611 100.0 91.3225 93.8251 100.0 13.6957
0.4684 233.0 699 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 234.0 702 0.0615 100.0 91.3225 93.8251 100.0 13.6957
0.4684 235.0 705 0.0616 100.0 91.3225 93.8251 100.0 13.6957
0.4684 236.0 708 0.0619 100.0 91.3225 93.8251 100.0 13.6957
0.4684 237.0 711 0.0620 100.0 91.3225 93.8251 100.0 13.6957
0.4684 238.0 714 0.0621 100.0 91.3225 93.8251 100.0 13.6957
0.4684 239.0 717 0.0622 100.0 91.3225 93.8251 100.0 13.6957
0.4684 240.0 720 0.0622 100.0 91.3225 93.8251 100.0 13.6957
0.4684 241.0 723 0.0621 100.0 91.3225 93.8251 100.0 13.6957
0.4684 242.0 726 0.0620 100.0 91.3225 93.8251 100.0 13.6957
0.4684 243.0 729 0.0618 100.0 91.3225 93.8251 100.0 13.6957
0.4684 244.0 732 0.0616 100.0 91.3225 93.8251 100.0 13.6957
0.4684 245.0 735 0.0615 100.0 91.3225 93.8251 100.0 13.6957
0.4684 246.0 738 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 247.0 741 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 248.0 744 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 249.0 747 0.0615 100.0 91.3225 93.8251 100.0 13.6957
0.4684 250.0 750 0.0615 100.0 91.3225 93.8251 100.0 13.6957
0.4684 251.0 753 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 252.0 756 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 253.0 759 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 254.0 762 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 255.0 765 0.0614 100.0 91.3225 93.8251 100.0 13.6957
0.4684 256.0 768 0.0614 100.0 91.3225 93.8251 100.0 13.6957
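
The rows above are space-separated, and the first column is either a number or the two-word placeholder "No log". The placeholder most likely reflects the Trainer's default of logging training loss every 500 steps, which would explain why a value first appears at step 501, though the card does not state the logging interval. The following sketch parses rows in this layout and finds the one with the lowest validation loss; `parse_row` and the snake_case column names are illustrative choices, not part of any library.

```python
from typing import Dict, List, Optional

COLUMNS = ["epoch", "step", "validation_loss", "rouge1", "rouge2",
           "rougel", "rougelsum", "gen_len"]


def parse_row(line: str) -> Dict[str, Optional[float]]:
    """Parse one space-separated results row into a dict of floats."""
    tokens = line.split()
    # "No log" spans two tokens; otherwise the training loss is a single number.
    if tokens[:2] == ["No", "log"]:
        training_loss, rest = None, tokens[2:]
    else:
        training_loss, rest = float(tokens[0]), tokens[1:]
    if len(rest) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} numeric fields, got {len(rest)}: {line!r}")
    row: Dict[str, Optional[float]] = {"training_loss": training_loss}
    row.update({name: float(value) for name, value in zip(COLUMNS, rest)})
    return row


# A few rows copied verbatim from the table above.
SAMPLE_ROWS = """\
No log 1.0 3 2.3373 7.4605 3.456 7.4165 7.3515 17.3478
No log 48.0 144 0.0290 96.7391 92.7536 96.0663 96.9203 13.3913
No log 73.0 219 0.0611 100.0 93.1159 94.5367 100.0 13.6957
0.4684 256.0 768 0.0614 100.0 91.3225 93.8251 100.0 13.6957
"""

rows: List[Dict[str, Optional[float]]] = [parse_row(line) for line in SAMPLE_ROWS.splitlines()]
best = min(rows, key=lambda r: r["validation_loss"])
print(best["epoch"], best["validation_loss"])
```

In the full table, validation loss bottoms out at 0.0290 around epochs 48 and 49 and then drifts up to 0.0614 by epoch 256, while Rouge1 reaches 100.0 from epoch 73 onward. Applying the same parser to every row makes it straightforward to compare the final checkpoint with an earlier one.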

Framework versions

  • Transformers 4.32.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.4
  • Tokenizers 0.13.3
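
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The sketch below does this with the standard library only. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions listed on the card; "+cu118" marks a CUDA 11.8 build of PyTorch.
CARD_VERSIONS = {
    "transformers": "4.32.0",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each package whose installed version differs from `expected` to what was found."""
    found = {name: lookup(name) for name in expected}
    return {name: version for name, version in found.items() if version != expected[name]}


for name, version in version_mismatches(CARD_VERSIONS).items():
    print(f"{name}: card lists {CARD_VERSIONS[name]}, found {version or 'not installed'}")
```

An exact string comparison is deliberately strict: a CPU-only PyTorch build reports a different local version suffix and will be flagged even if the release number matches.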