---
license: apache-2.0
base_model: google/long-t5-tglobal-xl
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: longt5_xl_sfd_bp_20
    results: []
---

# longt5_xl_sfd_bp_20

This model is a fine-tuned version of [google/long-t5-tglobal-xl](https://huggingface.co/google/long-t5-tglobal-xl) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 2.8927
- Rouge1: 32.7788
- Rouge2: 13.9352
- Rougel: 22.5175
- Rougelsum: 31.548
- Gen Len: 488.5134
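
Since the card does not yet document intended use, the snippet below is only a minimal sketch of loading the checkpoint for long-input summarization with `transformers`. The Hub id `learn3r/longt5_xl_sfd_bp_20`, the 16k-token input cap, and the generation settings are assumptions, not details taken from this card.

```python
# Minimal sketch; the repository id, input length, and generation
# settings are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "learn3r/longt5_xl_sfd_bp_20"  # assumed checkpoint location
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

document = "..."  # replace with a long input document
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=16384)
summary_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```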

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 20.0
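
For reference, here is a sketch of how these values would map onto `Seq2SeqTrainingArguments`. The `output_dir`, `predict_with_generate`, and any data or model wiring are assumptions, not taken from the original training script.

```python
# Sketch mapping the listed hyperparameters to Seq2SeqTrainingArguments;
# output_dir and predict_with_generate are placeholders/assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="longt5_xl_sfd_bp_20",   # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=32,      # effective train batch size: 8 * 32 = 256
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=20.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,          # assumption: needed to report ROUGE/Gen Len at eval
)
```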

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 2.3973        | 0.97  | 14   | 1.9074          | 10.6164 | 2.4585  | 10.4856 | 9.8193    | 511.0    |
| 1.9188        | 1.95  | 28   | 1.7082          | 17.4258 | 4.2128  | 16.5213 | 15.8377   | 511.0    |
| 1.4297        | 2.99  | 43   | 1.5073          | 18.6504 | 5.4242  | 17.2648 | 17.0203   | 506.7745 |
| 1.2759        | 3.97  | 57   | 1.5032          | 22.11   | 7.544   | 19.7035 | 20.2813   | 497.8783 |
| 1.1421        | 4.94  | 71   | 1.5462          | 20.6049 | 6.7146  | 18.5084 | 19.0876   | 503.6024 |
| 0.9605        | 5.98  | 86   | 1.6233          | 22.6777 | 7.9362  | 18.7936 | 21.41     | 510.2730 |
| 0.8082        | 6.96  | 100  | 1.7575          | 26.5338 | 9.9474  | 20.3789 | 25.0767   | 511.0    |
| 0.664         | 8.0   | 115  | 1.7702          | 35.1918 | 13.7223 | 26.1763 | 33.3997   | 329.7151 |
| 0.5471        | 8.97  | 129  | 1.9383          | 27.0414 | 10.4166 | 20.1803 | 25.6283   | 506.8279 |
| 0.4349        | 9.95  | 143  | 1.9608          | 29.5613 | 11.7633 | 22.7176 | 27.9563   | 454.7033 |
| 0.4338        | 10.99 | 158  | 2.1197          | 31.2004 | 12.8569 | 22.1282 | 29.8827   | 493.3234 |
| 0.2887        | 11.97 | 172  | 2.1205          | 34.9566 | 13.8574 | 25.1764 | 33.2914   | 381.3591 |
| 0.2753        | 12.94 | 186  | 2.4299          | 36.3877 | 13.8584 | 25.7829 | 34.8601   | 338.7240 |
| 0.2114        | 13.98 | 201  | 2.5799          | 39.7535 | 16.1209 | 27.8512 | 37.8553   | 302.4837 |
| 0.1805        | 14.96 | 215  | 2.6123          | 33.3254 | 13.0868 | 23.3214 | 31.7901   | 442.9258 |
| 0.1543        | 16.0  | 230  | 2.5635          | 31.7816 | 13.1085 | 22.9117 | 30.2286   | 463.0801 |
| 0.5166        | 16.97 | 244  | 2.5134          | 30.3969 | 12.1295 | 21.6616 | 28.7606   | 511.0    |
| 0.1117        | 17.95 | 258  | 2.8109          | 35.336  | 14.9492 | 24.1938 | 33.822    | 431.1157 |
| 0.0895        | 18.99 | 273  | 2.7577          | 41.0982 | 16.3935 | 28.1073 | 39.1641   | 240.1365 |
| 0.0779        | 19.48 | 280  | 2.8927          | 32.7788 | 13.9352 | 22.5175 | 31.548    | 488.5134 |
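
The ROUGE columns above are reported on a 0-100 scale. Below is a sketch of how such scores are typically computed with the `evaluate` library; the predictions and references are placeholders, not data from the original evaluation.

```python
# Illustrative ROUGE computation; inputs are placeholders, and scaling to
# percentages is an assumption about how the card's numbers were produced.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the generated summary"]  # placeholder model outputs
references = ["the reference summary"]   # placeholder gold summaries
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # rouge1, rouge2, rougeL, rougeLsum
```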

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1