---
license: apache-2.0
base_model: google/long-t5-tglobal-xl
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: longt5_xl_sfd_bp_20
  results: []
---
# longt5_xl_sfd_bp_20

This model is a fine-tuned version of [google/long-t5-tglobal-xl](https://huggingface.co/google/long-t5-tglobal-xl) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.8927
- Rouge1: 32.7788
- Rouge2: 13.9352
- Rougel: 22.5175
- Rougelsum: 31.548
- Gen Len: 488.5134
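
The card does not state the downstream task, but the base model together with the ROUGE and generation-length metrics point to long-document summarization. Below is a minimal inference sketch under that assumption; the repository id is a placeholder, since the hosting namespace is not given in this card.

```python
# Minimal inference sketch -- "your-namespace" is a placeholder, not the real repo id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-namespace/longt5_xl_sfd_bp_20"  # placeholder namespace

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # a long input document
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# The average generation length reported above is roughly 300-500 tokens,
# so allow long outputs.
summary_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```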
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 20.0
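
The training script itself is not included in this card; the following is a sketch of how the listed values map onto `transformers.Seq2SeqTrainingArguments`. The `output_dir` and `predict_with_generate` values are assumptions, not taken from the card.

```python
# Sketch only: maps the hyperparameters above onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="longt5_xl_sfd_bp_20",  # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=32,    # 8 * 32 = 256 total train batch size
    lr_scheduler_type="constant",
    num_train_epochs=20.0,
    predict_with_generate=True,        # assumed; needed to report ROUGE during evaluation
)
```

The Adam settings (betas=(0.9, 0.999), epsilon=1e-08) match the library defaults, so they are not set explicitly in this sketch.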
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 2.3973        | 0.97  | 14   | 1.9074          | 10.6164 | 2.4585  | 10.4856 | 9.8193    | 511.0    |
| 1.9188        | 1.95  | 28   | 1.7082          | 17.4258 | 4.2128  | 16.5213 | 15.8377   | 511.0    |
| 1.4297        | 2.99  | 43   | 1.5073          | 18.6504 | 5.4242  | 17.2648 | 17.0203   | 506.7745 |
| 1.2759        | 3.97  | 57   | 1.5032          | 22.11   | 7.544   | 19.7035 | 20.2813   | 497.8783 |
| 1.1421        | 4.94  | 71   | 1.5462          | 20.6049 | 6.7146  | 18.5084 | 19.0876   | 503.6024 |
| 0.9605        | 5.98  | 86   | 1.6233          | 22.6777 | 7.9362  | 18.7936 | 21.41     | 510.2730 |
| 0.8082        | 6.96  | 100  | 1.7575          | 26.5338 | 9.9474  | 20.3789 | 25.0767   | 511.0    |
| 0.664         | 8.0   | 115  | 1.7702          | 35.1918 | 13.7223 | 26.1763 | 33.3997   | 329.7151 |
| 0.5471        | 8.97  | 129  | 1.9383          | 27.0414 | 10.4166 | 20.1803 | 25.6283   | 506.8279 |
| 0.4349        | 9.95  | 143  | 1.9608          | 29.5613 | 11.7633 | 22.7176 | 27.9563   | 454.7033 |
| 0.4338        | 10.99 | 158  | 2.1197          | 31.2004 | 12.8569 | 22.1282 | 29.8827   | 493.3234 |
| 0.2887        | 11.97 | 172  | 2.1205          | 34.9566 | 13.8574 | 25.1764 | 33.2914   | 381.3591 |
| 0.2753        | 12.94 | 186  | 2.4299          | 36.3877 | 13.8584 | 25.7829 | 34.8601   | 338.7240 |
| 0.2114        | 13.98 | 201  | 2.5799          | 39.7535 | 16.1209 | 27.8512 | 37.8553   | 302.4837 |
| 0.1805        | 14.96 | 215  | 2.6123          | 33.3254 | 13.0868 | 23.3214 | 31.7901   | 442.9258 |
| 0.1543        | 16.0  | 230  | 2.5635          | 31.7816 | 13.1085 | 22.9117 | 30.2286   | 463.0801 |
| 0.5166        | 16.97 | 244  | 2.5134          | 30.3969 | 12.1295 | 21.6616 | 28.7606   | 511.0    |
| 0.1117        | 17.95 | 258  | 2.8109          | 35.336  | 14.9492 | 24.1938 | 33.822    | 431.1157 |
| 0.0895        | 18.99 | 273  | 2.7577          | 41.0982 | 16.3935 | 28.1073 | 39.1641   | 240.1365 |
| 0.0779        | 19.48 | 280  | 2.8927          | 32.7788 | 13.9352 | 22.5175 | 31.548    | 488.5134 |
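
The ROUGE values above appear to be F-measures scaled by 100, as in the standard Hugging Face summarization examples; the exact evaluation code is not part of this card. Below is a sketch of recomputing comparable numbers with recent versions of the `evaluate` library; `predictions` and `references` are stand-ins for model outputs and gold summaries.

```python
# Sketch: recompute ROUGE in the form the table reports it (F1 * 100, with stemming).
import evaluate

rouge = evaluate.load("rouge")

predictions = ["a generated summary ..."]  # stand-in model outputs
references = ["a reference summary ..."]   # stand-in gold summaries

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # rouge1, rouge2, rougeL, rougeLsum
```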
### Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1