---
license: apache-2.0
base_model: google/long-t5-tglobal-xl
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: longt5_xl_sfd_bp_20
  results: []
---

# longt5_xl_sfd_bp_20

This model is a fine-tuned version of [google/long-t5-tglobal-xl](https://huggingface.co/google/long-t5-tglobal-xl) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.8927
- Rouge1: 32.7788
- Rouge2: 13.9352
- Rougel: 22.5175
- Rougelsum: 31.548
- Gen Len: 488.5134

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 20.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 2.3973        | 0.97  | 14   | 1.9074          | 10.6164 | 2.4585  | 10.4856 | 9.8193    | 511.0    |
| 1.9188        | 1.95  | 28   | 1.7082          | 17.4258 | 4.2128  | 16.5213 | 15.8377   | 511.0    |
| 1.4297        | 2.99  | 43   | 1.5073          | 18.6504 | 5.4242  | 17.2648 | 17.0203   | 506.7745 |
| 1.2759        | 3.97  | 57   | 1.5032          | 22.11   | 7.544   | 19.7035 | 20.2813   | 497.8783 |
| 1.1421        | 4.94  | 71   | 1.5462          | 20.6049 | 6.7146  | 18.5084 | 19.0876   | 503.6024 |
| 0.9605        | 5.98  | 86   | 1.6233          | 22.6777 | 7.9362  | 18.7936 | 21.41     | 510.2730 |
| 0.8082        | 6.96  | 100  | 1.7575          | 26.5338 | 9.9474  | 20.3789 | 25.0767   | 511.0    |
| 0.664         | 8.0   | 115  | 1.7702          | 35.1918 | 13.7223 | 26.1763 | 33.3997   | 329.7151 |
| 0.5471        | 8.97  | 129  | 1.9383          | 27.0414 | 10.4166 | 20.1803 | 25.6283   | 506.8279 |
| 0.4349        | 9.95  | 143  | 1.9608          | 29.5613 | 11.7633 | 22.7176 | 27.9563   | 454.7033 |
| 0.4338        | 10.99 | 158  | 2.1197          | 31.2004 | 12.8569 | 22.1282 | 29.8827   | 493.3234 |
| 0.2887        | 11.97 | 172  | 2.1205          | 34.9566 | 13.8574 | 25.1764 | 33.2914   | 381.3591 |
| 0.2753        | 12.94 | 186  | 2.4299          | 36.3877 | 13.8584 | 25.7829 | 34.8601   | 338.7240 |
| 0.2114        | 13.98 | 201  | 2.5799          | 39.7535 | 16.1209 | 27.8512 | 37.8553   | 302.4837 |
| 0.1805        | 14.96 | 215  | 2.6123          | 33.3254 | 13.0868 | 23.3214 | 31.7901   | 442.9258 |
| 0.1543        | 16.0  | 230  | 2.5635          | 31.7816 | 13.1085 | 22.9117 | 30.2286   | 463.0801 |
| 0.5166        | 16.97 | 244  | 2.5134          | 30.3969 | 12.1295 | 21.6616 | 28.7606   | 511.0    |
| 0.1117        | 17.95 | 258  | 2.8109          | 35.336  | 14.9492 | 24.1938 | 33.822    | 431.1157 |
| 0.0895        | 18.99 | 273  | 2.7577          | 41.0982 | 16.3935 | 28.1073 | 39.1641   | 240.1365 |
| 0.0779        | 19.48 | 280  | 2.8927          | 32.7788 | 13.9352 | 22.5175 | 31.548    | 488.5134 |

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1
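
As a sanity check on the training hyperparameters listed above, the short sketch below recomputes the effective batch size from `train_batch_size` and `gradient_accumulation_steps`. The helper name `effective_batch_size` and the `num_devices` parameter are illustrative and not part of this card; the card does not state the device count, so the value derived here is only what the reported numbers imply.

```python
# Minimal sketch: relate the reported batch-size hyperparameters.
# `effective_batch_size` and `num_devices` are hypothetical names used
# for illustration; they do not come from the training script.

def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accumulation_steps * num_devices


# Values copied from the hyperparameter list in this card.
train_batch_size = 8
gradient_accumulation_steps = 32
total_train_batch_size = 256

# Device count implied by the reported numbers (an inference, not a stated fact).
implied_devices = total_train_batch_size // (train_batch_size * gradient_accumulation_steps)

# The reported total is reproduced when the implied device count is used.
assert effective_batch_size(train_batch_size, gradient_accumulation_steps, implied_devices) == total_train_batch_size

print(f"implied devices: {implied_devices}")
```

The reported total of 256 equals 8 × 32, so the numbers are consistent with training on a single device.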