---
base_model: hiba2/results_arat5-2_wiki
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: results_arat5-3_wiki
  results: []
---

# results_arat5-3_wiki

This model is a fine-tuned version of [hiba2/results_arat5-2_wiki](https://huggingface.co/hiba2/results_arat5-2_wiki) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.7821
- Rouge1: 0.0926
- Rouge2: 0.0015
- Rougel: 0.0934
- Rougelsum: 0.0928
- Gen Len: 19.0

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 7.8936        | 0.9506  | 500   | 5.9107          | 0.0    | 0.0    | 0.0    | 0.0       | 0.0     |
| 5.5649        | 1.9011  | 1000  | 4.9336          | 0.0905 | 0.0    | 0.0915 | 0.0912    | 18.8876 |
| 4.9098        | 2.8517  | 1500  | 4.3731          | 0.0905 | 0.0    | 0.0915 | 0.0912    | 18.8989 |
| 4.4486        | 3.8023  | 2000  | 3.9340          | 0.0875 | 0.0    | 0.0885 | 0.0882    | 19.0    |
| 4.0755        | 4.7529  | 2500  | 3.5412          | 0.0881 | 0.0    | 0.0891 | 0.0887    | 18.9382 |
| 3.6998        | 5.7034  | 3000  | 3.1344          | 0.095  | 0.0002 | 0.0958 | 0.0954    | 18.8783 |
| 3.3129        | 6.6540  | 3500  | 2.8528          | 0.0935 | 0.0013 | 0.0945 | 0.094     | 18.8408 |
| 3.1053        | 7.6046  | 4000  | 2.6196          | 0.0936 | 0.0008 | 0.0946 | 0.0941    | 18.9382 |
| 2.8412        | 8.5551  | 4500  | 2.4414          | 0.091  | 0.0011 | 0.0919 | 0.0915    | 18.867  |
| 2.702         | 9.5057  | 5000  | 2.2952          | 0.0936 | 0.001  | 0.0948 | 0.0946    | 18.8783 |
| 2.5611        | 10.4563 | 5500  | 2.1816          | 0.093  | 0.0011 | 0.0941 | 0.0936    | 19.0    |
| 2.4499        | 11.4068 | 6000  | 2.0914          | 0.0988 | 0.0011 | 0.0995 | 0.099     | 18.8502 |
| 2.3764        | 12.3574 | 6500  | 2.0264          | 0.0992 | 0.0016 | 0.0997 | 0.0995    | 18.8371 |
| 2.3172        | 13.3080 | 7000  | 1.9853          | 0.098  | 0.0015 | 0.099  | 0.0986    | 18.9888 |
| 2.2794        | 14.2586 | 7500  | 1.9615          | 0.0971 | 0.0023 | 0.0977 | 0.0976    | 18.9888 |
| 2.2178        | 15.2091 | 8000  | 1.9424          | 0.0961 | 0.0009 | 0.0972 | 0.0968    | 19.0    |
| 2.2378        | 16.1597 | 8500  | 1.8855          | 0.0935 | 0.0011 | 0.0942 | 0.0937    | 19.0    |
| 2.1573        | 17.1103 | 9000  | 1.8386          | 0.0952 | 0.0009 | 0.0962 | 0.0958    | 19.0    |
| 2.132         | 18.0608 | 9500  | 1.8055          | 0.0919 | 0.0012 | 0.0929 | 0.0923    | 18.8783 |
| 2.1035        | 19.0114 | 10000 | 1.7863          | 0.0942 | 0.0015 | 0.0949 | 0.0945    | 19.0    |
| 2.0818        | 19.9620 | 10500 | 1.7821          | 0.0926 | 0.0015 | 0.0934 | 0.0928    | 19.0    |

### Framework versions

- Transformers 4.42.0.dev0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
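
### Reproducing the training arguments

The hyperparameters above map directly onto `transformers` training arguments. The sketch below is illustrative rather than the original training script: the `output_dir`, the 500-step evaluation cadence (inferred from the results table), and `predict_with_generate` (needed to produce ROUGE and Gen Len during evaluation) are assumptions.

```python
# Illustrative sketch: the listed hyperparameters expressed as
# Seq2SeqTrainingArguments. output_dir, the eval cadence, and
# predict_with_generate are assumptions, not taken from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="results_arat5-3_wiki",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    eval_strategy="steps",   # the table logs an evaluation every 500 steps
    eval_steps=500,
    predict_with_generate=True,  # required to report ROUGE / Gen Len
)
```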
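
### Computing the ROUGE metrics

The Rouge1/Rouge2/Rougel/Rougelsum figures are standard ROUGE F-measures. The card does not say which implementation was used; a minimal sketch with the 🤗 `evaluate` library would be:

```python
# Minimal ROUGE sketch (requires: pip install evaluate rouge_score).
# Which ROUGE implementation produced the card's numbers is unknown,
# so this library choice is an assumption.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["generated summary"],   # placeholder
    references=["reference summary"],    # placeholder
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```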
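
## How to use

A minimal inference sketch, assuming the checkpoint is published as `hiba2/results_arat5-3_wiki` (the card's name under the same namespace as its base model) and that, like its AraT5 ancestry, it is a T5-style seq2seq model. The task and input format are undocumented, so the Arabic input below is only a placeholder.

```python
# Minimal inference sketch. The repo id, task, and input format are
# assumptions; the model card does not document them.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "hiba2/results_arat5-3_wiki"  # assumed from the card's name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("ضع هنا النص العربي المدخل", return_tensors="pt")  # placeholder input
# Evaluation Gen Len hovered around 19 tokens, so cap generation nearby.
outputs = model.generate(**inputs, max_new_tokens=24)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```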