---
base_model: hiba2/results_arat5-2_wiki
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: results_arat5-3_wiki
  results: []
---
# results_arat5-3_wiki

This model is a fine-tuned version of [hiba2/results_arat5-2_wiki](https://huggingface.co/hiba2/results_arat5-2_wiki) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.7821
- Rouge1: 0.0926
- Rouge2: 0.0015
- Rougel: 0.0934
- Rougelsum: 0.0928
- Gen Len: 19.0
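
Since the card omits a usage example, here is a minimal sketch of loading the checkpoint. It assumes the model is published under the repo id `hiba2/results_arat5-3_wiki` and, like its AraT5 base, follows the standard T5 sequence-to-sequence interface:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repo id, following the model-index name above.
model_id = "hiba2/results_arat5-3_wiki"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder Arabic input; the Gen Len of ~19 above suggests the
# model was evaluated with short generation budgets.
inputs = tokenizer("ضع هنا النص العربي المراد معالجته", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```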
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
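
Expressed as `Seq2SeqTrainingArguments`, these settings would look roughly like the sketch below. The `output_dir` and `predict_with_generate` values are assumptions (the card does not state them); the Adam betas and epsilon listed above match the library defaults but are written out for clarity:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="results_arat5-3_wiki",  # assumed, not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,              # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    predict_with_generate=True,  # needed for ROUGE / Gen Len at eval time
)
```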
### Training results
| Training Loss | Epoch   | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 7.8936        | 0.9506  | 500   | 5.9107          | 0.0    | 0.0    | 0.0    | 0.0       | 0.0     |
| 5.5649        | 1.9011  | 1000  | 4.9336          | 0.0905 | 0.0    | 0.0915 | 0.0912    | 18.8876 |
| 4.9098        | 2.8517  | 1500  | 4.3731          | 0.0905 | 0.0    | 0.0915 | 0.0912    | 18.8989 |
| 4.4486        | 3.8023  | 2000  | 3.9340          | 0.0875 | 0.0    | 0.0885 | 0.0882    | 19.0    |
| 4.0755        | 4.7529  | 2500  | 3.5412          | 0.0881 | 0.0    | 0.0891 | 0.0887    | 18.9382 |
| 3.6998        | 5.7034  | 3000  | 3.1344          | 0.095  | 0.0002 | 0.0958 | 0.0954    | 18.8783 |
| 3.3129        | 6.6540  | 3500  | 2.8528          | 0.0935 | 0.0013 | 0.0945 | 0.094     | 18.8408 |
| 3.1053        | 7.6046  | 4000  | 2.6196          | 0.0936 | 0.0008 | 0.0946 | 0.0941    | 18.9382 |
| 2.8412        | 8.5551  | 4500  | 2.4414          | 0.091  | 0.0011 | 0.0919 | 0.0915    | 18.867  |
| 2.702         | 9.5057  | 5000  | 2.2952          | 0.0936 | 0.001  | 0.0948 | 0.0946    | 18.8783 |
| 2.5611        | 10.4563 | 5500  | 2.1816          | 0.093  | 0.0011 | 0.0941 | 0.0936    | 19.0    |
| 2.4499        | 11.4068 | 6000  | 2.0914          | 0.0988 | 0.0011 | 0.0995 | 0.099     | 18.8502 |
| 2.3764        | 12.3574 | 6500  | 2.0264          | 0.0992 | 0.0016 | 0.0997 | 0.0995    | 18.8371 |
| 2.3172        | 13.3080 | 7000  | 1.9853          | 0.098  | 0.0015 | 0.099  | 0.0986    | 18.9888 |
| 2.2794        | 14.2586 | 7500  | 1.9615          | 0.0971 | 0.0023 | 0.0977 | 0.0976    | 18.9888 |
| 2.2178        | 15.2091 | 8000  | 1.9424          | 0.0961 | 0.0009 | 0.0972 | 0.0968    | 19.0    |
| 2.2378        | 16.1597 | 8500  | 1.8855          | 0.0935 | 0.0011 | 0.0942 | 0.0937    | 19.0    |
| 2.1573        | 17.1103 | 9000  | 1.8386          | 0.0952 | 0.0009 | 0.0962 | 0.0958    | 19.0    |
| 2.132         | 18.0608 | 9500  | 1.8055          | 0.0919 | 0.0012 | 0.0929 | 0.0923    | 18.8783 |
| 2.1035        | 19.0114 | 10000 | 1.7863          | 0.0942 | 0.0015 | 0.0949 | 0.0945    | 19.0    |
| 2.0818        | 19.9620 | 10500 | 1.7821          | 0.0926 | 0.0015 | 0.0934 | 0.0928    | 19.0    |
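
For reference, scores like these are typically produced by a `compute_metrics` callback passed to `Seq2SeqTrainer`. The sketch below, using the `evaluate` library and the `tokenizer` from the usage example above, shows one common way to compute them; it is not taken from this card's actual training script:

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")

def compute_metrics(eval_preds):
    """Decode generated ids and references, then score with ROUGE."""
    preds, labels = eval_preds
    # Labels use -100 as padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    # "Gen Len" is the mean count of non-pad tokens in the generations.
    result["gen_len"] = float(
        np.mean([np.count_nonzero(np.array(p) != tokenizer.pad_token_id)
                 for p in preds])
    )
    return {k: round(v, 4) for k, v in result.items()}
```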
### Framework versions
- Transformers 4.42.0.dev0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1