|
---
base_model: hiba2/results_arat5-2_wiki
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: results_arat5-3_wiki
  results: []
---
|
|
|
|
|
|
# results_arat5-3_wiki |
|
|
|
This model is a fine-tuned version of [hiba2/results_arat5-2_wiki](https://huggingface.co/hiba2/results_arat5-2_wiki) on an unknown dataset.
It achieves the following results on the evaluation set (a short usage sketch follows the list):
- Loss: 1.7821
- Rouge1: 0.0926
- Rouge2: 0.0015
- RougeL: 0.0934
- RougeLsum: 0.0928
- Gen Len: 19.0
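
As a quick orientation, here is a minimal inference sketch. It assumes this repo hosts a standard T5-style (AraT5) seq2seq checkpoint together with its tokenizer; the input string is a hypothetical placeholder.

```python
# Minimal inference sketch, assuming a standard T5-style (AraT5) seq2seq
# checkpoint and tokenizer hosted under this repo id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "hiba2/results_arat5-3_wiki"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "نص عربي للتجربة"  # hypothetical placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# The Gen Len of 19.0 reported above suggests generation was capped at the
# legacy default (max_length=20); raise max_length for longer outputs.
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```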
|
|
|
## Model description |
|
|
|
More information needed |
|
|
|
## Intended uses & limitations |
|
|
|
More information needed |
|
|
|
## Training and evaluation data |
|
|
|
More information needed |
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training; a hedged `Seq2SeqTrainingArguments` sketch follows the list:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
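
For reproducibility, here is a hedged sketch of how those values map onto `Seq2SeqTrainingArguments`. The training script itself is not published, so `output_dir` and the evaluation cadence (the results table below logs every 500 steps) are assumptions.

```python
# Hedged sketch matching the hyperparameter list above; output_dir and the
# eval cadence are assumptions, not values confirmed by the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="results_arat5-3_wiki",   # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer,
    # so it needs no explicit flag here.
    predict_with_generate=True,  # required for the ROUGE / Gen Len columns
    eval_strategy="steps",       # assumed from the 500-step eval rows below
    eval_steps=500,
)
```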
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch   | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 7.8936        | 0.9506  | 500   | 5.9107          | 0.0    | 0.0    | 0.0    | 0.0       | 0.0     |
| 5.5649        | 1.9011  | 1000  | 4.9336          | 0.0905 | 0.0    | 0.0915 | 0.0912    | 18.8876 |
| 4.9098        | 2.8517  | 1500  | 4.3731          | 0.0905 | 0.0    | 0.0915 | 0.0912    | 18.8989 |
| 4.4486        | 3.8023  | 2000  | 3.9340          | 0.0875 | 0.0    | 0.0885 | 0.0882    | 19.0    |
| 4.0755        | 4.7529  | 2500  | 3.5412          | 0.0881 | 0.0    | 0.0891 | 0.0887    | 18.9382 |
| 3.6998        | 5.7034  | 3000  | 3.1344          | 0.095  | 0.0002 | 0.0958 | 0.0954    | 18.8783 |
| 3.3129        | 6.6540  | 3500  | 2.8528          | 0.0935 | 0.0013 | 0.0945 | 0.094     | 18.8408 |
| 3.1053        | 7.6046  | 4000  | 2.6196          | 0.0936 | 0.0008 | 0.0946 | 0.0941    | 18.9382 |
| 2.8412        | 8.5551  | 4500  | 2.4414          | 0.091  | 0.0011 | 0.0919 | 0.0915    | 18.867  |
| 2.702         | 9.5057  | 5000  | 2.2952          | 0.0936 | 0.001  | 0.0948 | 0.0946    | 18.8783 |
| 2.5611        | 10.4563 | 5500  | 2.1816          | 0.093  | 0.0011 | 0.0941 | 0.0936    | 19.0    |
| 2.4499        | 11.4068 | 6000  | 2.0914          | 0.0988 | 0.0011 | 0.0995 | 0.099     | 18.8502 |
| 2.3764        | 12.3574 | 6500  | 2.0264          | 0.0992 | 0.0016 | 0.0997 | 0.0995    | 18.8371 |
| 2.3172        | 13.3080 | 7000  | 1.9853          | 0.098  | 0.0015 | 0.099  | 0.0986    | 18.9888 |
| 2.2794        | 14.2586 | 7500  | 1.9615          | 0.0971 | 0.0023 | 0.0977 | 0.0976    | 18.9888 |
| 2.2178        | 15.2091 | 8000  | 1.9424          | 0.0961 | 0.0009 | 0.0972 | 0.0968    | 19.0    |
| 2.2378        | 16.1597 | 8500  | 1.8855          | 0.0935 | 0.0011 | 0.0942 | 0.0937    | 19.0    |
| 2.1573        | 17.1103 | 9000  | 1.8386          | 0.0952 | 0.0009 | 0.0962 | 0.0958    | 19.0    |
| 2.132         | 18.0608 | 9500  | 1.8055          | 0.0919 | 0.0012 | 0.0929 | 0.0923    | 18.8783 |
| 2.1035        | 19.0114 | 10000 | 1.7863          | 0.0942 | 0.0015 | 0.0949 | 0.0945    | 19.0    |
| 2.0818        | 19.9620 | 10500 | 1.7821          | 0.0926 | 0.0015 | 0.0934 | 0.0928    | 19.0    |
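
For reference, the ROUGE values above are the kind produced by the `evaluate` library's `rouge` metric; the exact evaluation code is not published, so the strings and the whitespace tokenizer below are illustrative assumptions.

```python
# Hedged sketch of computing ROUGE scores like those in the table above.
import evaluate

rouge = evaluate.load("rouge")  # requires the rouge_score package
predictions = ["النص المولد من النموذج"]  # hypothetical decoded model output
references = ["النص المرجعي"]             # hypothetical gold reference

# The default ROUGE tokenizer keeps only Latin alphanumerics, so Arabic text
# is usually scored with a custom tokenizer such as whitespace splitting.
scores = rouge.compute(
    predictions=predictions,
    references=references,
    tokenizer=lambda text: text.split(),
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```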
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.42.0.dev0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
|
|