---
base_model: hiba2/results_arat5-2_wiki
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: results_arat5-3_wiki
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# results_arat5-3_wiki
This model is a fine-tuned version of [hiba2/results_arat5-2_wiki](https://huggingface.co/hiba2/results_arat5-2_wiki) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.7821
- Rouge1: 0.0926
- Rouge2: 0.0015
- Rougel: 0.0934
- Rougelsum: 0.0928
- Gen Len: 19.0
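
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F-measure can be computed from unigram overlap. It is not the evaluation code used for this model: the function name `rouge1_f1` and the whitespace tokenization are assumptions made for illustration, and library implementations apply their own tokenization and optional stemming, so their scores can differ.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F-measure (illustrative; whitespace tokenization is an assumption)."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat", "the cat ran fast"))
```

A score near 0.09, as reported above, means that on average only a small fraction of unigrams is shared between generated and reference texts under the tokenization that was used.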
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
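
As a rough illustration of what a linear `lr_scheduler_type` implies for these settings, the sketch below computes the learning rate at a given optimizer step. Several things are assumptions rather than facts from this card: the helper name `linear_lr` is hypothetical, no warmup is assumed because none is listed, and the total step count is inferred from the training results table (step 500 at epoch 0.9506 suggests about 526 steps per epoch).

```python
BASE_LR = 5e-05   # learning_rate from the hyperparameter list
NUM_EPOCHS = 20   # num_epochs from the hyperparameter list

# Inferred, not stated in the card: the results table logs step 500 at epoch 0.9506.
STEPS_PER_EPOCH = round(500 / 0.9506)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 5000, 10500):
        print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate is close to zero by the last logged step (10500), which is consistent with the validation loss flattening out toward the end of training.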
### Training results
| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-------:|:-----:|:-------:|:---------------:|:------:|:------:|:------:|:---------:|
| 7.8936 | 0.9506 | 500 | 0.0 | 5.9107 | 0.0 | 0.0 | 0.0 | 0.0 |
| 5.5649 | 1.9011 | 1000 | 18.8876 | 4.9336 | 0.0905 | 0.0 | 0.0915 | 0.0912 |
| 4.9098 | 2.8517 | 1500 | 18.8989 | 4.3731 | 0.0905 | 0.0 | 0.0915 | 0.0912 |
| 4.4486 | 3.8023 | 2000 | 19.0 | 3.9340 | 0.0875 | 0.0 | 0.0885 | 0.0882 |
| 4.0755 | 4.7529 | 2500 | 18.9382 | 3.5412 | 0.0881 | 0.0 | 0.0891 | 0.0887 |
| 3.6998 | 5.7034 | 3000 | 18.8783 | 3.1344 | 0.095 | 0.0002 | 0.0958 | 0.0954 |
| 3.3129 | 6.6540 | 3500 | 18.8408 | 2.8528 | 0.0935 | 0.0013 | 0.0945 | 0.094 |
| 3.1053 | 7.6046 | 4000 | 18.9382 | 2.6196 | 0.0936 | 0.0008 | 0.0946 | 0.0941 |
| 2.8412 | 8.5551 | 4500 | 18.867 | 2.4414 | 0.091 | 0.0011 | 0.0919 | 0.0915 |
| 2.702 | 9.5057 | 5000 | 18.8783 | 2.2952 | 0.0936 | 0.001 | 0.0948 | 0.0946 |
| 2.5611 | 10.4563 | 5500 | 19.0 | 2.1816 | 0.093 | 0.0011 | 0.0941 | 0.0936 |
| 2.4499 | 11.4068 | 6000 | 18.8502 | 2.0914 | 0.0988 | 0.0011 | 0.0995 | 0.099 |
| 2.3764 | 12.3574 | 6500 | 18.8371 | 2.0264 | 0.0992 | 0.0016 | 0.0997 | 0.0995 |
| 2.3172 | 13.3080 | 7000 | 18.9888 | 1.9853 | 0.098 | 0.0015 | 0.099 | 0.0986 |
| 2.2794 | 14.2586 | 7500 | 18.9888 | 1.9615 | 0.0971 | 0.0023 | 0.0977 | 0.0976 |
| 2.2178 | 15.2091 | 8000 | 19.0 | 1.9424 | 0.0961 | 0.0009 | 0.0972 | 0.0968 |
| 2.2378 | 16.1597 | 8500 | 19.0 | 1.8855 | 0.0935 | 0.0011 | 0.0942 | 0.0937 |
| 2.1573 | 17.1103 | 9000 | 19.0 | 1.8386 | 0.0952 | 0.0009 | 0.0962 | 0.0958 |
| 2.132 | 18.0608 | 9500 | 18.8783 | 1.8055 | 0.0919 | 0.0012 | 0.0929 | 0.0923 |
| 2.1035 | 19.0114 | 10000 | 19.0 | 1.7863 | 0.0942 | 0.0015 | 0.0949 | 0.0945 |
| 2.0818 | 19.9620 | 10500 | 19.0 | 1.7821 | 0.0926 | 0.0015 | 0.0934 | 0.0928 |
### Framework versions
- Transformers 4.42.0.dev0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1