# beto2beto_tied_
This model is a fine-tuned version of dccuchile/bert-base-spanish-wwm-cased on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2598
- Rouge1: 89.2337
- Rouge2: 82.4747
- Rougel: 87.8665
- Rougelsum: 88.2241
- Gen Len: 44.8081
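The ROUGE metrics and generation length suggest a sequence-to-sequence checkpoint, and the `beto2beto_tied_` name suggests a tied-weight BERT2BERT built on BETO. A minimal usage sketch under those assumptions (the repo id and the `EncoderDecoderModel` loading path are not confirmed by this card):

```python
# Hedged usage sketch. Assumptions: the checkpoint is a tied-weight
# BERT2BERT EncoderDecoderModel (suggested by the "beto2beto_tied_" name),
# and "user/beto2beto_tied_" stands in for the real repo id.
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")
model = EncoderDecoderModel.from_pretrained("user/beto2beto_tied_")  # hypothetical repo id

text = "Texto de entrada en español."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```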
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
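A sketch mapping these values onto `Seq2SeqTrainingArguments`; only the values listed above come from this card, while `output_dir` and `predict_with_generate` are assumptions:

```python
# Sketch of Seq2SeqTrainingArguments matching the listed hyperparameters.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="beto2beto_tied_",    # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # Trainer's Adam defaults,
    adam_beta2=0.999,                # matching the card
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20.0,
    predict_with_generate=True,      # assumption: needed for ROUGE evaluation
)
```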
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 4.1482 | 1.0 | 970 | 2.9847 | 43.2986 | 18.5008 | 31.2325 | 32.3987 | 39.1515 |
| 2.0469 | 2.0 | 1940 | 0.9035 | 76.414 | 63.7537 | 73.4939 | 73.7343 | 43.6667 |
| 0.5005 | 3.0 | 2910 | 0.4559 | 84.5357 | 75.751 | 82.546 | 82.9252 | 45.0202 |
| 0.2288 | 4.0 | 3880 | 0.3991 | 86.1217 | 78.448 | 84.6786 | 85.0375 | 44.6465 |
| 0.1614 | 5.0 | 4850 | 0.3616 | 87.1616 | 80.0997 | 85.8677 | 86.2772 | 43.8990 |
| 0.1203 | 6.0 | 5820 | 0.3152 | 87.6801 | 80.9719 | 85.8049 | 86.2797 | 44.2828 |
| 0.1015 | 7.0 | 6790 | 0.2730 | 89.0506 | 82.7508 | 87.8147 | 88.1609 | 44.3232 |
| 0.0845 | 8.0 | 7760 | 0.3020 | 88.0917 | 81.1925 | 86.8684 | 87.1056 | 45.0606 |
| 0.0735 | 9.0 | 8730 | 0.2817 | 88.9092 | 82.7949 | 87.8403 | 88.0972 | 44.2525 |
| 0.0639 | 10.0 | 9700 | 0.2741 | 88.9576 | 83.3882 | 88.0209 | 88.1885 | 44.2424 |
| 0.0575 | 11.0 | 10670 | 0.2676 | 88.3211 | 81.5339 | 86.8743 | 87.2136 | 44.8889 |
| 0.051 | 12.0 | 11640 | 0.2653 | 88.3985 | 81.8103 | 87.1116 | 87.4471 | 44.7879 |
| 0.0443 | 13.0 | 12610 | 0.2802 | 88.5347 | 82.0431 | 87.0133 | 87.3694 | 45.2323 |
| 0.0403 | 14.0 | 13580 | 0.2918 | 88.5383 | 82.0573 | 87.3892 | 87.6552 | 43.9495 |
| 0.0361 | 15.0 | 14550 | 0.2715 | 89.0545 | 82.4233 | 87.6459 | 87.9061 | 45.2626 |
| 0.0323 | 16.0 | 15520 | 0.2829 | 88.6251 | 82.3048 | 87.531 | 87.8254 | 44.5859 |
| 0.0273 | 17.0 | 16490 | 0.2689 | 89.1621 | 82.8361 | 87.6537 | 87.9574 | 44.8687 |
| 0.0254 | 18.0 | 17460 | 0.2611 | 88.9807 | 82.198 | 87.6945 | 87.9456 | 45.2929 |
| 0.0229 | 19.0 | 18430 | 0.2701 | 89.432 | 82.6393 | 88.0695 | 88.4232 | 44.6869 |
| 0.0213 | 20.0 | 19400 | 0.2598 | 89.2337 | 82.4747 | 87.8665 | 88.2241 | 44.8081 |
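A sketch of how per-epoch ROUGE numbers like those above could be computed with the `evaluate` library; the predictions and references are placeholders, since the evaluation dataset is not documented on this card:

```python
# Hedged ROUGE evaluation sketch using the `evaluate` library.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["resumen generado por el modelo"],  # placeholder model output
    references=["resumen de referencia"],            # placeholder target
)
# Recent versions of `evaluate` return fractions in [0, 1];
# the table above reports percentages.
print({k: round(v * 100, 4) for k, v in scores.items()})
```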
### Framework versions
- Transformers 4.28.0.dev0
- Pytorch 1.13.1+cu117
- Datasets 2.9.0
- Tokenizers 0.13.2