---
license: apache-2.0
base_model: google/flan-t5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flan-t5-base-YT-transcript-sum
  results: []
---

# flan-t5-base-YT-transcript-sum

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.4111
- Rouge1: 25.4013
- Rouge2: 12.4728
- RougeL: 21.5206
- RougeLsum: 23.6322
- Gen Len: 19.0
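
For readers unfamiliar with the metrics above: Rouge1 is a unigram-overlap F-measure between a generated summary and its reference (Rouge2 uses bigrams, RougeL the longest common subsequence). A minimal pure-Python sketch of the Rouge1 idea, assuming plain whitespace tokenization (the reported scores come from the full `rouge` metric implementation, which tokenizes more carefully):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: F-measure over unigram overlap between prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped overlap: each token counts at most as often as it appears in either side.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# All 3 predicted unigrams match, but only 3 of 6 reference unigrams are covered:
print(round(100 * rouge1_f1("the cat sat", "the cat sat on the mat"), 2))  # → 66.67
```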

## Model description

More information needed. The repository name suggests the model was fine-tuned to summarize YouTube video transcripts.

## Intended uses & limitations

More information needed
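
No usage example is provided in the card; the following is a minimal inference sketch, assuming the checkpoint is published on the Hub under this repository name (substitute the actual `user/repo` id). It is not tested against the actual checkpoint:

```python
from transformers import pipeline

# Hypothetical model id -- replace with the real Hub path of this checkpoint.
summarizer = pipeline("summarization", model="flan-t5-base-YT-transcript-sum")

transcript = "..."  # paste a YouTube transcript here
summary = summarizer(transcript, max_length=64, truncation=True)[0]["summary_text"]
print(summary)
```

Note that the evaluation reports Gen Len 19.0, so without overriding `max_length` the generated summaries will be short.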

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
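
`lr_scheduler_type: linear` means the learning rate decays linearly from 5e-05 to 0 over training. A minimal sketch of that schedule, assuming zero warmup steps and the 3240 total optimization steps implied by 15 epochs of 216 steps each (see the results table):

```python
def linear_lr(step: int, base_lr: float = 5e-5,
              total_steps: int = 3240, warmup_steps: int = 0) -> float:
    """Linear schedule: ramp up over warmup_steps, then decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_lr(0))     # 5e-05 at the first step
print(linear_lr(1620))  # 2.5e-05 halfway through
print(linear_lr(3240))  # 0.0 at the end
```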

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 216  | 1.5817          | 23.8566 | 11.0314 | 20.1664 | 22.2953   | 18.9954 |
| No log        | 2.0   | 432  | 1.4907          | 24.2446 | 11.6603 | 20.6712 | 22.4196   | 18.9861 |
| 1.7643        | 3.0   | 648  | 1.4510          | 25.4355 | 12.9236 | 21.584  | 23.7272   | 19.0    |
| 1.7643        | 4.0   | 864  | 1.4312          | 24.8929 | 12.5927 | 21.3295 | 23.3504   | 19.0    |
| 1.4359        | 5.0   | 1080 | 1.4145          | 25.242  | 12.9269 | 21.6351 | 23.6509   | 19.0    |
| 1.4359        | 6.0   | 1296 | 1.4111          | 25.4013 | 12.4728 | 21.5206 | 23.6322   | 19.0    |
| 1.2819        | 7.0   | 1512 | 1.4135          | 25.6542 | 13.103  | 22.2059 | 23.9474   | 19.0    |
| 1.2819        | 8.0   | 1728 | 1.4145          | 26.0783 | 13.7584 | 22.343  | 24.3255   | 19.0    |
| 1.2819        | 9.0   | 1944 | 1.4163          | 25.4385 | 13.1278 | 21.7173 | 23.8295   | 18.9861 |
| 1.1688        | 10.0  | 2160 | 1.4208          | 25.7625 | 13.5586 | 22.2246 | 24.2042   | 19.0    |
| 1.1688        | 11.0  | 2376 | 1.4165          | 25.5482 | 13.1163 | 21.9475 | 23.8181   | 18.9907 |
| 1.0951        | 12.0  | 2592 | 1.4215          | 25.7614 | 13.5565 | 22.1965 | 24.0657   | 19.0    |
| 1.0951        | 13.0  | 2808 | 1.4285          | 26.3345 | 14.2027 | 22.7422 | 24.6261   | 18.9907 |
| 1.0549        | 14.0  | 3024 | 1.4277          | 25.8835 | 13.8044 | 22.3845 | 24.269    | 19.0    |
| 1.0549        | 15.0  | 3240 | 1.4321          | 25.8292 | 13.7231 | 22.3506 | 24.3188   | 19.0    |

### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.13.3