---
license: openrail
library_name: peft
tags:
- generated_from_trainer
base_model: VietAI/envit5-translation
metrics:
- bleu
model-index:
- name: envit5-MedEV
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# envit5-MedEV
This model is a fine-tuned version of [VietAI/envit5-translation](https://huggingface.co/VietAI/envit5-translation). The training dataset was not recorded by the Trainer; the model name and the reported test set suggest MedEV.
It achieves the following results on the evaluation set:
- Loss: 0.0795
- Bleu: 47.903 on the MedEV test set (up from 44.8343)
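
Since the metadata lists `peft` as the library and `VietAI/envit5-translation` as the base model, loading would presumably mean attaching the adapter to that base model. The following is a minimal sketch under stated assumptions, not a tested recipe for this checkpoint: `ADAPTER_ID` is a hypothetical placeholder (this card does not state the repository id), and the `en: ` / `vi: ` input prefix is assumed to carry over from the base model and should be checked against its card.

```python
BASE_MODEL_ID = "VietAI/envit5-translation"  # from this card's metadata
ADAPTER_ID = "your-namespace/envit5-MedEV"   # hypothetical placeholder, not a real repo id


def add_language_prefix(text: str, source_lang: str) -> str:
    """Prepend the source-language tag (assumed convention: 'en: ' or 'vi: ')."""
    if source_lang not in ("en", "vi"):
        raise ValueError("source_lang must be 'en' or 'vi'")
    return f"{source_lang}: {text}"


def translate(texts, source_lang, max_length=512):
    """Translate a list of sentences with the base model plus the PEFT adapter."""
    # Imported lazily so the prefix helper works without these packages installed.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(
        [add_language_prefix(t, source_lang) for t in texts],
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_length,
    )
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `translate(["The patient was discharged."], "en")` would download the base model and the adapter, so `ADAPTER_ID` must first be replaced with the actual repository id.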
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 5
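
To make the relationship between these values concrete, the sketch below reproduces the `total_train_batch_size` arithmetic and a cosine schedule with linear warmup. It is an illustration only: it assumes a single device, takes the last logged step (26600) as the total step count, and uses the common "linear warmup, then half-cosine decay to zero" form, which may differ in detail from what the Trainer actually applied.

```python
import math

LEARNING_RATE = 2e-5     # learning_rate
TRAIN_BATCH_SIZE = 32    # train_batch_size
GRAD_ACCUM_STEPS = 4     # gradient_accumulation_steps
WARMUP_STEPS = 10        # lr_scheduler_warmup_steps
TOTAL_STEPS = 26_600     # assumption: last step in the training log


def effective_batch_size(batch_size, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return batch_size * accumulation_steps * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 5, 10, 13_300, 26_600):
        print(step, cosine_lr(step))
```

With only 10 warmup steps, the learning rate reaches its peak almost immediately and then decays for the rest of training.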
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 33.2165 | 0.1314 | 700 | 0.5906 | 0.0653 |
| 0.4083 | 0.2628 | 1400 | 0.1096 | 13.8606 |
| 0.114 | 0.3942 | 2100 | 0.0918 | 14.7674 |
| 0.1027 | 0.5256 | 2800 | 0.0890 | 14.9410 |
| 0.0997 | 0.6571 | 3500 | 0.0873 | 15.0741 |
| 0.0973 | 0.7885 | 4200 | 0.0861 | 15.1717 |
| 0.0964 | 0.9199 | 4900 | 0.0852 | 15.2362 |
| 0.0949 | 1.0513 | 5600 | 0.0844 | 15.3131 |
| 0.0947 | 1.1827 | 6300 | 0.0838 | 15.3815 |
| 0.0937 | 1.3141 | 7000 | 0.0832 | 15.5075 |
| 0.0935 | 1.4455 | 7700 | 0.0827 | 15.5932 |
| 0.092 | 1.5769 | 8400 | 0.0822 | 15.6434 |
| 0.0924 | 1.7084 | 9100 | 0.0818 | 15.7233 |
| 0.0915 | 1.8398 | 9800 | 0.0815 | 15.8051 |
| 0.0915 | 1.9712 | 10500 | 0.0812 | 15.8279 |
| 0.0906 | 2.1026 | 11200 | 0.0809 | 15.8559 |
| 0.0904 | 2.2340 | 11900 | 0.0807 | 15.9008 |
| 0.0908 | 2.3654 | 12600 | 0.0805 | 15.8917 |
| 0.0904 | 2.4968 | 13300 | 0.0803 | 15.9352 |
| 0.0895 | 2.6282 | 14000 | 0.0802 | 15.9442 |
| 0.0896 | 2.7597 | 14700 | 0.0800 | 15.9677 |
| 0.0894 | 2.8911 | 15400 | 0.0800 | 15.9459 |
| 0.09 | 3.0225 | 16100 | 0.0799 | 15.9746 |
| 0.0895 | 3.1539 | 16800 | 0.0798 | 16.0154 |
| 0.0892 | 3.2853 | 17500 | 0.0797 | 15.9976 |
| 0.0896 | 3.4167 | 18200 | 0.0797 | 16.0193 |
| 0.0893 | 3.5481 | 18900 | 0.0796 | 16.0179 |
| 0.0888 | 3.6795 | 19600 | 0.0796 | 16.0510 |
| 0.0887 | 3.8110 | 20300 | 0.0796 | 16.0226 |
| 0.0891 | 3.9424 | 21000 | 0.0796 | 16.0277 |
| 0.0892 | 4.0738 | 21700 | 0.0796 | 16.0302 |
| 0.0892 | 4.2052 | 22400 | 0.0795 | 16.0425 |
| 0.0886 | 4.3366 | 23100 | 0.0795 | 16.0452 |
| 0.0889 | 4.4680 | 23800 | 0.0795 | 16.0518 |
| 0.0888 | 4.5994 | 24500 | 0.0795 | 16.0397 |
| 0.0893 | 4.7308 | 25200 | 0.0795 | 16.0450 |
| 0.0889 | 4.8623 | 25900 | 0.0795 | 16.0497 |
| 0.0887 | 4.9937 | 26600 | 0.0795 | 16.0497 |
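
The Epoch and Step columns above can be used to estimate how many optimizer steps make up one epoch, and from that the approximate training set size. The sketch below uses four rows copied from the table; the size estimate assumes each step consumes `total_train_batch_size` (128) examples, so treat the result as a rough figure rather than a reported dataset statistic.

```python
# (epoch, step) pairs copied from the training results table
LOGGED_ROWS = [
    (0.1314, 700),
    (1.0513, 5600),
    (2.1026, 11200),
    (4.9937, 26600),
]
TOTAL_TRAIN_BATCH_SIZE = 128  # from the hyperparameter list


def steps_per_epoch(rows):
    """Average step/epoch ratio across the logged rows."""
    return sum(step / epoch for epoch, step in rows) / len(rows)


def approx_examples_per_epoch(rows, batch_size):
    """Rough training set size, assuming batch_size examples per optimizer step."""
    return round(steps_per_epoch(rows) * batch_size)


if __name__ == "__main__":
    print("steps per epoch (estimate):", round(steps_per_epoch(LOGGED_ROWS)))
    print("examples per epoch (estimate):",
          approx_examples_per_epoch(LOGGED_ROWS, TOTAL_TRAIN_BATCH_SIZE))
```

Under these assumptions the estimate comes out a little over 5,300 steps per epoch, which corresponds to a training set on the order of 680,000 examples.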
### Framework versions
- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1