---
license: llama3
library_name: peft
tags:
- generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
- name: Llama3_devops
  results: []
---

# Llama3_devops

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3252
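
This repository ships a PEFT adapter on top of the base model rather than full fine-tuned weights, so inference loads `meta-llama/Meta-Llama-3-8B-Instruct` first and then applies the adapter. Below is a minimal usage sketch; the adapter path and the example prompt are illustrative placeholders, not values taken from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "Llama3_devops"  # hypothetical: local path or Hub repo id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

# Example DevOps-flavored prompt (illustrative only)
messages = [{"role": "user", "content": "Write a Dockerfile for a Python 3.11 Flask app."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```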

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- lr_scheduler_warmup_steps: 100
- training_steps: 12001
- mixed_precision_training: Native AMP
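
For reference, the list above maps onto a `transformers` `TrainingArguments` configuration roughly like the sketch below. The training script itself is not included in this card, so treat this as an approximation; in particular, "Native AMP" could correspond to either `fp16` or `bf16`.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Llama3_devops",      # hypothetical output directory
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,   # effective train batch size of 4
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    warmup_steps=100,                # when set, this takes precedence over warmup_ratio
    max_steps=12001,
    fp16=True,                       # "Native AMP"; bf16=True is equally plausible
    # The default AdamW optimizer already uses betas=(0.9, 0.999) and eps=1e-8.
)
```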

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
1.4137 | 0.0612 | 100 | 1.7127 |
1.4861 | 0.1224 | 200 | 1.6550 |
1.3809 | 0.1837 | 300 | 1.6320 |
1.6918 | 0.2449 | 400 | 1.6155 |
1.5341 | 0.3061 | 500 | 1.6085 |
1.326 | 0.3673 | 600 | 1.6069 |
1.4157 | 0.4285 | 700 | 1.6039 |
1.477 | 0.4897 | 800 | 1.5980 |
2.091 | 0.5510 | 900 | 1.5930 |
1.4464 | 0.6122 | 1000 | 1.5901 |
1.5648 | 0.6734 | 1100 | 1.5888 |
1.7804 | 0.7346 | 1200 | 1.5885 |
1.7443 | 0.7958 | 1300 | 1.5874 |
1.721 | 0.8571 | 1400 | 1.5850 |
1.5615 | 0.9183 | 1500 | 1.5828 |
1.5138 | 0.9795 | 1600 | 1.5816 |
2.0057 | 1.0407 | 1700 | 1.5811 |
1.6474 | 1.1019 | 1800 | 1.5811 |
1.8227 | 1.1635 | 1900 | 1.5812 |
1.3724 | 1.2247 | 2000 | 1.5799 |
1.2722 | 1.2859 | 2100 | 1.5790 |
1.5611 | 1.3471 | 2200 | 1.5784 |
1.5327 | 1.4083 | 2300 | 1.5782 |
1.5264 | 1.4695 | 2400 | 1.5782 |
1.5766 | 1.5308 | 2500 | 1.5779 |
1.7018 | 1.5920 | 2600 | 1.5772 |
1.201 | 1.6532 | 2700 | 1.5765 |
1.4864 | 1.7144 | 2800 | 1.5762 |
1.2907 | 1.7756 | 2900 | 1.5760 |
1.6052 | 1.8369 | 3000 | 1.5760 |
1.3841 | 1.3711 | 3100 | 1.3650 |
1.3509 | 1.4153 | 3200 | 1.3555 |
1.349 | 1.4595 | 3300 | 1.3518 |
1.4748 | 1.5038 | 3400 | 1.3499 |
1.0276 | 1.5480 | 3500 | 1.3492 |
1.3901 | 1.5922 | 3600 | 1.3491 |
1.2557 | 1.6364 | 3700 | 1.3447 |
1.146 | 1.6807 | 3800 | 1.3422 |
1.3166 | 1.7249 | 3900 | 1.3408 |
1.4498 | 1.7691 | 4000 | 1.3401 |
1.2284 | 1.8134 | 4100 | 1.3399 |
1.2182 | 1.8576 | 4200 | 1.3398 |
1.2163 | 1.9018 | 4300 | 1.3379 |
1.2242 | 1.9460 | 4400 | 1.3367 |
1.2829 | 1.9903 | 4500 | 1.3360 |
1.214 | 2.0345 | 4600 | 1.3356 |
1.2161 | 2.0787 | 4700 | 1.3355 |
1.2942 | 2.1230 | 4800 | 1.3355 |
1.2288 | 2.1672 | 4900 | 1.3343 |
1.3177 | 2.2114 | 5000 | 1.3337 |
1.3833 | 2.2556 | 5100 | 1.3332 |
1.658 | 2.2999 | 5200 | 1.3329 |
1.3888 | 2.3441 | 5300 | 1.3329 |
1.3027 | 2.3883 | 5400 | 1.3328 |
1.4974 | 2.4326 | 5500 | 1.3321 |
1.1546 | 2.4768 | 5600 | 1.3316 |
1.2156 | 2.5210 | 5700 | 1.3313 |
1.3549 | 2.5652 | 5800 | 1.3311 |
1.3213 | 2.6095 | 5900 | 1.3310 |
1.3492 | 2.6537 | 6000 | 1.3310 |
1.3454 | 2.6979 | 6100 | 1.3306 |
1.4238 | 2.7421 | 6200 | 1.3302 |
1.4476 | 2.7864 | 6300 | 1.3299 |
1.2525 | 2.8306 | 6400 | 1.3298 |
1.343 | 2.8748 | 6500 | 1.3298 |
1.3299 | 2.9191 | 6600 | 1.3298 |
1.4081 | 2.9633 | 6700 | 1.3293 |
1.4621 | 3.0075 | 6800 | 1.3290 |
1.0876 | 3.0517 | 6900 | 1.3289 |
1.3061 | 3.0960 | 7000 | 1.3288 |
1.2202 | 3.1402 | 7100 | 1.3287 |
1.3105 | 3.1844 | 7200 | 1.3287 |
1.3631 | 3.2287 | 7300 | 1.3284 |
1.3136 | 3.2729 | 7400 | 1.3282 |
1.442 | 3.3171 | 7500 | 1.3281 |
1.3141 | 3.3613 | 7600 | 1.3280 |
1.3445 | 3.4056 | 7700 | 1.3280 |
1.2843 | 3.4498 | 7800 | 1.3279 |
1.342 | 3.4940 | 7900 | 1.3277 |
1.2877 | 3.5383 | 8000 | 1.3275 |
1.4434 | 3.5825 | 8100 | 1.3274 |
1.2827 | 3.6267 | 8200 | 1.3273 |
1.1758 | 3.6709 | 8300 | 1.3273 |
1.3382 | 3.7152 | 8400 | 1.3273 |
1.2126 | 3.7594 | 8500 | 1.3271 |
1.4859 | 3.8036 | 8600 | 1.3270 |
1.1627 | 3.8479 | 8700 | 1.3269 |
1.5215 | 3.8921 | 8800 | 1.3268 |
1.6232 | 3.9363 | 8900 | 1.3268 |
1.3434 | 3.9805 | 9000 | 1.3268 |
1.1927 | 4.0248 | 9100 | 1.3267 |
1.2415 | 4.0690 | 9200 | 1.3265 |
1.1639 | 4.1132 | 9300 | 1.3264 |
1.2402 | 4.1575 | 9400 | 1.3264 |
1.295 | 4.2017 | 9500 | 1.3264 |
1.1189 | 4.2459 | 9600 | 1.3264 |
1.2794 | 4.2901 | 9700 | 1.3263 |
1.1904 | 4.3344 | 9800 | 1.3261 |
1.1547 | 4.3786 | 9900 | 1.3261 |
1.3298 | 4.4228 | 10000 | 1.3260 |
1.1915 | 4.4670 | 10100 | 1.3260 |
1.2256 | 4.5113 | 10200 | 1.3260 |
1.3068 | 4.5555 | 10300 | 1.3259 |
1.5124 | 4.5997 | 10400 | 1.3258 |
1.3894 | 4.6440 | 10500 | 1.3258 |
1.1934 | 4.6882 | 10600 | 1.3257 |
1.2746 | 4.7324 | 10700 | 1.3257 |
1.2689 | 4.7766 | 10800 | 1.3257 |
1.3315 | 4.8209 | 10900 | 1.3256 |
1.4784 | 4.8651 | 11000 | 1.3255 |
1.2925 | 4.9093 | 11100 | 1.3255 |
1.2004 | 4.9536 | 11200 | 1.3254 |
1.4289 | 4.9978 | 11300 | 1.3254 |
1.354 | 5.0420 | 11400 | 1.3254 |
1.1891 | 5.0862 | 11500 | 1.3253 |
1.3498 | 5.1305 | 11600 | 1.3253 |
1.3814 | 5.1747 | 11700 | 1.3252 |
1.4559 | 5.2189 | 11800 | 1.3252 |
1.2006 | 5.2632 | 11900 | 1.3252 |
1.3107 | 5.3074 | 12000 | 1.3252 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.19.1
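
Because only an adapter is published, the fine-tuned weights can optionally be merged into the base model for standalone deployment. The sketch below assumes a LoRA-style adapter, which the card does not state explicitly.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16
)
# merge_and_unload() folds the adapter deltas into the base weights (LoRA-style adapters).
merged = PeftModel.from_pretrained(base, "Llama3_devops").merge_and_unload()
merged.save_pretrained("Llama3_devops-merged")  # hypothetical output path
```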