# Llama-3.2-3B-lora-rps-adapter
This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.4600

Note that this is the loss of the final checkpoint; per the training results below, validation loss reached its minimum (about 0.383) near epoch 4.8 and rose over the remaining epochs.
## Model description
More information needed
## Intended uses & limitations
More information needed
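In the absence of fuller documentation, here is a minimal inference sketch. It assumes the adapter is published as `SimonMA/Llama-3.2-3B-lora-rps-adapter` on top of the `meta-llama/Llama-3.2-3B-Instruct` base; the prompt, dtype, and generation settings are illustrative, not prescriptive.

```python
# Minimal inference sketch. Repo ids are taken from this card; dtype,
# device placement, prompt, and generation settings are illustrative.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "SimonMA/Llama-3.2-3B-lora-rps-adapter"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights

messages = [{"role": "user", "content": "Hello!"}]  # illustrative prompt
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If a standalone checkpoint is preferred, `model = model.merge_and_unload()` folds the LoRA weights into the base model.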
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 8
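For reproducibility, the list above maps onto `transformers.TrainingArguments` roughly as follows. Only the listed fields come from this card; `output_dir` and the evaluation cadence are placeholders (though the results table below does log evaluation every 100 steps).

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters listed above; output_dir and the
# eval cadence are placeholders, not taken from this card.
training_args = TrainingArguments(
    output_dir="llama-3.2-3b-lora-rps",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    num_train_epochs=8,
    eval_strategy="steps",  # the table below logs eval every 100 steps
    eval_steps=100,
)
```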
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
0.1498 | 4.1213 | 25000 | 0.3963 |
0.1593 | 4.1378 | 25100 | 0.3952 |
0.1645 | 4.1543 | 25200 | 0.3958 |
0.1665 | 4.1708 | 25300 | 0.3943 |
0.1602 | 4.1873 | 25400 | 0.3938 |
0.1604 | 4.2038 | 25500 | 0.3921 |
0.1722 | 4.2202 | 25600 | 0.3916 |
0.1812 | 4.2367 | 25700 | 0.3915 |
0.1658 | 4.2532 | 25800 | 0.4006 |
0.1677 | 4.2697 | 25900 | 0.3978 |
0.1749 | 4.2862 | 26000 | 0.3941 |
0.1819 | 4.3027 | 26100 | 0.3924 |
0.1747 | 4.3192 | 26200 | 0.3941 |
0.1602 | 4.3356 | 26300 | 0.3945 |
0.1659 | 4.3521 | 26400 | 0.3957 |
0.1692 | 4.3686 | 26500 | 0.3951 |
0.1759 | 4.3851 | 26600 | 0.3945 |
0.1714 | 4.4016 | 26700 | 0.3934 |
0.1678 | 4.4181 | 26800 | 0.3925 |
0.1604 | 4.4346 | 26900 | 0.3947 |
0.1694 | 4.4510 | 27000 | 0.3981 |
0.1761 | 4.4675 | 27100 | 0.3931 |
0.189 | 4.4840 | 27200 | 0.3926 |
0.1892 | 4.5005 | 27300 | 0.3933 |
0.1713 | 4.5170 | 27400 | 0.3941 |
0.1693 | 4.5335 | 27500 | 0.3941 |
0.1721 | 4.5500 | 27600 | 0.3932 |
0.187 | 4.5664 | 27700 | 0.3932 |
0.1684 | 4.5829 | 27800 | 0.3960 |
0.1851 | 4.5994 | 27900 | 0.3933 |
0.169 | 4.6159 | 28000 | 0.3931 |
0.1675 | 4.6324 | 28100 | 0.3895 |
0.177 | 4.6489 | 28200 | 0.3916 |
0.183 | 4.6653 | 28300 | 0.3915 |
0.1834 | 4.6818 | 28400 | 0.3839 |
0.1839 | 4.6983 | 28500 | 0.3863 |
0.1785 | 4.7148 | 28600 | 0.3889 |
0.1762 | 4.7313 | 28700 | 0.3859 |
0.1805 | 4.7478 | 28800 | 0.3878 |
0.1616 | 4.7643 | 28900 | 0.3866 |
0.1796 | 4.7807 | 29000 | 0.3832 |
0.1797 | 4.7972 | 29100 | 0.3860 |
0.1641 | 4.8137 | 29200 | 0.3856 |
0.1844 | 4.8302 | 29300 | 0.3855 |
0.1736 | 4.8467 | 29400 | 0.3861 |
0.163 | 4.8632 | 29500 | 0.3850 |
0.2074 | 4.8797 | 29600 | 0.3880 |
0.1709 | 4.8961 | 29700 | 0.3884 |
0.1682 | 4.9126 | 29800 | 0.3855 |
0.1811 | 4.9291 | 29900 | 0.3883 |
0.1671 | 4.9456 | 30000 | 0.3872 |
0.1796 | 4.9621 | 30100 | 0.3863 |
0.1646 | 4.9786 | 30200 | 0.3853 |
0.1624 | 4.9951 | 30300 | 0.3872 |
0.1317 | 5.0115 | 30400 | 0.4029 |
0.1126 | 5.0280 | 30500 | 0.4068 |
0.1344 | 5.0445 | 30600 | 0.4076 |
0.1202 | 5.0610 | 30700 | 0.4078 |
0.1267 | 5.0775 | 30800 | 0.4077 |
0.1288 | 5.0940 | 30900 | 0.4058 |
0.1216 | 5.1105 | 31000 | 0.4117 |
0.1142 | 5.1269 | 31100 | 0.4109 |
0.1221 | 5.1434 | 31200 | 0.4053 |
0.1234 | 5.1599 | 31300 | 0.4092 |
0.1232 | 5.1764 | 31400 | 0.4098 |
0.1269 | 5.1929 | 31500 | 0.4102 |
0.1169 | 5.2094 | 31600 | 0.4068 |
0.1385 | 5.2258 | 31700 | 0.4055 |
0.1163 | 5.2423 | 31800 | 0.4106 |
0.1233 | 5.2588 | 31900 | 0.4075 |
0.116 | 5.2753 | 32000 | 0.4088 |
0.1336 | 5.2918 | 32100 | 0.4045 |
0.1167 | 5.3083 | 32200 | 0.4101 |
0.1202 | 5.3248 | 32300 | 0.4076 |
0.1229 | 5.3412 | 32400 | 0.4091 |
0.1213 | 5.3577 | 32500 | 0.4078 |
0.1316 | 5.3742 | 32600 | 0.4075 |
0.1245 | 5.3907 | 32700 | 0.4067 |
0.1208 | 5.4072 | 32800 | 0.4083 |
0.1281 | 5.4237 | 32900 | 0.4089 |
0.1214 | 5.4402 | 33000 | 0.4094 |
0.1149 | 5.4566 | 33100 | 0.4072 |
0.1218 | 5.4731 | 33200 | 0.4060 |
0.1178 | 5.4896 | 33300 | 0.4079 |
0.1272 | 5.5061 | 33400 | 0.4057 |
0.1258 | 5.5226 | 33500 | 0.4080 |
0.1213 | 5.5391 | 33600 | 0.4089 |
0.1161 | 5.5556 | 33700 | 0.4121 |
0.1325 | 5.5720 | 33800 | 0.4057 |
0.1219 | 5.5885 | 33900 | 0.4083 |
0.1247 | 5.6050 | 34000 | 0.4074 |
0.1233 | 5.6215 | 34100 | 0.4084 |
0.1211 | 5.6380 | 34200 | 0.4091 |
0.1315 | 5.6545 | 34300 | 0.4090 |
0.1183 | 5.6710 | 34400 | 0.4084 |
0.1256 | 5.6874 | 34500 | 0.4088 |
0.1168 | 5.7039 | 34600 | 0.4079 |
0.1394 | 5.7204 | 34700 | 0.4050 |
0.124 | 5.7369 | 34800 | 0.4065 |
0.1299 | 5.7534 | 34900 | 0.4052 |
0.1152 | 5.7699 | 35000 | 0.4039 |
0.138 | 5.7864 | 35100 | 0.4050 |
0.1137 | 5.8028 | 35200 | 0.4073 |
0.1284 | 5.8193 | 35300 | 0.4027 |
0.1192 | 5.8358 | 35400 | 0.4045 |
0.1358 | 5.8523 | 35500 | 0.4051 |
0.1262 | 5.8688 | 35600 | 0.4035 |
0.1289 | 5.8853 | 35700 | 0.4049 |
0.1296 | 5.9017 | 35800 | 0.4059 |
0.1319 | 5.9182 | 35900 | 0.4051 |
0.1259 | 5.9347 | 36000 | 0.4025 |
0.1217 | 5.9512 | 36100 | 0.4068 |
0.1127 | 5.9677 | 36200 | 0.4058 |
0.1216 | 5.9842 | 36300 | 0.4020 |
0.1279 | 6.0007 | 36400 | 0.4037 |
0.0806 | 6.0171 | 36500 | 0.4302 |
0.0795 | 6.0336 | 36600 | 0.4248 |
0.0861 | 6.0501 | 36700 | 0.4310 |
0.0891 | 6.0666 | 36800 | 0.4311 |
0.0771 | 6.0831 | 36900 | 0.4324 |
0.0757 | 6.0996 | 37000 | 0.4304 |
0.0777 | 6.1161 | 37100 | 0.4297 |
0.0753 | 6.1325 | 37200 | 0.4281 |
0.0822 | 6.1490 | 37300 | 0.4284 |
0.0799 | 6.1655 | 37400 | 0.4320 |
0.0915 | 6.1820 | 37500 | 0.4298 |
0.0772 | 6.1985 | 37600 | 0.4291 |
0.0797 | 6.2150 | 37700 | 0.4269 |
0.0854 | 6.2315 | 37800 | 0.4307 |
0.0838 | 6.2479 | 37900 | 0.4309 |
0.0935 | 6.2644 | 38000 | 0.4262 |
0.0864 | 6.2809 | 38100 | 0.4247 |
0.0847 | 6.2974 | 38200 | 0.4272 |
0.08 | 6.3139 | 38300 | 0.4311 |
0.0909 | 6.3304 | 38400 | 0.4327 |
0.0822 | 6.3469 | 38500 | 0.4263 |
0.0808 | 6.3633 | 38600 | 0.4310 |
0.0867 | 6.3798 | 38700 | 0.4285 |
0.0795 | 6.3963 | 38800 | 0.4298 |
0.097 | 6.4128 | 38900 | 0.4301 |
0.0802 | 6.4293 | 39000 | 0.4248 |
0.0937 | 6.4458 | 39100 | 0.4310 |
0.0808 | 6.4622 | 39200 | 0.4270 |
0.0773 | 6.4787 | 39300 | 0.4314 |
0.0853 | 6.4952 | 39400 | 0.4296 |
0.0831 | 6.5117 | 39500 | 0.4305 |
0.0935 | 6.5282 | 39600 | 0.4278 |
0.0839 | 6.5447 | 39700 | 0.4269 |
0.079 | 6.5612 | 39800 | 0.4299 |
0.0767 | 6.5776 | 39900 | 0.4286 |
0.0817 | 6.5941 | 40000 | 0.4275 |
0.0795 | 6.6106 | 40100 | 0.4289 |
0.0886 | 6.6271 | 40200 | 0.4236 |
0.0827 | 6.6436 | 40300 | 0.4298 |
0.0845 | 6.6601 | 40400 | 0.4281 |
0.0817 | 6.6766 | 40500 | 0.4267 |
0.0843 | 6.6930 | 40600 | 0.4258 |
0.0772 | 6.7095 | 40700 | 0.4268 |
0.0801 | 6.7260 | 40800 | 0.4315 |
0.0828 | 6.7425 | 40900 | 0.4267 |
0.087 | 6.7590 | 41000 | 0.4271 |
0.0901 | 6.7755 | 41100 | 0.4280 |
0.0788 | 6.7920 | 41200 | 0.4275 |
0.0815 | 6.8084 | 41300 | 0.4291 |
0.0774 | 6.8249 | 41400 | 0.4286 |
0.0868 | 6.8414 | 41500 | 0.4275 |
0.0834 | 6.8579 | 41600 | 0.4265 |
0.0846 | 6.8744 | 41700 | 0.4257 |
0.0798 | 6.8909 | 41800 | 0.4257 |
0.0795 | 6.9074 | 41900 | 0.4277 |
0.0849 | 6.9238 | 42000 | 0.4292 |
0.0936 | 6.9403 | 42100 | 0.4267 |
0.0715 | 6.9568 | 42200 | 0.4312 |
0.0857 | 6.9733 | 42300 | 0.4291 |
0.0887 | 6.9898 | 42400 | 0.4275 |
0.0681 | 7.0063 | 42500 | 0.4423 |
0.0567 | 7.0227 | 42600 | 0.4505 |
0.067 | 7.0392 | 42700 | 0.4514 |
0.0631 | 7.0557 | 42800 | 0.4559 |
0.0515 | 7.0722 | 42900 | 0.4562 |
0.0559 | 7.0887 | 43000 | 0.4583 |
0.0501 | 7.1052 | 43100 | 0.4566 |
0.0529 | 7.1217 | 43200 | 0.4567 |
0.0514 | 7.1381 | 43300 | 0.4554 |
0.0528 | 7.1546 | 43400 | 0.4566 |
0.053 | 7.1711 | 43500 | 0.4562 |
0.053 | 7.1876 | 43600 | 0.4569 |
0.0531 | 7.2041 | 43700 | 0.4555 |
0.0479 | 7.2206 | 43800 | 0.4595 |
0.0524 | 7.2371 | 43900 | 0.4567 |
0.0502 | 7.2535 | 44000 | 0.4605 |
0.0488 | 7.2700 | 44100 | 0.4591 |
0.0551 | 7.2865 | 44200 | 0.4603 |
0.0557 | 7.3030 | 44300 | 0.4580 |
0.0522 | 7.3195 | 44400 | 0.4599 |
0.0583 | 7.3360 | 44500 | 0.4583 |
0.0525 | 7.3525 | 44600 | 0.4585 |
0.0557 | 7.3689 | 44700 | 0.4572 |
0.0521 | 7.3854 | 44800 | 0.4579 |
0.0523 | 7.4019 | 44900 | 0.4578 |
0.0498 | 7.4184 | 45000 | 0.4585 |
0.0551 | 7.4349 | 45100 | 0.4585 |
0.0472 | 7.4514 | 45200 | 0.4592 |
0.0511 | 7.4679 | 45300 | 0.4595 |
0.0579 | 7.4843 | 45400 | 0.4593 |
0.0521 | 7.5008 | 45500 | 0.4597 |
0.0551 | 7.5173 | 45600 | 0.4593 |
0.0539 | 7.5338 | 45700 | 0.4579 |
0.0557 | 7.5503 | 45800 | 0.4571 |
0.0526 | 7.5668 | 45900 | 0.4602 |
0.0497 | 7.5833 | 46000 | 0.4582 |
0.0487 | 7.5997 | 46100 | 0.4600 |
0.0498 | 7.6162 | 46200 | 0.4586 |
0.0542 | 7.6327 | 46300 | 0.4596 |
0.0496 | 7.6492 | 46400 | 0.4608 |
0.0467 | 7.6657 | 46500 | 0.4593 |
0.0524 | 7.6822 | 46600 | 0.4597 |
0.0512 | 7.6986 | 46700 | 0.4599 |
0.0536 | 7.7151 | 46800 | 0.4593 |
0.0483 | 7.7316 | 46900 | 0.4605 |
0.0477 | 7.7481 | 47000 | 0.4593 |
0.0618 | 7.7646 | 47100 | 0.4581 |
0.0531 | 7.7811 | 47200 | 0.4585 |
0.0561 | 7.7976 | 47300 | 0.4596 |
0.0521 | 7.8140 | 47400 | 0.4594 |
0.0473 | 7.8305 | 47500 | 0.4608 |
0.051 | 7.8470 | 47600 | 0.4609 |
0.0494 | 7.8635 | 47700 | 0.4609 |
0.048 | 7.8800 | 47800 | 0.4607 |
0.0533 | 7.8965 | 47900 | 0.4606 |
0.0514 | 7.9130 | 48000 | 0.4607 |
0.0517 | 7.9294 | 48100 | 0.4607 |
0.0494 | 7.9459 | 48200 | 0.4606 |
0.0517 | 7.9624 | 48300 | 0.4601 |
0.0468 | 7.9789 | 48400 | 0.4601 |
0.0526 | 7.9954 | 48500 | 0.4600 |
### Framework versions
- PEFT 0.13.1
- Transformers 4.45.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.0
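A small sanity-check sketch, assuming you want to match this environment exactly. The local PyTorch version string may carry a CUDA suffix (e.g. `+cu121`), so a prefix check is used there.

```python
# Verify installed versions against the ones listed in this card.
import datasets
import peft
import tokenizers
import torch
import transformers

expected = {
    "peft": (peft.__version__, "0.13.1"),
    "transformers": (transformers.__version__, "4.45.2"),
    "torch": (torch.__version__, "2.4.1"),  # local build may append "+cu121"
    "datasets": (datasets.__version__, "3.0.1"),
    "tokenizers": (tokenizers.__version__, "0.20.0"),
}
for name, (got, want) in expected.items():
    if not got.startswith(want):
        print(f"version mismatch for {name}: got {got}, expected {want}")
```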