---
base_model: meta-llama/Llama-2-7b-hf
tags:
- generated_from_trainer
model-index:
- name: mid-nids
  results: []
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

# mid-nids

This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0342

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
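For reference, these settings correspond roughly to the following `TrainingArguments`. This is a minimal sketch, not the exact Axolotl configuration used for training; `output_dir` is a placeholder, and the multi-GPU setup (4 devices) would be handled by the launcher (e.g. `accelerate` or `torchrun`), not by these arguments.

```python
from transformers import TrainingArguments

# Approximate Trainer-style equivalent of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mid-nids",          # placeholder output directory
    learning_rate=2e-4,
    per_device_train_batch_size=1,  # x 4 GPUs x 2 accumulation steps = total 8
    per_device_eval_batch_size=1,   # x 4 GPUs = total eval batch size 4
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=3,
    adam_beta1=0.9,                 # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```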
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.0682        | 0.03  | 20   | 0.0982          |
| 0.0895        | 0.06  | 40   | 0.0792          |
| 0.015         | 0.09  | 60   | 0.0405          |
| 0.0376        | 0.11  | 80   | 0.0357          |
| 0.0196        | 0.14  | 100  | 0.0342          |
| 0.0219        | 0.17  | 120  | 0.0334          |
| 0.0188        | 0.2   | 140  | 0.0317          |
| 0.0147        | 0.23  | 160  | 0.0365          |
| 0.0224        | 0.26  | 180  | 0.0388          |
| 0.0116        | 0.28  | 200  | 0.0504          |
| 0.0158        | 0.31  | 220  | 0.0692          |
| 0.0193        | 0.34  | 240  | 0.0407          |
| 0.0181        | 0.37  | 260  | 0.0443          |
| 0.0124        | 0.4   | 280  | 0.0482          |
| 0.0094        | 0.43  | 300  | 0.0549          |
| 0.0081        | 0.46  | 320  | 0.0341          |
| 0.0188        | 0.48  | 340  | 0.0401          |
| 0.021         | 0.51  | 360  | 0.0508          |
| 0.0125        | 0.54  | 380  | 0.0409          |
| 0.0071        | 0.57  | 400  | 0.0424          |
| 0.0165        | 0.6   | 420  | 0.0566          |
| 0.0075        | 0.63  | 440  | 0.0537          |
| 0.0096        | 0.65  | 460  | 0.0338          |
| 0.012         | 0.68  | 480  | 0.0489          |
| 0.0041        | 0.71  | 500  | 0.0442          |
| 0.0012        | 0.74  | 520  | 0.0439          |
| 0.0096        | 0.77  | 540  | 0.0381          |
| 0.005         | 0.8   | 560  | 0.0449          |
| 0.0239        | 0.83  | 580  | 0.0452          |
| 0.0166        | 0.85  | 600  | 0.0383          |
| 0.0081        | 0.88  | 620  | 0.0249          |
| 0.0166        | 0.91  | 640  | 0.0442          |
| 0.0106        | 0.94  | 660  | 0.0327          |
| 0.0161        | 0.97  | 680  | 0.0386          |
| 0.0038        | 1.0   | 700  | 0.0377          |
| 0.0029        | 1.02  | 720  | 0.0367          |
| 0.0164        | 1.05  | 740  | 0.0276          |
| 0.0128        | 1.08  | 760  | 0.0259          |
| 0.0108        | 1.11  | 780  | 0.0294          |
| 0.026         | 1.14  | 800  | 0.0285          |
| 0.0104        | 1.17  | 820  | 0.0297          |
| 0.0102        | 1.19  | 840  | 0.0271          |
| 0.0111        | 1.22  | 860  | 0.0293          |
| 0.0088        | 1.25  | 880  | 0.0305          |
| 0.0116        | 1.28  | 900  | 0.0250          |
| 0.0066        | 1.31  | 920  | 0.0442          |
| 0.0061        | 1.34  | 940  | 0.0309          |
| 0.0173        | 1.37  | 960  | 0.0231          |
| 0.0032        | 1.39  | 980  | 0.0230          |
| 0.0119        | 1.42  | 1000 | 0.0401          |
| 0.0083        | 1.45  | 1020 | 0.0274          |
| 0.0047        | 1.48  | 1040 | 0.0359          |
| 0.0221        | 1.51  | 1060 | 0.0301          |
| 0.0038        | 1.54  | 1080 | 0.0280          |
| 0.0052        | 1.56  | 1100 | 0.0235          |
| 0.0084        | 1.59  | 1120 | 0.0323          |
| 0.012         | 1.62  | 1140 | 0.0320          |
| 0.0019        | 1.65  | 1160 | 0.0256          |
| 0.0175        | 1.68  | 1180 | 0.0300          |
| 0.0078        | 1.71  | 1200 | 0.0362          |
| 0.0088        | 1.74  | 1220 | 0.0310          |
| 0.0065        | 1.76  | 1240 | 0.0301          |
| 0.0059        | 1.79  | 1260 | 0.0348          |
| 0.0066        | 1.82  | 1280 | 0.0341          |
| 0.0015        | 1.85  | 1300 | 0.0280          |
| 0.0091        | 1.88  | 1320 | 0.0266          |
| 0.0053        | 1.91  | 1340 | 0.0350          |
| 0.0077        | 1.93  | 1360 | 0.0333          |
| 0.0081        | 1.96  | 1380 | 0.0320          |
| 0.0129        | 1.99  | 1400 | 0.0391          |
| 0.0082        | 2.02  | 1420 | 0.0388          |
| 0.008         | 2.05  | 1440 | 0.0212          |
| 0.0025        | 2.08  | 1460 | 0.0362          |
| 0.0006        | 2.11  | 1480 | 0.0289          |
| 0.0034        | 2.13  | 1500 | 0.0347          |
| 0.0115        | 2.16  | 1520 | 0.0313          |
| 0.0061        | 2.19  | 1540 | 0.0297          |
| 0.0065        | 2.22  | 1560 | 0.0335          |
| 0.0144        | 2.25  | 1580 | 0.0379          |
| 0.0075        | 2.28  | 1600 | 0.0300          |
| 0.0093        | 2.3   | 1620 | 0.0322          |
| 0.0091        | 2.33  | 1640 | 0.0313          |
| 0.0051        | 2.36  | 1660 | 0.0278          |
| 0.0046        | 2.39  | 1680 | 0.0294          |
| 0.0004        | 2.42  | 1700 | 0.0283          |
| 0.0054        | 2.45  | 1720 | 0.0296          |
| 0.0034        | 2.48  | 1740 | 0.0337          |
| 0.0065        | 2.5   | 1760 | 0.0341          |
| 0.0034        | 2.53  | 1780 | 0.0345          |
| 0.0114        | 2.56  | 1800 | 0.0371          |
| 0.0044        | 2.59  | 1820 | 0.0377          |
| 0.0086        | 2.62  | 1840 | 0.0344          |
| 0.0065        | 2.65  | 1860 | 0.0332          |
| 0.0051        | 2.67  | 1880 | 0.0344          |
| 0.008         | 2.7   | 1900 | 0.0355          |
| 0.0035        | 2.73  | 1920 | 0.0351          |
| 0.0065        | 2.76  | 1940 | 0.0352          |
| 0.0097        | 2.79  | 1960 | 0.0347          |
| 0.0034        | 2.82  | 1980 | 0.0347          |
| 0.0054        | 2.84  | 2000 | 0.0348          |
| 0.0045        | 2.87  | 2020 | 0.0344          |
| 0.0032        | 2.9   | 2040 | 0.0343          |
| 0.0072        | 2.93  | 2060 | 0.0342          |
| 0.0074        | 2.96  | 2080 | 0.0344          |
| 0.0111        | 2.99  | 2100 | 0.0342          |

### Framework versions

- Transformers 4.34.1
- Pytorch 2.0.1+cu117
- Datasets 2.14.6
- Tokenizers 0.14.1
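Since this is a causal language model fine-tuned from Llama-2-7b, it can be loaded with the standard `transformers` auto classes. The snippet below is a minimal inference sketch; the repo id `mid-nids` and the prompt are placeholders to be replaced with the published repository path and an input matching the (unspecified) training data format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "mid-nids" is a placeholder repo id; use the actual repository path.
tokenizer = AutoTokenizer.from_pretrained("mid-nids")
model = AutoModelForCausalLM.from_pretrained("mid-nids", torch_dtype=torch.float16)

# Placeholder prompt; format it like the training data.
inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```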