---
base_model: meta-llama/Llama-2-7b-hf
tags:
- generated_from_trainer
model-index:
- name: mid-nids
  results: []
---
# mid-nids
This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0342
## Model description
More information needed
## Intended uses & limitations
More information needed
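
In the absence of documented usage, here is a minimal loading-and-generation sketch. The repo id is a placeholder, and it assumes the checkpoint is a full fine-tune published to the Hub rather than a PEFT adapter:

```python
# Hypothetical usage sketch; the repo id is a placeholder and the code
# assumes a full fine-tune rather than an adapter checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/mid-nids"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # requires `accelerate`

inputs = tokenizer("Example prompt", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```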
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 3
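
A minimal sketch of the `TrainingArguments` implied by the list above; `output_dir` and the training-script wiring are hypothetical, and only the values shown in this card are grounded. Launched on 4 GPUs (e.g. `torchrun --nproc_per_node=4 train.py`), the effective train batch is 1 per device × 4 devices × 2 accumulation steps = 8, matching the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mid-nids",            # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=2,    # 1 x 4 GPUs x 2 = total train batch of 8
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=3,
    evaluation_strategy="steps",      # the table below evaluates every 20 steps
    eval_steps=20,
    logging_steps=20,
)
# Adam betas (0.9, 0.999) and epsilon 1e-08 are the library defaults,
# so they need no explicit arguments here.
```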
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
0.0682 | 0.03 | 20 | 0.0982 |
0.0895 | 0.06 | 40 | 0.0792 |
0.015 | 0.09 | 60 | 0.0405 |
0.0376 | 0.11 | 80 | 0.0357 |
0.0196 | 0.14 | 100 | 0.0342 |
0.0219 | 0.17 | 120 | 0.0334 |
0.0188 | 0.2 | 140 | 0.0317 |
0.0147 | 0.23 | 160 | 0.0365 |
0.0224 | 0.26 | 180 | 0.0388 |
0.0116 | 0.28 | 200 | 0.0504 |
0.0158 | 0.31 | 220 | 0.0692 |
0.0193 | 0.34 | 240 | 0.0407 |
0.0181 | 0.37 | 260 | 0.0443 |
0.0124 | 0.4 | 280 | 0.0482 |
0.0094 | 0.43 | 300 | 0.0549 |
0.0081 | 0.46 | 320 | 0.0341 |
0.0188 | 0.48 | 340 | 0.0401 |
0.021 | 0.51 | 360 | 0.0508 |
0.0125 | 0.54 | 380 | 0.0409 |
0.0071 | 0.57 | 400 | 0.0424 |
0.0165 | 0.6 | 420 | 0.0566 |
0.0075 | 0.63 | 440 | 0.0537 |
0.0096 | 0.65 | 460 | 0.0338 |
0.012 | 0.68 | 480 | 0.0489 |
0.0041 | 0.71 | 500 | 0.0442 |
0.0012 | 0.74 | 520 | 0.0439 |
0.0096 | 0.77 | 540 | 0.0381 |
0.005 | 0.8 | 560 | 0.0449 |
0.0239 | 0.83 | 580 | 0.0452 |
0.0166 | 0.85 | 600 | 0.0383 |
0.0081 | 0.88 | 620 | 0.0249 |
0.0166 | 0.91 | 640 | 0.0442 |
0.0106 | 0.94 | 660 | 0.0327 |
0.0161 | 0.97 | 680 | 0.0386 |
0.0038 | 1.0 | 700 | 0.0377 |
0.0029 | 1.02 | 720 | 0.0367 |
0.0164 | 1.05 | 740 | 0.0276 |
0.0128 | 1.08 | 760 | 0.0259 |
0.0108 | 1.11 | 780 | 0.0294 |
0.026 | 1.14 | 800 | 0.0285 |
0.0104 | 1.17 | 820 | 0.0297 |
0.0102 | 1.19 | 840 | 0.0271 |
0.0111 | 1.22 | 860 | 0.0293 |
0.0088 | 1.25 | 880 | 0.0305 |
0.0116 | 1.28 | 900 | 0.0250 |
0.0066 | 1.31 | 920 | 0.0442 |
0.0061 | 1.34 | 940 | 0.0309 |
0.0173 | 1.37 | 960 | 0.0231 |
0.0032 | 1.39 | 980 | 0.0230 |
0.0119 | 1.42 | 1000 | 0.0401 |
0.0083 | 1.45 | 1020 | 0.0274 |
0.0047 | 1.48 | 1040 | 0.0359 |
0.0221 | 1.51 | 1060 | 0.0301 |
0.0038 | 1.54 | 1080 | 0.0280 |
0.0052 | 1.56 | 1100 | 0.0235 |
0.0084 | 1.59 | 1120 | 0.0323 |
0.012 | 1.62 | 1140 | 0.0320 |
0.0019 | 1.65 | 1160 | 0.0256 |
0.0175 | 1.68 | 1180 | 0.0300 |
0.0078 | 1.71 | 1200 | 0.0362 |
0.0088 | 1.74 | 1220 | 0.0310 |
0.0065 | 1.76 | 1240 | 0.0301 |
0.0059 | 1.79 | 1260 | 0.0348 |
0.0066 | 1.82 | 1280 | 0.0341 |
0.0015 | 1.85 | 1300 | 0.0280 |
0.0091 | 1.88 | 1320 | 0.0266 |
0.0053 | 1.91 | 1340 | 0.0350 |
0.0077 | 1.93 | 1360 | 0.0333 |
0.0081 | 1.96 | 1380 | 0.0320 |
0.0129 | 1.99 | 1400 | 0.0391 |
0.0082 | 2.02 | 1420 | 0.0388 |
0.008 | 2.05 | 1440 | 0.0212 |
0.0025 | 2.08 | 1460 | 0.0362 |
0.0006 | 2.11 | 1480 | 0.0289 |
0.0034 | 2.13 | 1500 | 0.0347 |
0.0115 | 2.16 | 1520 | 0.0313 |
0.0061 | 2.19 | 1540 | 0.0297 |
0.0065 | 2.22 | 1560 | 0.0335 |
0.0144 | 2.25 | 1580 | 0.0379 |
0.0075 | 2.28 | 1600 | 0.0300 |
0.0093 | 2.3 | 1620 | 0.0322 |
0.0091 | 2.33 | 1640 | 0.0313 |
0.0051 | 2.36 | 1660 | 0.0278 |
0.0046 | 2.39 | 1680 | 0.0294 |
0.0004 | 2.42 | 1700 | 0.0283 |
0.0054 | 2.45 | 1720 | 0.0296 |
0.0034 | 2.48 | 1740 | 0.0337 |
0.0065 | 2.5 | 1760 | 0.0341 |
0.0034 | 2.53 | 1780 | 0.0345 |
0.0114 | 2.56 | 1800 | 0.0371 |
0.0044 | 2.59 | 1820 | 0.0377 |
0.0086 | 2.62 | 1840 | 0.0344 |
0.0065 | 2.65 | 1860 | 0.0332 |
0.0051 | 2.67 | 1880 | 0.0344 |
0.008 | 2.7 | 1900 | 0.0355 |
0.0035 | 2.73 | 1920 | 0.0351 |
0.0065 | 2.76 | 1940 | 0.0352 |
0.0097 | 2.79 | 1960 | 0.0347 |
0.0034 | 2.82 | 1980 | 0.0347 |
0.0054 | 2.84 | 2000 | 0.0348 |
0.0045 | 2.87 | 2020 | 0.0344 |
0.0032 | 2.9 | 2040 | 0.0343 |
0.0072 | 2.93 | 2060 | 0.0342 |
0.0074 | 2.96 | 2080 | 0.0344 |
0.0111 | 2.99 | 2100 | 0.0342 |
### Framework versions
- Transformers 4.34.1
- Pytorch 2.0.1+cu117
- Datasets 2.14.6
- Tokenizers 0.14.1
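
An optional sanity check that a local environment matches the versions above (a sketch; the PyTorch version string may carry a local build suffix such as `+cu117`):

```python
# Verify the installed library versions against those listed in this card.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.34.1"
assert torch.__version__.startswith("2.0.1")   # e.g. "2.0.1+cu117"
assert datasets.__version__ == "2.14.6"
assert tokenizers.__version__ == "0.14.1"
```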