
zephyr-7b-sft-lora-accum4-lr1e_5

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7880
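
Since the model name indicates a LoRA adapter trained on top of mistralai/Mistral-7B-v0.1, it can most likely be used by attaching the adapter to the base model with PEFT. The sketch below is illustrative rather than an official snippet from this card: the peft dependency, dtype, device settings, and prompt are assumptions.

```python
# Minimal usage sketch (assumption: this repository holds a PEFT/LoRA adapter
# for mistralai/Mistral-7B-v0.1, as the model name suggests).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,   # dtype is an assumption; not stated in the card
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Attach the LoRA adapter from this repository to the base model.
model = PeftModel.from_pretrained(base, "shkang/zephyr-7b-sft-lora-accum4-lr1e_5")

prompt = "Explain gradient accumulation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```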

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 50.0
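
For reference, here is a minimal sketch of how these values map onto transformers TrainingArguments. The actual training script is not published with this card, so the output directory, precision flag, evaluation/logging strategies, and the multi-GPU launcher (e.g. accelerate or torchrun for the 2 devices) are assumptions.

```python
# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
# Effective batch size: 4 per device x 2 GPUs x 4 accumulation steps = 32.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="zephyr-7b-sft-lora-accum4-lr1e_5",  # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    num_train_epochs=50.0,
    lr_scheduler_type="cosine",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,                    # assumed precision; not stated in the card
    evaluation_strategy="epoch",  # assumed; the card reports one eval per epoch
    logging_strategy="epoch",
)
```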

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.041 | 0.55 | 13 | 1.9551 |
| 1.8921 | 1.57 | 27 | 1.8206 |
| 1.7735 | 2.55 | 40 | 1.7331 |
| 1.7122 | 3.57 | 54 | 1.6613 |
| 1.63 | 4.55 | 67 | 1.5922 |
| 1.5552 | 5.57 | 81 | 1.5170 |
| 1.4953 | 6.55 | 94 | 1.4338 |
| 1.3904 | 7.57 | 108 | 1.3306 |
| 1.2882 | 8.55 | 121 | 1.2588 |
| 1.2282 | 9.57 | 135 | 1.2057 |
| 1.2016 | 10.55 | 148 | 1.1698 |
| 1.1707 | 11.57 | 162 | 1.1430 |
| 1.1417 | 12.55 | 175 | 1.1254 |
| 1.1281 | 13.57 | 189 | 1.1089 |
| 1.1005 | 14.55 | 202 | 1.0957 |
| 1.101 | 15.57 | 216 | 1.0832 |
| 1.0825 | 16.55 | 229 | 1.0708 |
| 1.0743 | 17.57 | 243 | 1.0605 |
| 1.0642 | 18.55 | 256 | 1.0508 |
| 1.0486 | 19.57 | 270 | 1.0392 |
| 1.0383 | 20.55 | 283 | 1.0343 |
| 1.0281 | 21.57 | 297 | 1.0229 |
| 1.0129 | 22.55 | 310 | 1.0152 |
| 0.9941 | 23.57 | 324 | 1.0072 |
| 1.0026 | 24.55 | 337 | 0.9972 |
| 0.9769 | 25.57 | 351 | 0.9888 |
| 0.9641 | 26.55 | 364 | 0.9800 |
| 0.9636 | 27.57 | 378 | 0.9725 |
| 0.9503 | 28.55 | 391 | 0.9626 |
| 0.9463 | 29.57 | 405 | 0.9541 |
| 0.9194 | 30.55 | 418 | 0.9462 |
| 0.9213 | 31.57 | 432 | 0.9374 |
| 0.8971 | 32.55 | 445 | 0.9283 |
| 0.8972 | 33.57 | 459 | 0.9160 |
| 0.872 | 34.55 | 472 | 0.9080 |
| 0.8633 | 35.57 | 486 | 0.9000 |
| 0.8565 | 36.55 | 499 | 0.8920 |
| 0.8479 | 37.57 | 513 | 0.8804 |
| 0.8332 | 38.55 | 526 | 0.8737 |
| 0.8216 | 39.57 | 540 | 0.8650 |
| 0.8112 | 40.55 | 553 | 0.8568 |
| 0.7988 | 41.57 | 567 | 0.8463 |
| 0.7862 | 42.55 | 580 | 0.8370 |
| 0.7675 | 43.57 | 594 | 0.8311 |
| 0.761 | 44.55 | 607 | 0.8224 |
| 0.757 | 45.57 | 621 | 0.8158 |
| 0.7426 | 46.55 | 634 | 0.8089 |
| 0.7303 | 47.57 | 648 | 0.8007 |
| 0.7169 | 48.55 | 661 | 0.7946 |
| 0.7249 | 49.57 | 675 | 0.7884 |

Framework versions

  • Transformers 4.35.0
  • Pytorch 2.1.0
  • Datasets 2.14.6
  • Tokenizers 0.14.1