
zephyr-7b-sft-lora-accum4-lr5e_5-30

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5210
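The model name indicates a LoRA SFT run on top of mistralai/Mistral-7B-v0.1. Below is a minimal usage sketch, assuming the repository (repo id shkang/zephyr-7b-sft-lora-accum4-lr5e_5-30) hosts a PEFT LoRA adapter rather than fully merged weights; adjust accordingly if the weights are merged.

```python
# Minimal usage sketch. Assumption: this repo contains a PEFT LoRA adapter for
# mistralai/Mistral-7B-v0.1, not fully merged weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "shkang/zephyr-7b-sft-lora-accum4-lr5e_5-30"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA weights

prompt = "Explain supervised fine-tuning in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```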

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 30.0
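The values above map onto transformers.TrainingArguments roughly as shown below. This is an illustrative sketch only: the actual training script, dataset, and LoRA configuration are not documented in this card, and the 4-GPU distributed setup comes from the launcher rather than from these arguments.

```python
# Illustrative reconstruction of the reported hyperparameters; not the original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="zephyr-7b-sft-lora-accum4-lr5e_5-30",  # illustrative output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,   # 4 per device * 4 GPUs * 2 accumulation steps = 32 total
    per_device_eval_batch_size=8,    # 8 per device * 4 GPUs = 32 total eval batch size
    gradient_accumulation_steps=2,
    num_train_epochs=30.0,
    lr_scheduler_type="cosine",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```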

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.8242        | 0.55  | 13   | 1.6628          |
| 1.5291        | 1.57  | 27   | 1.3837          |
| 1.2362        | 2.55  | 40   | 1.1786          |
| 1.1401        | 3.57  | 54   | 1.0978          |
| 1.0666        | 4.55  | 67   | 1.0428          |
| 1.021         | 5.57  | 81   | 1.0063          |
| 0.9852        | 6.55  | 94   | 0.9629          |
| 0.9317        | 7.57  | 108  | 0.9246          |
| 0.8735        | 8.55  | 121  | 0.8885          |
| 0.8147        | 9.57  | 135  | 0.8335          |
| 0.763         | 10.55 | 148  | 0.7860          |
| 0.6926        | 11.57 | 162  | 0.7296          |
| 0.6332        | 12.55 | 175  | 0.6863          |
| 0.5898        | 13.57 | 189  | 0.6491          |
| 0.536         | 14.55 | 202  | 0.6180          |
| 0.5263        | 15.57 | 216  | 0.5764          |
| 0.5071        | 16.55 | 229  | 0.5683          |
| 0.4756        | 17.57 | 243  | 0.5597          |
| 0.4693        | 18.55 | 256  | 0.5342          |
| 0.4414        | 19.57 | 270  | 0.5386          |
| 0.4266        | 20.55 | 283  | 0.5346          |
| 0.4286        | 21.57 | 297  | 0.5155          |
| 0.4256        | 22.55 | 310  | 0.5108          |
| 0.418         | 23.57 | 324  | 0.5230          |
| 0.407         | 24.55 | 337  | 0.5165          |
| 0.411         | 25.57 | 351  | 0.5128          |
| 0.3896        | 26.55 | 364  | 0.5027          |
| 0.39          | 27.57 | 378  | 0.5063          |
| 0.3928        | 28.55 | 391  | 0.4946          |
| 0.3818        | 29.57 | 405  | 0.5063          |
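If the reported loss is the usual mean token-level cross-entropy in nats (the card does not state this explicitly), the final evaluation loss of 0.5210 corresponds to a perplexity of roughly exp(0.5210) ≈ 1.68:

```python
# Perplexity from cross-entropy loss, assuming the loss is mean token-level NLL in nats.
import math

eval_loss = 0.5210
print(f"perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 1.68
```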

Framework versions

  • Transformers 4.35.0
  • PyTorch 2.1.0
  • Datasets 2.14.6
  • Tokenizers 0.14.1