Muthusivam committed
Commit f750f9c
1 Parent(s): 0226cf0

End of training

Files changed (1):
  1. README.md +4 -2
README.md CHANGED
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # llama2_finetuned_muthu
 
-This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.1-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GPTQ) on an unknown dataset.
+This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.1-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GPTQ) on the None dataset.
 
 ## Model description
 
@@ -36,9 +36,11 @@ The following hyperparameters were used during training:
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- training_steps: 10
+- training_steps: 250
 - mixed_precision_training: Native AMP
 
 ### Training results
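The new `total_train_batch_size: 32` line in the diff follows from the other hyperparameters: with gradient accumulation, the effective batch size is the per-device batch size times the accumulation steps (times the device count, assumed to be 1 here since the card does not state it). A minimal sketch of that arithmetic:

```python
# Values copied from the hyperparameter list in the diff above.
train_batch_size = 8             # per-device batch size
gradient_accumulation_steps = 4
num_devices = 1                  # assumption: single GPU; not stated in the card

# Effective (total) train batch size under gradient accumulation.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)    # 32, matching total_train_batch_size in the diff

# Rough number of examples seen over the whole run (250 optimizer steps).
training_steps = 250
examples_seen = total_train_batch_size * training_steps
print(examples_seen)             # 8000
```

This also shows why the step count was raised from 10 to 250: at 10 steps the model would have seen only 320 examples, far too few for a meaningful fine-tune.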