Kedar84 committed on
Commit
e0fad61
1 Parent(s): d516da5

End of training

Files changed (1)
  1. README.md +8 -28
README.md CHANGED
@@ -2,12 +2,13 @@
  license: apache-2.0
  library_name: peft
  tags:
+ - trl
+ - sft
  - generated_from_trainer
  base_model: TheBloke/Mistral-7B-Instruct-v0.1-GPTQ
  model-index:
  - name: mistral-finetuned-samsum
    results: []
- pipeline_tag: text-generation
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -29,38 +30,17 @@ More information needed

  More information needed

-
  ## Training procedure

-
- The following `bitsandbytes` quantization config was used during training:
- - quant_method: gptq
- - bits: 4
- - tokenizer: None
- - dataset: None
- - group_size: 128
- - damp_percent: 0.1
- - desc_act: True
- - sym: True
- - true_sequential: True
- - use_cuda_fp16: False
- - model_seqlen: None
- - block_name_to_quantize: None
- - module_name_preceding_first_block: None
- - batch_size: 1
- - pad_token_id: None
- - use_exllama: False
- - max_input_length: None
- - exllama_config: {'version': <ExllamaVersion.ONE: 1>}
- - cache_block_outputs: True
-
  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 0.0002
- - train_batch_size: 8
+ - train_batch_size: 6
  - eval_batch_size: 8
  - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 24
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - training_steps: 250
@@ -72,8 +52,8 @@ The following hyperparameters were used during training:

  ### Framework versions

- - PEFT 0.7.0
- - Transformers 4.36.0.dev0
+ - PEFT 0.7.1
+ - Transformers 4.37.0.dev0
  - Pytorch 2.1.0+cu118
- - Datasets 2.15.0
+ - Datasets 2.16.0
  - Tokenizers 0.15.0
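The quantization block removed by this commit is an auto-generated dump of the GPTQ settings baked into the base checkpoint. For reference only, the listed values can be restated as a `transformers.GPTQConfig`; this sketch is not part of the card, and fields reported as `None` above are simply left at their defaults:

```python
from transformers import GPTQConfig

# Sketch only: the quantization settings listed in the removed card section,
# restated as a GPTQConfig object. Fields shown as None in the card
# (tokenizer, dataset, model_seqlen, ...) are omitted and fall back to defaults.
gptq_config = GPTQConfig(
    bits=4,
    group_size=128,
    damp_percent=0.1,
    desc_act=True,
    sym=True,
    true_sequential=True,
    use_cuda_fp16=False,
    batch_size=1,
    use_exllama=False,
    cache_block_outputs=True,
)
```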
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
35
  ### Training hyperparameters
36
 
37
  The following hyperparameters were used during training:
38
  - learning_rate: 0.0002
39
+ - train_batch_size: 6
40
  - eval_batch_size: 8
41
  - seed: 42
42
+ - gradient_accumulation_steps: 4
43
+ - total_train_batch_size: 24
44
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
45
  - lr_scheduler_type: cosine
46
  - training_steps: 250
 
52
 
53
  ### Framework versions
54
 
55
+ - PEFT 0.7.1
56
+ - Transformers 4.37.0.dev0
57
  - Pytorch 2.1.0+cu118
58
+ - Datasets 2.16.0
59
  - Tokenizers 0.15.0
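Since the card keeps `library_name: peft` and the GPTQ base model, the published weights are a PEFT adapter that has to be loaded on top of the quantized base for inference. A minimal sketch follows, assuming the adapter lives at `Kedar84/mistral-finetuned-samsum` (inferred from the author and model name, not stated in the diff) and that `optimum` and `auto-gptq` are installed so the GPTQ checkpoint can be loaded:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"
adapter_id = "Kedar84/mistral-finetuned-samsum"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
# The GPTQ quantization config ships with the base checkpoint, so no extra
# quantization arguments are needed when loading it.
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

# Mistral-Instruct style prompt; the dialogue text is a placeholder.
prompt = "[INST] Summarize the following dialogue:\n... [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```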