willtensora committed
Commit f30ff91 · verified · 1 parent: 94bebf0

End of training

Files changed (2):
  1. README.md +6 -9
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -99,7 +99,7 @@ xformers_attention: null
 
  This model is a fine-tuned version of [peft-internal-testing/tiny-dummy-qwen2](https://huggingface.co/peft-internal-testing/tiny-dummy-qwen2) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 11.9312
+ - Loss: 11.9313
 
  ## Model description
 
@@ -122,11 +122,8 @@ The following hyperparameters were used during training:
  - train_batch_size: 2
  - eval_batch_size: 2
  - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 16
- - total_eval_batch_size: 4
+ - total_train_batch_size: 8
  - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
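The drop from total_train_batch_size 16 to 8 is consistent with removing the two-GPU lines: the total is the per-device batch size times the gradient accumulation steps times the device count. A minimal sketch of that arithmetic, assuming a single device for this run (the device count is not stated after the change):

```python
# Effective batch size implied by the hyperparameters in the hunk above.
train_batch_size = 2             # per-device micro-batch (from the README)
gradient_accumulation_steps = 4  # from the README
num_devices = 1                  # assumption: single device once the multi-GPU lines are gone

total = train_batch_size * gradient_accumulation_steps * num_devices
print(total)  # 8, matching the new value; the old run had 2 * 4 * 2 = 16
```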
@@ -136,10 +133,10 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch  | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 11.9305       | 0.0011 | 1    | 11.9313         |
- | 11.9253       | 0.0034 | 3    | 11.9313         |
- | 11.9317       | 0.0067 | 6    | 11.9313         |
- | 11.9306       | 0.0101 | 9    | 11.9312         |
+ | 11.9315       | 0.0006 | 1    | 11.9313         |
+ | 11.9319       | 0.0017 | 3    | 11.9313         |
+ | 11.926        | 0.0034 | 6    | 11.9313         |
+ | 11.9287       | 0.0050 | 9    | 11.9313         |
 
 
  ### Framework versions
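The adapter weights in adapter_model.bin are what PEFT layers on top of the base model named in the README. A minimal loading sketch, assuming the transformers and peft packages; the adapter repo id below is a hypothetical placeholder, since this commit page does not show it:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Base model referenced in the README hunk above.
base = AutoModelForCausalLM.from_pretrained("peft-internal-testing/tiny-dummy-qwen2")

# "willtensora/ADAPTER_REPO" is a placeholder; substitute the repository this commit lives in.
model = PeftModel.from_pretrained(base, "willtensora/ADAPTER_REPO")
```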
 
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:02e508a82b48161bd5428b10fe747071ffa764826590d87a2c4eee2cc42f49d7
+ oid sha256:dc473f88f862e4189f089a6808881b152bf109f340ee467732bb6526c1e2c1e2
  size 21378
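Both versions of adapter_model.bin are Git LFS pointer files: only the sha256 oid of the payload changes, while the size stays 21378 bytes. A minimal sketch for checking a locally downloaded adapter_model.bin against the new pointer's oid, assuming the real weights file (not the pointer) is on disk at the path shown:

```python
import hashlib

# Assumed local path to the downloaded weights; adjust as needed.
path = "adapter_model.bin"

sha256 = hashlib.sha256()
with open(path, "rb") as f:
    # Hash in 1 MiB chunks to avoid loading the whole file at once.
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha256.update(chunk)

# Should print the oid from the new pointer:
# dc473f88f862e4189f089a6808881b152bf109f340ee467732bb6526c1e2c1e2
print(sha256.hexdigest())
```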