willtensora committed
Commit 811556f · verified · 1 Parent(s): 7d72fac

End of training

Files changed (2)
  1. README.md +8 -5
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -122,8 +122,11 @@ The following hyperparameters were used during training:
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 8
+- total_train_batch_size: 16
+- total_eval_batch_size: 4
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
@@ -133,10 +136,10 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 11.9315 | 0.0006 | 1 | 11.9313 |
-| 11.9319 | 0.0017 | 3 | 11.9313 |
-| 11.926 | 0.0034 | 6 | 11.9313 |
-| 11.9287 | 0.0050 | 9 | 11.9313 |
+| 11.9305 | 0.0011 | 1 | 11.9313 |
+| 11.9253 | 0.0034 | 3 | 11.9313 |
+| 11.9317 | 0.0067 | 6 | 11.9313 |
+| 11.9306 | 0.0101 | 9 | 11.9313 |
 
 
 ### Framework versions
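
Note on the README.md hunk: the change in `total_train_batch_size` from 8 to 16 is consistent with the usual product of per-device batch size, device count, and gradient accumulation steps. The sketch below reproduces that arithmetic. The helper name `effective_batch_size` is hypothetical, and the single-device count for the previous revision is an assumption, since the old README did not list `num_devices`.

```python
def effective_batch_size(per_device: int, num_devices: int = 1,
                         grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step (or per eval step when
    grad_accum_steps is left at 1)."""
    return per_device * num_devices * grad_accum_steps


# Previous revision: assumed to be a single device (not stated in the old README).
train_before = effective_batch_size(per_device=2, num_devices=1, grad_accum_steps=4)

# This commit: distributed_type multi-GPU, num_devices 2.
train_after = effective_batch_size(per_device=2, num_devices=2, grad_accum_steps=4)

# Evaluation does not accumulate gradients, so only the device count multiplies.
eval_after = effective_batch_size(per_device=2, num_devices=2)

print(train_before, train_after, eval_after)
```

With twice as many samples per step, each step covers about twice as much of an epoch, which matches the Epoch column in the table (0.0050 at step 9 before, 0.0101 at step 9 after).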
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1ad2baf62aac9cf16a2ff4bee7c348002296775c72ff07db09288b398142cba6
+oid sha256:50d840eb0345ce290f72feeb0ff3ce38889437ad042f185ccc81852d76a8bbc6
 size 21378
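
Note on the adapter_model.bin hunk: the file tracked in the repository is a Git LFS pointer, so only the `oid` line changes while `size` stays at 21378 bytes. As a minimal sketch, a pointer like this can be parsed and a downloaded file checked against it as follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of any Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    data = Path(path).read_bytes()
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


# Pointer contents after this commit, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:50d840eb0345ce290f72feeb0ff3ce38889437ad042f185ccc81852d76a8bbc6
size 21378
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

Calling `matches_pointer("adapter_model.bin", info)` on a locally downloaded copy would then confirm that the download corresponds to this commit rather than its parent.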