End of training

Files changed (5)

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6032
 ## Model description
@@ -38,23 +38,23 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 7
-- eval_batch_size: 7
 - seed: 42
 - gradient_accumulation_steps: 3
-- total_train_batch_size: 21
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.05
-- num_epochs: 2
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.9816        | 0.59  | 300  | 0.6548          |
-| 0.8223        | 1.18  | 600  | 0.6201          |
-| 0.7223        | 1.78  | 900  | 0.6032          |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4678
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 3
+- total_train_batch_size: 24
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.2819        | 1.39  | 300  | 0.4803          |
+| 0.2312        | 2.79  | 600  | 0.4819          |
+| 0.2005        | 4.18  | 900  | 0.4678          |
 ### Framework versions
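
The `total_train_batch_size` values in this hunk follow from the per-device batch size and `gradient_accumulation_steps`. A minimal sketch of that arithmetic, assuming a single device (the device count is not stated in the diff; `num_devices` and the function name are illustrative):

```python
def total_train_batch_size(per_device: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Effective batch size per optimizer step.

    num_devices=1 is an assumption; the diff does not state the device count.
    """
    return per_device * grad_accum_steps * num_devices


# Values taken from the removed (-) and added (+) README lines.
old_total = total_train_batch_size(per_device=7, grad_accum_steps=3)
new_total = total_train_batch_size(per_device=8, grad_accum_steps=3)

print(old_total, new_total)  # 21 24
```

Under that assumption the results match the 21 and 24 reported in the README before and after this commit.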

adapter_config.json CHANGED

@@ -19,11 +19,11 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "Wqkv",
     "out_proj",
-    "fc2",
     "linear",
-    "fc1"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "fc1",
     "out_proj",
     "linear",
+    "Wqkv",
+    "fc2"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
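
The `target_modules` hunk appears to change only the order of the entries, not the entries themselves. A quick check, with both lists copied from the diff above (one plausible cause is that the list is serialized from an unordered collection, but that is an assumption, not something the diff confirms):

```python
# Lists copied from the removed (-) and added (+) sides of the hunk.
old_modules = ["Wqkv", "out_proj", "fc2", "linear", "fc1"]
new_modules = ["fc1", "out_proj", "linear", "Wqkv", "fc2"]

# Compare as sets so that ordering is ignored.
same_modules = set(old_modules) == set(new_modules)
only_in_old = set(old_modules) - set(new_modules)
only_in_new = set(new_modules) - set(old_modules)

print(same_modules)  # True
```

If `same_modules` is true, the adapter targets the same layers before and after the commit.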

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:669affdb816b5482b82bc9551558f5ff3fcd6869e634628da5d36281683759a2
 size 53764520

 version https://git-lfs.github.com/spec/v1
+oid sha256:753e7bfc1b0b5b95674493b1a72e7d095ef12542482a6529d4642be60587e74d
 size 53764520
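
The two Git LFS pointers differ only in `oid`; `size` is unchanged, which is consistent with retrained weights of the same shape. A small sketch that parses both pointers and compares them (`parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:669affdb816b5482b82bc9551558f5ff3fcd6869e634628da5d36281683759a2
size 53764520
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:753e7bfc1b0b5b95674493b1a72e7d095ef12542482a6529d4642be60587e74d
size 53764520
""")

# Collect the keys whose values differ between the two pointers.
changed_keys = sorted(k for k in old_pointer if old_pointer[k] != new_pointer.get(k))
print(changed_keys)  # ['oid']
```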

tokenizer.json CHANGED

@@ -1,6 +1,11 @@
 {
   "version": "1.0",
-  "truncation": null,
   "padding": null,
   "added_tokens": [
     {

 {
   "version": "1.0",
+  "truncation": {
+    "direction": "Right",
+    "max_length": 256,
+    "strategy": "LongestFirst",
+    "stride": 0
+  },
   "padding": null,
   "added_tokens": [
     {
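
The new `truncation` block means the saved tokenizer now cuts sequences to 256 tokens, dropping tokens from the right. A simplified illustration in plain Python that operates on a list of token ids (it ignores special tokens and sequence pairs, which a real tokenizer also handles; `truncate_ids` is a hypothetical helper):

```python
import json

# The truncation block as added to tokenizer.json in this commit.
truncation = json.loads(
    '{"direction": "Right", "max_length": 256, "strategy": "LongestFirst", "stride": 0}'
)


def truncate_ids(ids, cfg):
    """Keep at most cfg['max_length'] ids, dropping from the side named by cfg['direction']."""
    max_len = cfg["max_length"]
    if len(ids) <= max_len:
        return list(ids)
    if cfg["direction"] == "Right":
        return list(ids[:max_len])   # drop the tail
    return list(ids[-max_len:])      # drop the head


long_input = list(range(300))        # stand-in for 300 token ids
short_input = list(range(10))

truncated = truncate_ids(long_input, truncation)
print(len(truncated), truncated[-1])  # 256 255
```

Inputs shorter than `max_length` pass through unchanged, since `padding` remains `null`.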

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:953e9ce9b8f402e54bc6f7c49fdddff21f4aa74e3786b6159b8b19ef03d4da1f
 size 4219

 version https://git-lfs.github.com/spec/v1
+oid sha256:865f5284d663e02e4c82d01e5ef41853179fa7038eda6296ccfc6095b8fefccf
 size 4219