Jlonge4/outputs

Files changed (9)

README.md +27 -35
adapter_config.json +6 -6
adapter_model.safetensors +2 -2
runs/Sep08_13-42-39_4e836cbe7489/events.out.tfevents.1725802961.4e836cbe7489.256.0 +3 -0
runs/Sep08_13-46-17_4e836cbe7489/events.out.tfevents.1725803178.4e836cbe7489.256.1 +3 -0
runs/Sep08_13-50-21_4e836cbe7489/events.out.tfevents.1725803422.4e836cbe7489.256.2 +3 -0
runs/Sep08_13-52-53_4e836cbe7489/events.out.tfevents.1725803574.4e836cbe7489.256.3 +3 -0
runs/Sep08_13-53-27_4e836cbe7489/events.out.tfevents.1725803608.4e836cbe7489.256.4 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -14,12 +14,16 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/phi3.5-hallucination/runs/vn7nj2r3)
 # outputs
 This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.5219
 ## Model description
@@ -38,47 +42,35 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
 - total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine_with_restarts
 - lr_scheduler_warmup_steps: 20
-- training_steps: 260
 ### Training results
-| Training Loss | Epoch   | Step | Validation Loss |
-|:-------------:|:-------:|:----:|:---------------:|
-| 1.8679        | 1.1429  | 10   | 2.0237          |
-| 1.3961        | 2.2857  | 20   | 1.5707          |
-| 0.939         | 3.4286  | 30   | 1.1184          |
-| 0.9957        | 4.5714  | 40   | 0.9883          |
-| 0.8836        | 5.7143  | 50   | 0.9527          |
-| 0.8069        | 6.8571  | 60   | 0.9431          |
-| 0.6692        | 8.0     | 70   | 0.9416          |
-| 0.7691        | 9.1429  | 80   | 0.9574          |
-| 0.5804        | 10.2857 | 90   | 0.9505          |
-| 0.395         | 11.4286 | 100  | 0.9772          |
-| 0.3864        | 12.5714 | 110  | 1.0155          |
-| 0.3433        | 13.7143 | 120  | 1.0573          |
-| 0.4332        | 14.8571 | 130  | 1.0832          |
-| 0.224         | 16.0    | 140  | 1.1592          |
-| 0.1891        | 17.1429 | 150  | 1.2302          |
-| 0.2235        | 18.2857 | 160  | 1.2603          |
-| 0.1925        | 19.4286 | 170  | 1.3136          |
-| 0.2264        | 20.5714 | 180  | 1.3556          |
-| 0.1491        | 21.7143 | 190  | 1.4057          |
-| 0.2421        | 22.8571 | 200  | 1.4966          |
-| 0.1515        | 24.0    | 210  | 1.4495          |
-| 0.1349        | 25.1429 | 220  | 1.5144          |
-| 0.1493        | 26.2857 | 230  | 1.5340          |
-| 0.1202        | 27.4286 | 240  | 1.5201          |
-| 0.1154        | 28.5714 | 250  | 1.5305          |
-| 0.1968        | 29.7143 | 260  | 1.5219          |
 ### Framework versions

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/phi3.5-hallucination/runs/7sv9jcq1)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/phi3.5-hallucination/runs/7sv9jcq1)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/phi3.5-hallucination/runs/dsbpmror)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/phi3.5-hallucination/runs/tt98djcy)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/phi3.5-hallucination/runs/tt98djcy)
 # outputs
 This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.2249
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
 - total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 20
+- training_steps: 100
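
The updated hyperparameters keep `total_train_batch_size` at 4 while changing how it is reached, and swap the scheduler to plain `cosine`. The sketch below checks the batch-size arithmetic and approximates the learning-rate curve. It assumes a single device and a half-cosine decay to zero after linear warmup; `effective_batch_size` and `lr_at` are hypothetical helper names, not functions from the training code.

```python
import math

def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    # num_devices=1 is an assumption; the model card does not state it.
    return per_device * accumulation_steps * num_devices

# Old run: batch 1 with 4 accumulation steps. New run: batch 2 with 2.
old_total = effective_batch_size(1, 4)
new_total = effective_batch_size(2, 2)

def lr_at(step, base_lr=5e-05, warmup_steps=20, total_steps=100):
    # Linear warmup from 0 to base_lr, then half-cosine decay to 0.
    # This is an approximation of a cosine-with-warmup schedule,
    # not a copy of any library implementation.
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in (0, 10, 20, 60, 100):
    print(step, lr_at(step))
```

Both configurations give an effective batch of 4 under the single-device assumption, which matches the unchanged `total_train_batch_size` line.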
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.979         | 0.5882 | 5    | 2.1552          |
+| 1.5649        | 1.1765 | 10   | 1.7044          |
+| 1.5355        | 1.7647 | 15   | 1.3163          |
+| 0.9301        | 2.3529 | 20   | 1.0521          |
+| 0.7935        | 2.9412 | 25   | 0.9929          |
+| 0.6411        | 3.5294 | 30   | 0.9735          |
+| 0.6521        | 4.1176 | 35   | 0.9699          |
+| 0.4867        | 4.7059 | 40   | 0.9812          |
+| 0.6112        | 5.2941 | 45   | 1.0029          |
+| 0.5041        | 5.8824 | 50   | 1.1055          |
+| 0.4784        | 6.4706 | 55   | 1.0859          |
+| 0.3787        | 7.0588 | 60   | 1.1113          |
+| 0.2676        | 7.6471 | 65   | 1.3963          |
+| 0.3066        | 8.2353 | 70   | 1.2249          |
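
The new results table shows validation loss reaching its minimum well before the last logged row, while training loss keeps falling. The sketch below picks out the best and final rows from the values in the table; the variable names are illustrative only.

```python
# (step, validation_loss) pairs copied from the new results table.
rows = [
    (5, 2.1552), (10, 1.7044), (15, 1.3163), (20, 1.0521),
    (25, 0.9929), (30, 0.9735), (35, 0.9699), (40, 0.9812),
    (45, 1.0029), (50, 1.1055), (55, 1.0859), (60, 1.1113),
    (65, 1.3963), (70, 1.2249),
]

# Row with the lowest validation loss.
best_step, best_loss = min(rows, key=lambda row: row[1])
# Last logged row, which is the value the model card reports as "Loss".
final_step, final_loss = rows[-1]

print(f"best:  step {best_step}, loss {best_loss}")
print(f"final: step {final_step}, loss {final_loss}")
print(f"gap:   {final_loss - best_loss:.4f}")
```

The reported loss of 1.2249 is the final row rather than the minimum, so the pattern is consistent with overfitting after roughly step 35. If that reading is right, options such as `load_best_model_at_end` in `TrainingArguments` or an early-stopping callback might be worth considering.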
 ### Framework versions

adapter_config.json CHANGED

@@ -11,22 +11,22 @@
   "layers_to_transform": null,
   "loftq_config": {},
   "lora_alpha": 128,
-  "lora_dropout": 0.2,
   "megatron_config": null,
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
-  "r": 64,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "up_proj",
     "k_proj",
     "down_proj",
-    "gate_proj",
-    "o_proj",
     "q_proj",
-    "v_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "layers_to_transform": null,
   "loftq_config": {},
   "lora_alpha": 128,
+  "lora_dropout": 0.3,
   "megatron_config": null,
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "peft_type": "LORA",
+  "r": 32,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "v_proj",
     "k_proj",
     "down_proj",
+    "up_proj",
     "q_proj",
+    "gate_proj",
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
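
Two things in this hunk are easy to misread: the `target_modules` entries are only reordered, and halving `r` with `lora_alpha` fixed changes the LoRA scaling factor. The sketch below checks both. It assumes the standard scaling `lora_alpha / r` (that is, rank-stabilized scaling is not enabled, which this excerpt does not show); `lora_scaling` is a hypothetical helper.

```python
# Values copied from the removed (-) and added (+) lines of the diff.
old_cfg = {
    "r": 64, "lora_alpha": 128, "lora_dropout": 0.2,
    "target_modules": ["up_proj", "k_proj", "down_proj", "gate_proj",
                       "o_proj", "q_proj", "v_proj"],
}
new_cfg = {
    "r": 32, "lora_alpha": 128, "lora_dropout": 0.3,
    "target_modules": ["v_proj", "k_proj", "down_proj", "up_proj",
                       "q_proj", "gate_proj", "o_proj"],
}

def lora_scaling(cfg):
    # Standard LoRA scaling; assumes rank-stabilized scaling is off.
    return cfg["lora_alpha"] / cfg["r"]

same_modules = set(old_cfg["target_modules"]) == set(new_cfg["target_modules"])
old_scale = lora_scaling(old_cfg)
new_scale = lora_scaling(new_cfg)

print("same target modules:", same_modules)
print("scaling old -> new:", old_scale, "->", new_scale)
```

Under that assumption the effective changes are the rank (64 to 32), the dropout (0.2 to 0.3), and a doubled scaling factor; the module list itself is unchanged.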

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c47f5b555306903f26770a32ba84cdb3afdda4a92f30a5a6a0a28d0268ee956e
-size 142623480

 version https://git-lfs.github.com/spec/v1
+oid sha256:aa0ee5d1cd74a882e54ea6740f67e2475df2f768c58d30859d5bc7041e49462e
+size 71320216
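
The adapter file shrinks to almost exactly half its previous size, which is what one would expect if the size is dominated by LoRA weights, since each adapted layer stores `r * (d_in + d_out)` parameters. The sketch below checks the ratio from the two LFS pointers; the layer dimensions in `lora_params` are hypothetical and serve only to show the linear dependence on `r`.

```python
# Sizes in bytes, copied from the old and new Git LFS pointers.
old_size = 142623480
new_size = 71320216

size_ratio = old_size / new_size

def lora_params(r, d_in, d_out):
    # LoRA stores A (r x d_in) and B (d_out x r) per adapted layer.
    return r * (d_in + d_out)

# Hypothetical layer dimensions, chosen only for illustration.
param_ratio = lora_params(64, 1024, 1024) / lora_params(32, 1024, 1024)

print(f"file size ratio: {size_ratio:.4f}")
print(f"parameter ratio for r=64 vs r=32: {param_ratio}")
```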

runs/Sep08_13-42-39_4e836cbe7489/events.out.tfevents.1725802961.4e836cbe7489.256.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0dfe3e0b0877de3571cbb070e819a6f089cdcd99a3a573049e93e42bc37dab7d
+size 29510

runs/Sep08_13-46-17_4e836cbe7489/events.out.tfevents.1725803178.4e836cbe7489.256.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7ceeeabaf2ec5ca006ad09508ebc95ae37ec36a1b7ecabbd32bf49e1eadeaa07
+size 35461

runs/Sep08_13-50-21_4e836cbe7489/events.out.tfevents.1725803422.4e836cbe7489.256.2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f06b1cc98ff047d6ee0e1afd04da70292bc4dff96dedc7a6326d27fc69a2280c
+size 24032

runs/Sep08_13-52-53_4e836cbe7489/events.out.tfevents.1725803574.4e836cbe7489.256.3 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dcfb60e60be819be38c4fe603b69b6d71a78d270342aafc3029273e8eeee5750
+size 8073

runs/Sep08_13-53-27_4e836cbe7489/events.out.tfevents.1725803608.4e836cbe7489.256.4 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:eca1e44fa402959ba33069458393266c73280dc5a6ff23dc0b2b06703ad9f695
+size 26637

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:68156363a28ce3d23f4f6fe3990ca6cd67bf6b5e851dc870c0ea8fd2a80adf41
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:9d6a63621a4deaf4f2435c3108521be660f015cdcbdf0e69d4626b98f36798d3
 size 5432