Model save

Files changed (4)

README.md +99 -0
all_results.json +9 -0
train_results.json +9 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,99 @@

+---
+base_model: Qwen/Qwen2-1.5B
+library_name: peft
+license: apache-2.0
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: zephyr-7b-dpo-qlora
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# zephyr-7b-dpo-qlora
+This model is a fine-tuned version of [Qwen/Qwen2-1.5B](https://huggingface.co/Qwen/Qwen2-1.5B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5488
+- Rewards/chosen: -1.1271
+- Rewards/rejected: -1.7888
+- Rewards/accuracies: 0.7380
+- Rewards/margins: 0.6617
+- Logps/rejected: -483.5208
+- Logps/chosen: -460.1232
+- Logits/rejected: -1.4135
+- Logits/chosen: -1.4626
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 3
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 24
+- total_eval_batch_size: 6
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6873        | 0.0393 | 100  | 0.6862          | 0.0421         | 0.0275           | 0.6587             | 0.0146          | -301.8899      | -343.2013    | -1.9289         | -1.9903       |
+| 0.6613        | 0.0785 | 200  | 0.6587          | -0.0475        | -0.1358          | 0.6707             | 0.0882          | -318.2146      | -352.1655    | -1.9013         | -1.9595       |
+| 0.6358        | 0.1178 | 300  | 0.6395          | -0.2454        | -0.3991          | 0.6871             | 0.1537          | -344.5503      | -371.9539    | -1.8154         | -1.8744       |
+| 0.6277        | 0.1570 | 400  | 0.6237          | -0.5205        | -0.7427          | 0.6976             | 0.2222          | -378.9102      | -399.4672    | -1.8111         | -1.8699       |
+| 0.5933        | 0.1963 | 500  | 0.6018          | -0.6962        | -1.0371          | 0.6931             | 0.3410          | -408.3534      | -417.0301    | -1.7721         | -1.8287       |
+| 0.5665        | 0.2355 | 600  | 0.5955          | -0.6340        | -1.0330          | 0.6931             | 0.3989          | -407.9362      | -410.8186    | -1.7701         | -1.8241       |
+| 0.5322        | 0.2748 | 700  | 0.5795          | -0.7405        | -1.2137          | 0.7111             | 0.4732          | -426.0080      | -421.4653    | -1.7116         | -1.7650       |
+| 0.616         | 0.3141 | 800  | 0.5720          | -0.7566        | -1.2468          | 0.7186             | 0.4902          | -429.3149      | -423.0749    | -1.6310         | -1.6828       |
+| 0.6129        | 0.3533 | 900  | 0.5755          | -0.4970        | -0.9648          | 0.7290             | 0.4677          | -401.1144      | -397.1149    | -1.6471         | -1.6991       |
+| 0.5308        | 0.3926 | 1000 | 0.5657          | -1.1354        | -1.7018          | 0.7186             | 0.5664          | -474.8171      | -460.9562    | -1.5510         | -1.6002       |
+| 0.589         | 0.4318 | 1100 | 0.5631          | -1.1476        | -1.7335          | 0.7201             | 0.5859          | -477.9911      | -462.1784    | -1.5444         | -1.5931       |
+| 0.5694        | 0.4711 | 1200 | 0.5629          | -1.0450        | -1.6220          | 0.7246             | 0.5770          | -466.8436      | -451.9160    | -1.5333         | -1.5828       |
+| 0.5809        | 0.5104 | 1300 | 0.5587          | -0.9745        | -1.5915          | 0.7275             | 0.6170          | -463.7866      | -444.8671    | -1.4997         | -1.5489       |
+| 0.5597        | 0.5496 | 1400 | 0.5535          | -1.1201        | -1.7240          | 0.7380             | 0.6039          | -477.0389      | -459.4294    | -1.4968         | -1.5439       |
+| 0.5964        | 0.5889 | 1500 | 0.5565          | -0.8900        | -1.4799          | 0.7350             | 0.5899          | -452.6324      | -436.4146    | -1.4828         | -1.5311       |
+| 0.5329        | 0.6281 | 1600 | 0.5533          | -1.0959        | -1.7399          | 0.7365             | 0.6440          | -478.6324      | -457.0049    | -1.4628         | -1.5115       |
+| 0.5701        | 0.6674 | 1700 | 0.5520          | -1.1059        | -1.7733          | 0.7425             | 0.6673          | -481.9651      | -458.0073    | -1.4578         | -1.5061       |
+| 0.5522        | 0.7066 | 1800 | 0.5523          | -1.0511        | -1.7159          | 0.7380             | 0.6648          | -476.2304      | -452.5267    | -1.4461         | -1.4951       |
+| 0.5659        | 0.7459 | 1900 | 0.5553          | -0.9300        | -1.5725          | 0.7365             | 0.6425          | -461.8892      | -440.4130    | -1.4492         | -1.4980       |
+| 0.5375        | 0.7852 | 2000 | 0.5503          | -1.1096        | -1.7660          | 0.7440             | 0.6564          | -481.2357      | -458.3737    | -1.4278         | -1.4768       |
+| 0.5836        | 0.8244 | 2100 | 0.5494          | -1.1522        | -1.8216          | 0.7395             | 0.6694          | -486.8011      | -462.6367    | -1.4142         | -1.4632       |
+| 0.5282        | 0.8637 | 2200 | 0.5488          | -1.1628        | -1.8230          | 0.7365             | 0.6602          | -486.9384      | -463.6924    | -1.4117         | -1.4607       |
+| 0.5604        | 0.9029 | 2300 | 0.5487          | -1.1347        | -1.7969          | 0.7380             | 0.6621          | -484.3240      | -460.8886    | -1.4144         | -1.4635       |
+| 0.5365        | 0.9422 | 2400 | 0.5488          | -1.1196        | -1.7811          | 0.7380             | 0.6615          | -482.7509      | -459.3745    | -1.4142         | -1.4633       |
+| 0.5135        | 0.9815 | 2500 | 0.5488          | -1.1271        | -1.7888          | 0.7380             | 0.6617          | -483.5208      | -460.1232    | -1.4135         | -1.4626       |
+### Framework versions
+- PEFT 0.12.0
+- Transformers 4.44.2
+- Pytorch 2.1.2
+- Datasets 3.0.0
+- Tokenizers 0.19.1
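
The derived numbers in the model card can be cross-checked against the values they are built from. The following is a minimal sketch, not part of the commit: the variable names are illustrative, and it assumes the total batch sizes are the per-device size times the device count (times the gradient accumulation steps for training), and that the reward margin is the chosen reward minus the rejected reward.

```python
# Values copied from the README.md diff above.
train_batch_size = 2              # per device
eval_batch_size = 2               # per device
num_devices = 3
gradient_accumulation_steps = 4

# Assumed relationship: per-device size x devices x accumulation steps.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation does not accumulate gradients, so only the device count applies.
total_eval_batch_size = eval_batch_size * num_devices

# Final evaluation rewards reported in the card.
rewards_chosen = -1.1271
rewards_rejected = -1.7888
# Assumed definition: margin = chosen reward - rejected reward.
rewards_margin = round(rewards_chosen - rewards_rejected, 4)

print(total_train_batch_size, total_eval_batch_size, rewards_margin)
```

Under these assumptions the results agree with the card's `total_train_batch_size: 24`, `total_eval_batch_size: 6`, and `Rewards/margins: 0.6617`.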

all_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 0.9999018549416037,
+    "total_flos": 0.0,
+    "train_loss": 0.5787653771390342,
+    "train_runtime": 28047.1443,
+    "train_samples": 61134,
+    "train_samples_per_second": 2.18,
+    "train_steps_per_second": 0.091
+}

train_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 0.9999018549416037,
+    "total_flos": 0.0,
+    "train_loss": 0.5787653771390342,
+    "train_runtime": 28047.1443,
+    "train_samples": 61134,
+    "train_samples_per_second": 2.18,
+    "train_steps_per_second": 0.091
+}
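
The throughput and epoch fields in the two result files can be reproduced from the sample count, the runtime, and the effective batch size of 24 reported in README.md. This is a rough sketch with illustrative variable names. It assumes one optimizer step consumes one full effective batch and that a trailing partial batch is dropped, which would explain why `epoch` stops just short of 1.0.

```python
# Values copied from all_results.json / train_results.json above.
train_samples = 61134
train_runtime = 28047.1443        # seconds
reported_epoch = 0.9999018549416037

# Effective batch size taken from the README.md hyperparameters.
total_train_batch_size = 24

# Assumption: each optimizer step consumes one full effective batch,
# and the leftover partial batch at the end of the epoch is skipped.
steps_per_epoch = train_samples / total_train_batch_size
optimizer_steps = train_samples // total_train_batch_size

# Fraction of the epoch covered by whole optimizer steps.
epoch_fraction = optimizer_steps / steps_per_epoch

samples_per_second = train_samples / train_runtime
steps_per_second = optimizer_steps / train_runtime

print(optimizer_steps, epoch_fraction)
print(round(samples_per_second, 2), round(steps_per_second, 3))
```

With these assumptions the run comes out at 2547 optimizer steps, and the rounded rates match the reported `train_samples_per_second` of 2.18 and `train_steps_per_second` of 0.091. The same step count is consistent with the last logged row of the training table (step 2500 at epoch 0.9815).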

trainer_state.json ADDED

The diff for this file is too large to render.