Meta-Llama-3-8B-Instruct-rm-Anthropic-hh-rlhf-concateye

Files changed (3)

README.md +30 -21
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -18,8 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6613
-- Accuracy: 0.6102
 ## Model description
@@ -38,8 +38,8 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -51,23 +51,32 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.7033        | 0.11  | 64   | 0.6980          | 0.5019   |
-| 0.726         | 0.22  | 128  | 0.6890          | 0.5442   |
-| 0.6971        | 0.34  | 192  | 0.6835          | 0.5430   |
-| 0.7056        | 0.45  | 256  | 0.6792          | 0.5492   |
-| 0.7015        | 0.56  | 320  | 0.6744          | 0.5729   |
-| 0.6752        | 0.67  | 384  | 0.6704          | 0.5841   |
-| 0.69          | 0.79  | 448  | 0.6661          | 0.6052   |
-| 0.7041        | 0.9   | 512  | 0.6651          | 0.6065   |
-| 0.6855        | 1.01  | 576  | 0.6642          | 0.6102   |
-| 0.6825        | 1.12  | 640  | 0.6631          | 0.6152   |
-| 0.6907        | 1.24  | 704  | 0.6626          | 0.6152   |
-| 0.6601        | 1.35  | 768  | 0.6619          | 0.6115   |
-| 0.6887        | 1.46  | 832  | 0.6618          | 0.6127   |
-| 0.6976        | 1.57  | 896  | 0.6612          | 0.6090   |
-| 0.6782        | 1.69  | 960  | 0.6614          | 0.6152   |
-| 0.6653        | 1.8   | 1024 | 0.6616          | 0.6139   |
-| 0.6618        | 1.91  | 1088 | 0.6613          | 0.6102   |
 ### Framework versions

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.7036
+- Accuracy: 0.5236
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.7293        | 0.08  | 128  | 0.7252          | 0.4850   |
+| 0.7412        | 0.15  | 256  | 0.6925          | 0.5386   |
+| 0.7182        | 0.23  | 384  | 0.6954          | 0.5327   |
+| 0.6997        | 0.3   | 512  | 0.6941          | 0.5277   |
+| 0.7547        | 0.38  | 640  | 0.6959          | 0.5279   |
+| 0.7123        | 0.45  | 768  | 0.6993          | 0.5252   |
+| 0.7281        | 0.53  | 896  | 0.6962          | 0.5275   |
+| 0.7169        | 0.6   | 1024 | 0.6986          | 0.5156   |
+| 0.7244        | 0.68  | 1152 | 0.6981          | 0.5125   |
+| 0.7199        | 0.75  | 1280 | 0.7000          | 0.5060   |
+| 0.7311        | 0.83  | 1408 | 0.6959          | 0.5140   |
+| 0.7123        | 0.9   | 1536 | 0.6956          | 0.5154   |
+| 0.7344        | 0.98  | 1664 | 0.6970          | 0.5100   |
+| 0.7105        | 1.05  | 1792 | 0.6933          | 0.5219   |
+| 0.6947        | 1.13  | 1920 | 0.6944          | 0.5259   |
+| 0.7261        | 1.21  | 2048 | 0.6960          | 0.5256   |
+| 0.6997        | 1.28  | 2176 | 0.6974          | 0.5188   |
+| 0.7442        | 1.36  | 2304 | 0.6960          | 0.5163   |
+| 0.7004        | 1.43  | 2432 | 0.6987          | 0.5286   |
+| 0.7089        | 1.51  | 2560 | 0.6982          | 0.5288   |
+| 0.7142        | 1.58  | 2688 | 0.7014          | 0.5154   |
+| 0.7364        | 1.66  | 2816 | 0.6997          | 0.5202   |
+| 0.6915        | 1.73  | 2944 | 0.7043          | 0.5200   |
+| 0.7322        | 1.81  | 3072 | 0.7037          | 0.5229   |
+| 0.7524        | 1.88  | 3200 | 0.7019          | 0.5219   |
+| 0.7192        | 1.96  | 3328 | 0.7036          | 0.5236   |
 ### Framework versions
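
To put the two sets of final metrics in context, the sketch below compares them against a chance-level baseline. It assumes the reward model is trained with the common pairwise preference loss `-log(sigmoid(r_chosen - r_rejected))`; the diff does not state which loss was used, so treat that as an assumption. Under that loss, a model that scores both responses identically gets a loss of ln 2 (about 0.693) and 50% accuracy. The names `pairwise_loss`, `old_run`, and `new_run` are illustrative and do not come from the repository.

```python
import math

# Final evaluation metrics copied from the README diff above.
old_run = {"loss": 0.6613, "accuracy": 0.6102}  # removed lines (lr 1e-05, batch 8)
new_run = {"loss": 0.7036, "accuracy": 0.5236}  # added lines (lr 5e-05, batch 16)


def pairwise_loss(margin: float) -> float:
    """Assumed loss: -log(sigmoid(margin)), where margin = r_chosen - r_rejected."""
    return math.log1p(math.exp(-margin))


# A model that cannot tell chosen from rejected has margin 0, i.e. loss ln 2.
chance_loss = pairwise_loss(0.0)

for name, run in (("old", old_run), ("new", new_run)):
    delta_loss = run["loss"] - chance_loss
    delta_acc = run["accuracy"] - 0.5
    print(f"{name}: loss - ln2 = {delta_loss:+.4f}, accuracy - 0.5 = {delta_acc:+.4f}")
```

If the assumption holds, the earlier run ends slightly below the chance-level loss, while the updated run ends slightly above it with accuracy close to 50%.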

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7801c8ce75a629d5a104eb5897a203c0dad1582ffd3513403103ecbd32999684
 size 2115093984

 version https://git-lfs.github.com/spec/v1
+oid sha256:eb0f4897fd7ba92e6d8a5f8984044d163b7b984c0be96e5e9b1bb16361d8b298
 size 2115093984

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ca74357d99bc990aac57594da9a66a7bd708b7907367cd7f3a67374b0002f262
 size 4664

 version https://git-lfs.github.com/spec/v1
+oid sha256:48428abd1317f678f1e7d88f80b48701fd06496cb790b85bee6ce943ebedbdaa
 size 4664
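
Both binary files are stored as Git LFS pointers, so the diff only shows a changed `oid` with an unchanged `size`. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The helper names `parse_lfs_pointer` and `stream_matches` are hypothetical, and the final check runs on a toy payload rather than the real 2 GB adapter file.

```python
import hashlib
import io


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(
        line.strip().split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def stream_matches(stream, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the binary stream has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    while chunk := stream.read(chunk_size):
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer for the updated adapter weights, copied from the diff above.
NEW_ADAPTER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:eb0f4897fd7ba92e6d8a5f8984044d163b7b984c0be96e5e9b1bb16361d8b298
size 2115093984
"""
pointer = parse_lfs_pointer(NEW_ADAPTER_POINTER)

# Toy demonstration: build a pointer for a small payload and verify it.
payload = b"hello"
toy_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
toy_ok = stream_matches(io.BytesIO(payload), toy_pointer)
```

For a real check, the stream would be the downloaded file opened in binary mode, for example `open("adapter_model.safetensors", "rb")`.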