# gpt-imdb-ipo-beta_0.5

This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9628
- Rewards/chosen: -0.4934
- Rewards/rejected: -0.8358
- Rewards/accuracies: 0.7812
- Rewards/margins: 0.3424
- Logps/rejected: -265.3568
- Logps/chosen: -236.2520
- Logits/rejected: -32.5835
- Logits/chosen: -32.6621
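As a quick sanity check on the metrics above: in TRL-style preference-training logs, `Rewards/margins` is simply `Rewards/chosen` minus `Rewards/rejected`. A minimal sketch, assuming that logging convention:

```python
# Reported evaluation rewards (from the list above).
rewards_chosen = -0.4934
rewards_rejected = -0.8358

# TRL's DPO/IPO trainers log the margin as chosen minus rejected.
margin = rewards_chosen - rewards_rejected
print(round(margin, 4))  # 0.3424, matching the reported Rewards/margins
```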

## Model description

The following hyperparameters were used during training:
- lr_scheduler_warmup_steps: 150
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 10.732 | 0.21 | 500 | 21.6330 | -0.2465 | -0.4751 | 0.5792 | 0.2286 | -264.6355 | -235.7583 | -34.3644 | -34.6229 |
| 11.0252 | 0.42 | 1000 | 17.5281 | 0.3734 | 0.1008 | 0.5437 | 0.2726 | -263.4837 | -234.5185 | -35.1543 | -35.3784 |
| 17.5294 | 0.63 | 1500 | 18.4782 | -0.4521 | -0.6725 | 0.6208 | 0.2203 | -265.0302 | -236.1696 | -33.9319 | -34.0933 |
| 7.8398 | 0.83 | 2000 | 17.4130 | -0.5472 | -0.6406 | 0.6083 | 0.0933 | -264.9664 | -236.3597 | -34.0128 | -34.1803 |
| 6.2214 | 1.04 | 2500 | 9.4072 | -0.5101 | -0.8182 | 0.6292 | 0.3080 | -265.3216 | -236.2855 | -33.2396 | -33.3578 |
| 9.8652 | 1.25 | 3000 | 13.4878 | -0.6413 | -0.8801 | 0.6375 | 0.2388 | -265.4454 | -236.5479 | -32.0018 | -32.1655 |
| 11.4779 | 1.46 | 3500 | 7.5245 | -0.0755 | -0.3944 | 0.6750 | 0.3189 | -264.4740 | -235.4162 | -32.8982 | -33.0074 |
| 3.9833 | 1.67 | 4000 | 4.4888 | -0.7021 | -1.0680 | 0.6729 | 0.3659 | -265.8214 | -236.6695 | -32.9502 | -33.0304 |
| 3.389 | 1.88 | 4500 | 3.9317 | -0.5045 | -0.8887 | 0.7271 | 0.3841 | -265.4626 | -236.2743 | -32.7817 | -32.8828 |
| 3.2338 | 2.08 | 5000 | 2.4116 | -0.5185 | -0.8672 | 0.7146 | 0.3487 | -265.4196 | -236.3022 | -32.5025 | -32.5681 |
| 1.2381 | 2.29 | 5500 | 2.1558 | -0.5066 | -0.8815 | 0.7458 | 0.3749 | -265.4483 | -236.2784 | -32.3108 | -32.3902 |
| 1.6263 | 2.5 | 6000 | 1.1972 | -0.5280 | -0.8664 | 0.7396 | 0.3384 | -265.4182 | -236.3213 | -32.5356 | -32.6104 |
| 1.0882 | 2.71 | 6500 | 1.1163 | -0.5303 | -0.8584 | 0.7562 | 0.3281 | -265.4022 | -236.3259 | -32.5615 | -32.6406 |
| 1.0559 | 2.92 | 7000 | 0.9628 | -0.4934 | -0.8358 | 0.7812 | 0.3424 | -265.3568 | -236.2520 | -32.5835 | -32.6621 |
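The model name suggests this run used the IPO objective with `beta=0.5` (e.g. `loss_type="ipo"` in a TRL-style `DPOTrainer`); that is an inference from the name, not stated in this card. A minimal sketch of the per-pair IPO loss under that assumption, with the log-probability ratios chosen here only to reproduce the reported evaluation rewards:

```python
# Hypothetical IPO loss computation (assumed from the model name
# "ipo-beta_0.5"); the log-ratios below are illustrative values picked
# so that beta * logratio matches the reported evaluation rewards.
beta = 0.5

# log pi(y|x) - log pi_ref(y|x) for the preferred and rejected completions.
chosen_logratio = -0.9868
rejected_logratio = -1.6716

# IPO regresses the log-ratio margin toward 1/(2*beta) with a squared
# loss, instead of DPO's logistic (sigmoid) loss.
margin = chosen_logratio - rejected_logratio
loss = (margin - 1.0 / (2.0 * beta)) ** 2

# The logged rewards are the log-ratios scaled by beta.
reward_chosen = beta * chosen_logratio      # -0.4934, as reported
reward_rejected = beta * rejected_logratio  # -0.8358, as reported
```

Note that the evaluation loss in the table is averaged over the whole evaluation set, so it is not recoverable from the mean rewards alone.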

### Framework versions

- Transformers 4.35.2