---
base_model: lvwerra/gpt2-imdb
tags:
  - generated_from_trainer
model-index:
  - name: gpt-imdb-ipo-beta_0.5
    results: []
---

# gpt-imdb-ipo-beta_0.5

This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.9628
- Rewards/chosen: -0.4934
- Rewards/rejected: -0.8358
- Rewards/accuracies: 0.7812
- Rewards/margins: 0.3424
- Logps/rejected: -265.3568
- Logps/chosen: -236.2520
- Logits/rejected: -32.5835
- Logits/chosen: -32.6621
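
The snippet below is a minimal usage sketch using the standard `transformers` text-generation pipeline. The repo id `Myashka/gpt-imdb-ipo-beta_0.5` is an assumption inferred from this card, not confirmed by it.

```python
# Minimal usage sketch (assumption: the model is published as
# "Myashka/gpt-imdb-ipo-beta_0.5" on the Hugging Face Hub).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Myashka/gpt-imdb-ipo-beta_0.5",
)

# The base model was tuned on IMDB movie-review text, so a review-like
# prompt is a natural fit.
prompt = "This movie was"
outputs = generator(prompt, max_new_tokens=40, do_sample=True, top_k=50)
print(outputs[0]["generated_text"])
```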

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 150
- num_epochs: 3
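
For context, a run like this is typically produced with TRL's `DPOTrainer` using the IPO loss. The sketch below is a hypothetical reconstruction from the model name (`ipo`, `beta_0.5`) and the hyperparameters above, not the author's actual script; the dataset path is a placeholder, since the card lists the data as unknown. Under this convention, the `rewards/*` metrics logged above are the β-scaled log-probability ratios between the policy and the frozen reference model.

```python
# Hypothetical training sketch (assumptions: TRL's DPOTrainer with
# loss_type="ipo" and beta=0.5; dataset path is a placeholder).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model = AutoModelForCausalLM.from_pretrained("lvwerra/gpt2-imdb")
ref_model = AutoModelForCausalLM.from_pretrained("lvwerra/gpt2-imdb")
tokenizer = AutoTokenizer.from_pretrained("lvwerra/gpt2-imdb")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Placeholder: a DPO/IPO preference dataset needs "prompt", "chosen",
# and "rejected" columns.
train_dataset = load_dataset("path/to/preference-dataset", split="train")

# Hyperparameters taken from the list above.
training_args = TrainingArguments(
    output_dir="gpt-imdb-ipo-beta_0.5",
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    num_train_epochs=3,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-8,
)

trainer = DPOTrainer(
    model=model,
    ref_model=ref_model,
    args=training_args,
    beta=0.5,          # the "beta_0.5" in the model name
    loss_type="ipo",   # IPO loss instead of the default sigmoid DPO loss
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```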

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 10.732        | 0.21  | 500  | 21.6330         | -0.2465        | -0.4751          | 0.5792             | 0.2286          | -264.6355      | -235.7583    | -34.3644        | -34.6229      |
| 11.0252       | 0.42  | 1000 | 17.5281         | 0.3734         | 0.1008           | 0.5437             | 0.2726          | -263.4837      | -234.5185    | -35.1543        | -35.3784      |
| 17.5294       | 0.63  | 1500 | 18.4782         | -0.4521        | -0.6725          | 0.6208             | 0.2203          | -265.0302      | -236.1696    | -33.9319        | -34.0933      |
| 7.8398        | 0.83  | 2000 | 17.4130         | -0.5472        | -0.6406          | 0.6083             | 0.0933          | -264.9664      | -236.3597    | -34.0128        | -34.1803      |
| 6.2214        | 1.04  | 2500 | 9.4072          | -0.5101        | -0.8182          | 0.6292             | 0.3080          | -265.3216      | -236.2855    | -33.2396        | -33.3578      |
| 9.8652        | 1.25  | 3000 | 13.4878         | -0.6413        | -0.8801          | 0.6375             | 0.2388          | -265.4454      | -236.5479    | -32.0018        | -32.1655      |
| 11.4779       | 1.46  | 3500 | 7.5245          | -0.0755        | -0.3944          | 0.6750             | 0.3189          | -264.4740      | -235.4162    | -32.8982        | -33.0074      |
| 3.9833        | 1.67  | 4000 | 4.4888          | -0.7021        | -1.0680          | 0.6729             | 0.3659          | -265.8214      | -236.6695    | -32.9502        | -33.0304      |
| 3.389         | 1.88  | 4500 | 3.9317          | -0.5045        | -0.8887          | 0.7271             | 0.3841          | -265.4626      | -236.2743    | -32.7817        | -32.8828      |
| 3.2338        | 2.08  | 5000 | 2.4116          | -0.5185        | -0.8672          | 0.7146             | 0.3487          | -265.4196      | -236.3022    | -32.5025        | -32.5681      |
| 1.2381        | 2.29  | 5500 | 2.1558          | -0.5066        | -0.8815          | 0.7458             | 0.3749          | -265.4483      | -236.2784    | -32.3108        | -32.3902      |
| 1.6263        | 2.5   | 6000 | 1.1972          | -0.5280        | -0.8664          | 0.7396             | 0.3384          | -265.4182      | -236.3213    | -32.5356        | -32.6104      |
| 1.0882        | 2.71  | 6500 | 1.1163          | -0.5303        | -0.8584          | 0.7562             | 0.3281          | -265.4022      | -236.3259    | -32.5615        | -32.6406      |
| 1.0559        | 2.92  | 7000 | 0.9628          | -0.4934        | -0.8358          | 0.7812             | 0.3424          | -265.3568      | -236.2520    | -32.5835        | -32.6621      |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.1
- Datasets 2.15.0
- Tokenizers 0.15.0