metadata
license: apache-2.0
base_model: TheBloke/OpenHermes-2-Mistral-7B-GPTQ
tags:
  - trl
  - dpo
  - generated_from_trainer
model-index:
  - name: teamtrack-ai
    results: []
pipeline_tag: text-generation

teamtrack-ai

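This model is a fine-tuned version of TheBloke/OpenHermes-2-Mistral-7B-GPTQ. The training dataset was not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set, which match the final checkpoint at step 50: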

  • Loss: 0.6303
  • Rewards/chosen: -0.0503
  • Rewards/rejected: -0.1912
  • Rewards/accuracies: 0.875
  • Rewards/margins: 0.1409
  • Logps/rejected: -190.9696
  • Logps/chosen: -89.6439
  • Logits/rejected: -2.7104
  • Logits/chosen: -2.8594
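
The reward metrics above are related to each other: as logged by TRL's DPO trainer, the margin is the chosen reward minus the rejected reward, and the accuracy is the fraction of evaluation pairs where the chosen response is rewarded more highly. The sketch below checks the margin against the reported numbers. The variable names are illustrative, and the evaluation set size of 16 pairs is an assumption (all accuracies reported in this card are multiples of 1/16); the card does not state it.

```python
# Final evaluation metrics as reported in this card.
rewards_chosen = -0.0503
rewards_rejected = -0.1912
reported_margin = 0.1409
reported_accuracy = 0.875

# Margin: how much more the chosen response is rewarded than the rejected one.
margin = rewards_chosen - rewards_rejected
print(f"computed margin: {margin:.4f} (reported: {reported_margin:.4f})")

# ASSUMPTION: 16 evaluation pairs. This is inferred, not stated in the card.
assumed_eval_pairs = 16
pairs_preferring_chosen = reported_accuracy * assumed_eval_pairs
print(f"pairs where chosen beats rejected: {pairs_preferring_chosen:.0f} of {assumed_eval_pairs}")
```

Both rewards are negative while the margin is positive, so the rejected responses were penalized more than the chosen ones rather than the chosen ones being rewarded in absolute terms.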

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • training_steps: 50
  • mixed_precision_training: Native AMP
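
To make the scheduler settings concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above (base rate 0.0002, 2 warmup steps, 50 training steps). It assumes the standard behavior of the Transformers linear scheduler: the rate rises linearly from 0 during warmup, then decays linearly to 0 at the last step. The function name `linear_lr` is hypothetical and not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 2e-4,
              warmup_steps: int = 2,
              total_steps: int = 50) -> float:
    """Learning rate after `step` optimizer updates.

    Linear warmup from 0 to base_lr, then linear decay back to 0.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at a few points along the 50-step run.
for step in (0, 1, 2, 26, 50):
    print(step, linear_lr(step))
```

With only 2 warmup steps, the run reaches the peak rate almost immediately and spends the remaining 48 steps decaying.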

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6859 | 0.01 | 10 | 0.6664 | 0.0030 | -0.0376 | 0.6875 | 0.0406 | -189.4330 | -89.1108 | -2.7111 | -2.8122 |
| 0.6888 | 0.01 | 20 | 0.6478 | -0.0110 | -0.0944 | 0.875 | 0.0834 | -190.0014 | -89.2510 | -2.7160 | -2.8235 |
| 0.6397 | 0.01 | 30 | 0.6385 | -0.0256 | -0.1254 | 0.8125 | 0.0997 | -190.3110 | -89.3974 | -2.7148 | -2.8392 |
| 0.6501 | 0.02 | 40 | 0.6365 | -0.0472 | -0.1782 | 0.8125 | 0.1311 | -190.8396 | -89.6128 | -2.7116 | -2.8528 |
| 0.6852 | 0.03 | 50 | 0.6303 | -0.0503 | -0.1912 | 0.875 | 0.1409 | -190.9696 | -89.6439 | -2.7104 | -2.8594 |
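
The training results can be sanity-checked in a few lines. The sketch below copies the step, validation loss, and reward columns from the rows above and verifies two things: that each reported margin equals chosen minus rejected up to rounding, and that validation loss falls while the margin grows at every evaluation. The tolerance value is an assumption chosen to absorb the card's 4-decimal rounding.

```python
# (step, validation_loss, rewards_chosen, rewards_rejected, rewards_margins)
rows = [
    (10, 0.6664, 0.0030, -0.0376, 0.0406),
    (20, 0.6478, -0.0110, -0.0944, 0.0834),
    (30, 0.6385, -0.0256, -0.1254, 0.0997),
    (40, 0.6365, -0.0472, -0.1782, 0.1311),
    (50, 0.6303, -0.0503, -0.1912, 0.1409),
]

# ASSUMPTION: values are rounded to 4 decimals, so allow a small gap.
TOLERANCE = 2e-4

# Gap between the recomputed margin (chosen - rejected) and the reported one.
margin_gaps = [
    abs((chosen - rejected) - margin)
    for _, _, chosen, rejected, margin in rows
]

losses = [row[1] for row in rows]
margins = [row[4] for row in rows]

# True when each evaluation improves on the previous one.
loss_decreases = all(a > b for a, b in zip(losses, losses[1:]))
margin_grows = all(a < b for a, b in zip(margins, margins[1:]))

print("largest margin gap:", max(margin_gaps))
print("validation loss strictly decreasing:", loss_decreases)
print("reward margin strictly increasing:", margin_grows)
```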

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0
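
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library; the PyPI distribution names (for example `torch` for Pytorch) are assumptions, and the helper names `installed_versions` and `version_mismatches` are hypothetical.

```python
from importlib import metadata

# Versions listed in this card. Distribution names are assumed PyPI names.
CARD_VERSIONS = {
    "transformers": "4.35.2",
    "torch": "2.0.1+cu117",
    "datasets": "2.16.1",
    "tokenizers": "0.15.0",
}


def installed_versions(names):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(installed, expected=CARD_VERSIONS):
    """Return {name: (installed, expected)} for every package that differs."""
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    diffs = version_mismatches(installed_versions(CARD_VERSIONS))
    for name, (have, want) in diffs.items():
        print(f"{name}: installed {have}, card lists {want}")
```

An exact string comparison is deliberately strict: a different CUDA build tag on the Pytorch version will be reported as a mismatch even when the release number is the same.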