metadata
license: apache-2.0
base_model: studio-ousia/mluke-large-lite
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - precision
  - recall
  - f1
model-index:
  - name: out
    results: []

Fine-tuning

  • This model was trained to classify whether an input text is a "chosen sentence" or a "rejected sentence".
  • The class probabilities (the softmax of the logits from the model's last layer) can be used to quantify the preference for a user's input text.
  • Fine-tuned from studio-ousia/mluke-large-lite with full-parameter tuning on open-preference-v0.3.
  • Trained in bf16 precision.
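
As a minimal sketch of the scoring idea above, the snippet below turns a pair of logits into a preference probability. Several things here are assumptions rather than facts from this card: the `model_id` argument is a placeholder for this model's Hub repository id, the helper names are hypothetical, and `chosen_index=1` assumes that label 1 means "chosen" (check `id2label` in the model config before relying on it).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def preference_score(logits, chosen_index=1):
    """Probability assigned to the 'chosen' class.

    chosen_index=1 is an assumption; verify it against the model's id2label.
    """
    return softmax(logits)[chosen_index]


def score_text(text, model_id):
    """Score one text with a sequence-classification checkpoint.

    model_id is a placeholder for this model's Hub repository id.
    Heavy dependencies are imported lazily so the helpers above
    stay usable without them.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return preference_score(logits)
```

Calling `preference_score([-2.0, 2.0])` gives a value close to 1, while equal logits give 0.5, so the score can be read as how strongly the classifier leans toward "chosen".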

Metrics

  • train and validation split

| train loss | eval loss | accuracy | recall | precision | f1-score |
|---|---|---|---|---|---|
| 0.114 | 0.1615 | 0.9399 | 0.9459 | 0.9346 | 0.9402 |

  • test split

| accuracy | recall | precision | f1-score |
|---|---|---|---|
| 0.9416 | 0.9319 | 0.9504 | 0.9411 |
  • confusion matrix on the test split

[image: confusion matrix]
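
The reported F1 scores can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The helper below is a small illustrative sketch (the function name is hypothetical) that uses only the numbers listed in this section.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported above for each split.
validation_f1 = f1_from_precision_recall(precision=0.9346, recall=0.9459)
test_f1 = f1_from_precision_recall(precision=0.9504, recall=0.9319)

print(f"validation F1: {validation_f1:.4f}")
print(f"test F1: {test_f1:.4f}")
```

Both values agree with the reported f1-score columns to four decimal places.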

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
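
To illustrate what `lr_scheduler_type: linear` means for the learning rate, here is a rough sketch of a linear decay schedule. The total of 4437 steps is taken from the training results table; `warmup_steps=0` is an assumption, since no warmup setting is listed, and `linear_lr` is a hypothetical helper rather than a library function.

```python
def linear_lr(step, base_lr=5e-5, total_steps=4437, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at the end of each of the 3 epochs.
for step in (0, 1479, 2958, 4437):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, falls by a third of that value per epoch, and reaches 0 at the final step.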

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| 0.4109 | 1.0 | 1479 | 0.2462 | 0.9003 | 0.8710 | 0.9399 | 0.9041 |
| 0.1579 | 2.0 | 2958 | 0.1573 | 0.9399 | 0.9495 | 0.9293 | 0.9393 |
| 0.114 | 3.0 | 4437 | 0.1615 | 0.9399 | 0.9346 | 0.9460 | 0.9403 |
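
The per-epoch rows can be compared programmatically, for example to see which epoch is best under different criteria. The sketch below just re-enters the table values; the card does not state which checkpoint-selection rule was used, so treat this as an illustration only.

```python
# Per-epoch results copied from the table above.
results = [
    {"epoch": 1, "val_loss": 0.2462, "accuracy": 0.9003, "f1": 0.9041},
    {"epoch": 2, "val_loss": 0.1573, "accuracy": 0.9399, "f1": 0.9393},
    {"epoch": 3, "val_loss": 0.1615, "accuracy": 0.9399, "f1": 0.9403},
]

# The "best" epoch depends on the criterion chosen.
best_by_loss = min(results, key=lambda row: row["val_loss"])
best_by_f1 = max(results, key=lambda row: row["f1"])

print("lowest validation loss: epoch", best_by_loss["epoch"])
print("highest F1: epoch", best_by_f1["epoch"])
```

Validation loss is lowest at epoch 2, while F1 is marginally higher at epoch 3; the train and validation figures reported in the metrics section match the epoch 3 row.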

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.1.0+cu118
  • Datasets 2.20.0
  • Tokenizers 0.19.1