ryota39/mluke-large-lite-reward

Text Classification

Generated from Trainer

ryota39 committed on Jul 4

Commit bd7427e • 1 Parent(s): 90e516a

Update README.md

Files changed (1)

README.md +16 -17

@@ -16,30 +16,29 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/rspeech3399/huggingface/runs/uv90lda8)
-# out
-This model is a fine-tuned version of [studio-ousia/mluke-large-lite](https://huggingface.co/studio-ousia/mluke-large-lite) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.1615
-- Accuracy: 0.9399
-- Precision: 0.9346
-- Recall: 0.9460
-- F1: 0.9403
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
+## Fine-tuning
+- This model was trained to classify whether an input text is a "chosen sentence" or a "rejected sentence".
+- The softmax probability computed from the model's final-layer logits can be used to quantify how strongly an input text is preferred.
+- Fine-tuned from [studio-ousia/mluke-large-lite](https://huggingface.co/studio-ousia/mluke-large-lite) with full-parameter tuning on [open-preference-v0.3](https://huggingface.co/datasets/ryota39/open_preference-v0.3).
+- Trained in bf16 precision.
+## Metric
+- train and validation split
+|train loss|eval loss|accuracy|recall|precision|f1-score|
+|:---|:---|:---|:---|:---|:---|
+|0.114|0.1615|0.9399|0.9459|0.9346|0.9402|
+- test split
+|accuracy|recall|precision|f1-score|
+|:---|:---|:---|:---|
+|0.9416|0.9319|0.9504|0.9411|
+- confusion matrix on the test split
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/651e3f30ca333f3c8df692b8/00ONMe0qlqv7XB14ttrPY.png)
 ### Training hyperparameters
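
The Fine-tuning notes added in this commit say that the softmax of the last-layer logits can be used to quantify preference. A minimal sketch of that step follows, in plain Python so it runs without downloading the model. The logit values are made up for illustration, and `chosen_index=1` is an assumption: the card does not state which class index corresponds to "chosen", so it should be checked against the model's label mapping before use.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def preference_score(logits, chosen_index=1):
    """Return the probability assigned to the "chosen" class.

    chosen_index is an assumption, not taken from the model card;
    verify it against the model's id-to-label mapping.
    """
    return softmax(logits)[chosen_index]


# Hypothetical two-class logits for two candidate responses
# (illustrative values, not real model output).
logits_a = [-1.2, 2.3]
logits_b = [0.8, -0.5]

score_a = preference_score(logits_a)
score_b = preference_score(logits_b)

# The response with the higher score is the one the classifier prefers.
print(f"A: {score_a:.4f}  B: {score_b:.4f}  prefers A: {score_a > score_b}")
```

In practice the logits would come from running the fine-tuned classifier on each candidate text; the scoring step itself stays the same.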