Sakura-SOLAR-Instruct-DPO-v2

This model was developed by the LLM research consortium of (주)미디어그룹사람과숲 and (주)마커.

Model Details

Model Developers Kyujin Han (kyujinpy)

Method
Trained with Direct Preference Optimization (DPO)
on the argilla/distilabel-math-preference-dpo dataset.
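
For readers unfamiliar with DPO, the per-example objective can be sketched in a few lines. This is a minimal illustration of the standard DPO loss for one (chosen, rejected) preference pair, not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for the example; the actual hyperparameters are not stated in this card.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    beta=0.1 is an illustrative default, not a value taken from this card.
    """
    # How much more the policy prefers the chosen response than the reference does
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)

    # loss = -log(sigmoid(logits)), written in a numerically stable form
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

The loss equals log(2) when the policy and reference agree, and falls below that as the policy assigns relatively more probability to the chosen response than the reference does.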

I have shared details about my model, including the training procedure and code.
Please see: ⭐Sakura-SOLAR.

Model Benchmark

Open leaderboard

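  • Follow the leaderboard link for up-to-date results.

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sakura-SOLAR-Instruct-DPO-v2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Sakura-SOLAR-Instruct-DPO-v1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| kyujinpy/Sakura-SOLAR-Instruct | 74.40 | 70.99 | 88.42 | 66.33 | 71.79 | 83.66 | 65.20 |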

Implementation Code

### Sakura-SOLAR-Instruct-DPO-v2
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "kyujinpy/Sakura-SOLAR-Instruct-DPO-v2"
# Load the weights in half precision; device_map='auto' places layers on the available devices
model = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map='auto'
)
tokenizer = AutoTokenizer.from_pretrained(repo)
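
Once the model and tokenizer are loaded, generation might look like the sketch below. The helper names `build_prompt` and `generate_reply` are hypothetical, and the `### User:` / `### Assistant:` prompt layout is an assumption based on the SOLAR-Instruct convention; this card does not specify a prompt format, so check the tokenizer's chat template if one is provided.

```python
def build_prompt(question):
    """Wrap a question in a SOLAR-Instruct style prompt.

    The "### User:" / "### Assistant:" layout is an assumption,
    not something stated in this model card.
    """
    return f"### User:\n{question}\n\n### Assistant:\n"


def generate_reply(model, tokenizer, question, max_new_tokens=256):
    """Greedy-decode a reply from an already loaded model and tokenizer."""
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for reproducible output
    )
    # Drop the prompt tokens so only the newly generated text is returned
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Pass the model and tokenizer objects created in the snippet above as the first two arguments, followed by the question text.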
