cyberagent/calm2-7b-chat-dpo-experimental

ddyuudd committed on Jan 23

Commit 3b1e6d2 • 1 Parent(s): 1bafe89

Create README.md

Files changed (1)

README.md +40 -0

README.md ADDED

@@ -0,0 +1,40 @@

+---
+license: cc-by-4.0
+datasets:
+- ddyuudd/chatbot_arena_ja_calm2-7b-chat-experimental
+language:
+- ja
+---
+# Model Card for "calm2-7b-chat-dpo-experimental"
+### ELYZA-tasks-100 (GPT-4 eval)
+To avoid randomness in the experimental results, outputs were generated with greedy search.
+| calm2-7b-chat | calm2-7b-chat-dpo |
+| ---- | ---- |
+| 2.67 | 2.85 |
+### Japanese MT-Bench
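+We evaluated calm2-7b-chat-dpo and calm2-7b-chat using the following sentence as the system prompt (system_message).
+"以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"
+(English: "Below is a combination of an instruction that describes a task and an input with context. Write a response that appropriately satisfies the request.")
+This is the system prompt used when evaluating stabilityai/japanese-stablelm-instruct-alpha-7b on Japanese MT-Bench.
+All other decoding parameters were left at their defaults.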
+| | calm2-7b-chat | calm2-7b-chat-dpo |
+| ---- | ---- | ---- |
+| MEAN | 6.1 | 6.7 |
+| extraction | 4.1 | 5.4 |
+| humanities | 8.2 | 8.4 |
+| reasoning | 3.9 | 4.3 |
+| roleplay | 6.4 | 7.0 |
+| stem | 6.3 | 6.2 |
+| writing | 7.7 | 9.1 |
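
As a quick sanity check on the Japanese MT-Bench table, the MEAN row appears consistent with an unweighted average of the six category scores listed. The sketch below is a minimal illustration only: the `scores` dictionary and the `category_mean` helper are hypothetical names, not part of the model card or of any MT-Bench tooling, and the averaging method is an assumption rather than something the card states.

```python
# Category scores copied from the Japanese MT-Bench table.
# Tuple order: (calm2-7b-chat, calm2-7b-chat-dpo).
scores = {
    "extraction": (4.1, 5.4),
    "humanities": (8.2, 8.4),
    "reasoning": (3.9, 4.3),
    "roleplay": (6.4, 7.0),
    "stem": (6.3, 6.2),
    "writing": (7.7, 9.1),
}


def category_mean(column: int) -> float:
    """Unweighted mean over the listed categories, rounded to one decimal.

    Assumption: the card's MEAN row is a plain average of these six rows.
    """
    values = [row[column] for row in scores.values()]
    return round(sum(values) / len(values), 1)


base_mean = category_mean(0)  # calm2-7b-chat
dpo_mean = category_mean(1)   # calm2-7b-chat-dpo
print(base_mean, dpo_mean)    # prints: 6.1 6.7
```

Both values match the MEAN row (6.1 and 6.7), which suggests the reported mean covers only the six categories shown.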