---
license: cc-by-4.0
datasets:
- cyberagent/chatbot-arena-ja-calm2-7b-chat-experimental
language:
- ja
- en
base_model: "cyberagent/calm2-7b-chat"
---

# Model Card for "calm2-7b-chat-dpo-experimental"

This model was obtained by applying [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to [cyberagent/calm2-7b-chat](https://huggingface.co/cyberagent/calm2-7b-chat) using the [cyberagent/chatbot-arena-ja-calm2-7b-chat-experimental](https://huggingface.co/datasets/cyberagent/chatbot-arena-ja-calm2-7b-chat-experimental) dataset.
[Low-Rank Adaptation (LoRA)](https://huggingface.co/docs/peft/conceptual_guides/lora) was used for the DPO training.
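
As a rough illustration of the DPO objective referenced above, the per-pair loss can be sketched in plain Python. The function name `dpo_loss`, its argument names, and `beta=0.1` are assumptions made for this example; this card does not state the hyperparameters or the training code that were actually used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss computed from sequence log-probabilities.

    beta=0.1 is an illustrative value, not the one used for this model.
    """
    # Implicit rewards: how much more likely each response is under the
    # policy being trained than under the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the loss is log(2); it drops as the
# policy favors the chosen response more strongly than the reference does.
print(dpo_loss(-5.0, -9.0, -6.0, -8.0))
```

In practice the log-probabilities would come from the LoRA-adapted policy and the frozen base model, summed over the response tokens of each preference pair.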

## Requirements, Usage, Chat Template

These are the same as for [cyberagent/calm2-7b-chat](https://huggingface.co/cyberagent/calm2-7b-chat).
The model can be run with the same code and prompt format.

```python
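import transformers
from packaging import version
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Compare parsed versions; a plain string comparison would mis-order e.g. "4.4.0" and "4.34.1".
assert version.parse(transformers.__version__) >= version.parse("4.34.1")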

model = AutoModelForCausalLM.from_pretrained("cyberagent/calm2-7b-chat-dpo-experimental", device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("cyberagent/calm2-7b-chat-dpo-experimental")
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# The Japanese prompt below asks: "How will AI change our lives?"
prompt = """USER: AIによって私達の暮らしはどのように変わりますか？
ASSISTANT: """

token_ids = tokenizer.encode(prompt, return_tensors="pt")
output_ids = model.generate(
    input_ids=token_ids.to(model.device),
    max_new_tokens=300,
    do_sample=True,
    temperature=0.8,
    streamer=streamer,
)
```

## Experimental Results

### ELYZA-tasks-100 (GPT-4 eval)

To avoid randomness in the results, outputs were generated with greedy search.

| calm2-7b-chat | calm2-7b-chat-dpo |
| ---- | ---- | 
| 2.67 | 2.85 |
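
The greedy decoding mentioned above could be reproduced along the following lines. This is a minimal sketch: the helper names `build_prompt` and `greedy_generation_kwargs` are hypothetical, and `max_new_tokens=300` is carried over from the usage example rather than taken from the evaluation setup, which this card does not specify.

```python
def build_prompt(user_message):
    # Same USER/ASSISTANT layout as the usage example in this card.
    return f"USER: {user_message}\nASSISTANT: "


def greedy_generation_kwargs(max_new_tokens=300):
    # With do_sample=False and num_beams=1, `generate` picks the most
    # probable token at every step, so repeated runs give the same output.
    return {"max_new_tokens": max_new_tokens, "do_sample": False, "num_beams": 1}


# Usage with the model and tokenizer loaded as in the usage example:
#   token_ids = tokenizer.encode(build_prompt("..."), return_tensors="pt")
#   output_ids = model.generate(
#       input_ids=token_ids.to(model.device), **greedy_generation_kwargs()
#   )
```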


### Japanese MT-Bench

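Both calm2-7b-chat-dpo and calm2-7b-chat were evaluated with the following sentence as the system prompt (system_message).

"以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。" (In English: "The following is a combination of an instruction that describes a task and an input with context. Write a response that appropriately satisfies the request.")

This system prompt is taken unchanged from [the one used to evaluate stabilityai/japanese-stablelm-instruct-alpha-7b](https://github.com/Stability-AI/FastChat/blob/dfb653d2cadd16017b66bbc3a25cf361031f2da3/fastchat/conversation.py#L364).
All other decoding parameters were left at their defaults, so the results include some randomness.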

| | calm2-7b-chat | calm2-7b-chat-dpo |
| ---- | ---- | ---- |
| Average | 6.1 | 6.7 |
| extraction | 4.1 | 5.4 |
| humanities | 8.2 | 8.4 |
| reasoning | 3.9 | 4.3 |
| roleplay | 6.4 | 7.0 |
| stem | 6.3 | 6.2 |
| writing | 7.7 | 9.1 |
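
The average row in the table above appears consistent with an unweighted mean of the six listed categories. That is an observation from the numbers, not something the card states, and the short check below simply recomputes it from the table values.

```python
from statistics import mean

# Per-category scores copied from the table above:
# (calm2-7b-chat, calm2-7b-chat-dpo)
scores = {
    "extraction": (4.1, 5.4),
    "humanities": (8.2, 8.4),
    "reasoning": (3.9, 4.3),
    "roleplay": (6.4, 7.0),
    "stem": (6.3, 6.2),
    "writing": (7.7, 9.1),
}

base_avg = mean(pair[0] for pair in scores.values())
dpo_avg = mean(pair[1] for pair in scores.values())

# Rounded to one decimal place, as in the table.
print(round(base_avg, 1), round(dpo_avg, 1))  # prints: 6.1 6.7
```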

## Releases

1.0: v1 release (Jan 24, 2024)

## Author

Yuu Jinnai (jinnai_yu@cyberagent.co.jp), Standing on the shoulders of giants

## Reference
For details of this model, please refer to the following paper.

[Yuu Jinnai. 2024. Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models?. In Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP, pages 48–64, Bangkok, Thailand. Association for Computational Linguistics.](https://aclanthology.org/2024.c3nlp-1.5/)

```tex
@inproceedings{jinnai-2024-cross,
    title = "Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models?",
    author = "Jinnai, Yuu",
    editor = "Prabhakaran, Vinodkumar  and
      Dev, Sunipa  and
      Benotti, Luciana  and
      Hershcovich, Daniel  and
      Cabello, Laura  and
      Cao, Yong  and
      Adebara, Ife  and
      Zhou, Li",
    booktitle = "Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.c3nlp-1.5",
    pages = "48--64",
}
```