---
pipeline_tag: text-generation
---
# Model Card for Breeze-7B-Instruct-v0.1
Breeze-7B-Instruct-v0.1 is a 7-billion-parameter language model built from Mistral-7B and tailored for Traditional Chinese (TC).
The model extends the vocabulary with an additional 30k Traditional Chinese tokens to better adapt to TC and to speed up inference, doubling the inference speed of the original tokenizer on TC text.
Breeze-7B-Instruct-v0.1 performs well on both English and Traditional Chinese benchmarks.
This model outperforms Taiwan-LLM-7B-v2.1-chat, Taiwan-LLM-13B-v2.0-chat, and Yi-6B-Chat on major TC benchmarks we tested, and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.
*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Po-Chun Hsu 許博竣, Feng-Ting Liao 廖峰挺, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*
## Features
- Expanded vocabulary for Traditional Chinese, from 32k to 62k tokens (the first successful vocabulary expansion for Traditional Chinese; a quick check is sketched after this list)
- Multi-turn dialogue without special handling for harmfulness
- 8k context length
- Grouped-query and sliding-window attention
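As a quick sanity check of the expanded vocabulary, the tokenizer size and per-text token counts can be compared against the base Mistral tokenizer. This is a minimal sketch, not part of the official card; the printed sizes should roughly match the 32k/62k figures above:
```python
from transformers import AutoTokenizer

# Compare the Breeze tokenizer against the base Mistral tokenizer.
breeze_tok = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
mistral_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

print(len(breeze_tok), len(mistral_tok))  # roughly 62k vs. 32k tokens

# Fewer tokens per Traditional Chinese character means fewer decoding
# steps per character, which is where the inference speedup comes from.
text = "繁體中文的分詞效率測試"
print(len(breeze_tok.tokenize(text)), len(mistral_tok.tokenize(text)))
```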
## Model Details
- **Finetuned from:** [MediaTek-Research/Breeze-7B-Base-v0.1](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v0.1)
- **Model type:** Causal decoder-only transformer language model
- **Language:** English and Traditional Chinese (zh-tw)
## Performance
| **Traditional Chinese Benchmarks:** | TMMLU+ (ACC) | DRCD (EM) | MT-Bench-tw (Score) |
|-------------------------------------------------------------------------------------------------------|--------|------|-------------|
| Breeze-7B-Base-v0.1 | | | |
| Breeze-7B-Instruct-v0.1 | | | |
| mistralai/Mistral-7B-v0.1 | | | |
| mistralai/Mistral-7B-Instruct-v0.1 | | | |
| yentinglin/Taiwan-LLM-7B-v2.1-base | | | |
| yentinglin/Taiwan-LLM-7B-v2.1-chat | | | |
| yentinglin/Taiwan-LLM-13B-v2.0-base | | | |
| yentinglin/Taiwan-LLM-13B-v2.0-chat | | | |
| 01-ai/Yi-6B-Base | | | |
| 01-ai/Yi-6B-Chat | | | |
| 01-ai/Yi-34B-Base | | | |
| 01-ai/Yi-34B-Chat | | | |
| Qwen/Qwen-7B | | | |
| Qwen/Qwen-7B-Chat | | | |
| Qwen/Qwen-14B | | | |
| Qwen/Qwen-14B-Chat | | | |
| gpt-3.5-turbo-0613 | | | |
| **English Benchmarks:** | MMLU (ACC) | MT-Bench (Score) |
|-------------------------------------------------------------------------------------------------------|--------|------|
| Breeze-7B-Base-v0.1 | | |
| Breeze-7B-Instruct-v0.1 | | |
| mistralai/Mistral-7B-v0.1 | | |
| mistralai/Mistral-7B-Instruct-v0.1 | | |
| yentinglin/Taiwan-LLM-7B-v2.1-base | | |
| yentinglin/Taiwan-LLM-7B-v2.1-chat | | |
| yentinglin/Taiwan-LLM-13B-v2.0-base | | |
| yentinglin/Taiwan-LLM-13B-v2.0-chat | | |
| 01-ai/Yi-6B-Base | | |
| 01-ai/Yi-6B-Chat | | |
| 01-ai/Yi-34B-Base | | |
| 01-ai/Yi-34B-Chat | | |
| Qwen/Qwen-7B                               |        |      |
| Qwen/Qwen-7B-Chat                          |        |      |
| Qwen/Qwen-14B                              |        |      |
| Qwen/Qwen-14B-Chat                         |        |      |
| gpt-3.5-turbo-0613 | | |
| **Inference Speed Test:**                  | Speed (char/sec) |
|-------------------------------------------------------------------------------------------------------|--------|
| Breeze-7B-Instruct-v0.1 | |
| mistralai/Mistral-7B-Instruct-v0.1 | |
| yentinglin/Taiwan-LLM-7B-v2.1-chat | |
| yentinglin/Taiwan-LLM-13B-v2.0-chat | |
| 01-ai/Yi-6B-Chat | |
| 01-ai/Yi-34B-Chat | |
| Qwen/Qwen-7B-Chat                          |        |
| Qwen/Qwen-14B-Chat                         |        |
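For reference, char/sec throughput of the kind reported above can be estimated with a simple timing run. This is a minimal sketch, assuming a `model` and `tokenizer` loaded as in the next section; the exact benchmark setup behind this table is not specified here:
```python
import time

prompt = "請簡單介紹台灣的夜市文化。"  # sample Traditional Chinese prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.time()
outputs = model.generate(**inputs, max_new_tokens=256)
elapsed = time.time() - start

# Count only newly generated characters, excluding the prompt.
generated = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"{len(generated) / elapsed:.1f} char/sec")
```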
## Use in Transformers
First install direct dependencies:
```bash
pip install transformers torch accelerate
```
For faster inference with FlashAttention-2, also install the following dependencies:
```bash
pip install packaging ninja
pip install flash-attn
```
Then load the model in transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and the model; bfloat16 halves the memory footprint.
tokenizer = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "MediaTek-Research/Breeze-7B-Instruct-v0.1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # optional; requires flash-attn
)
```
The structure of the query template follows that of Mistral-7B-Instruct, as shown below.
```txt
<s> SYS_PROMPT [INST] QUERY1 [/INST] RESPONSE1 [INST] QUERY2 [/INST]
```
where `SYS_PROMPT`, `QUERY1`, `RESPONSE1`, and `QUERY2` can be provided by the user.
The suggested default `SYS_PROMPT` is
```txt
You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.
```
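Putting the pieces together, below is a minimal generation sketch using the loaded `model` and `tokenizer` and the template above. The query string is a made-up example, and note that the tokenizer prepends the `<s>` token automatically, so it is not included in the prompt string:
```python
SYS_PROMPT = (
    "You are a helpful AI assistant built by MediaTek Research. "
    "The user you are helping speaks Traditional Chinese and comes from Taiwan."
)
query = "請簡單介紹台北101。"  # example query: "Briefly introduce Taipei 101."

# Assemble the Mistral-style instruct prompt shown above
# (the tokenizer adds the leading <s> by itself).
prompt = f"{SYS_PROMPT} [INST] {query} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```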