---
pipeline_tag: text-generation
---

# Model Card for Breeze-7B-Instruct-v0.1

Breeze-7B-Instruct-v0.1 is a 7-billion-parameter language model built on Mistral-7B and tailored for Traditional Chinese (TC).
It adds roughly 30k TC tokens to the vocabulary to better cover TC text, roughly doubling inference speed on TC content compared with the original tokenizer.
Breeze-7B-Instruct-v0.1 performs well on both English and TC benchmarks:
it outperforms Taiwan-LLM-7B-v2.1-chat, Taiwan-LLM-13B-v2.0-chat, and Yi-6B-Chat on the major TC benchmarks we tested, and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.

*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Po-Chun Hsu 許博竣, Feng-Ting Liao 廖峰挺, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*

## Features

- Expanded tokenizer vocabulary for Traditional Chinese, from 32k to 62k tokens (the first successful vocabulary expansion for Traditional Chinese)
- Multi-turn dialogue, without special handling for harmfulness
- 8k context length
- Grouped-query and sliding-window attention (see the sketch below)
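
As a quick sanity check on these properties, the sketch below loads the released tokenizer and config and prints the relevant values. The comparison against `mistralai/Mistral-7B-v0.1` and the sample sentence are our own illustration, not part of the official evaluation.

```python
from transformers import AutoConfig, AutoTokenizer

# Compare how many tokens each tokenizer needs for the same Traditional Chinese text.
breeze_tok = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
mistral_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

text = "歡迎來到台灣，這裡的夜市小吃非常有名。"  # illustrative TC sentence
print("vocab size:", len(breeze_tok), "vs", len(mistral_tok))      # ~62k vs ~32k
print("token count:", len(breeze_tok(text)["input_ids"]),
      "vs", len(mistral_tok(text)["input_ids"]))                   # fewer tokens per character

# Inspect the attention and context features listed above.
config = AutoConfig.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
print("context length:", config.max_position_embeddings)
print("KV heads (grouped-query attention):", config.num_key_value_heads)
print("sliding window:", config.sliding_window)
```

Because each Traditional Chinese character maps to fewer tokens, the model emits more characters per generated token, which is where the speedup reported in the inference table below comes from.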

## Model Details
- **Finetuned from:** [MediaTek-Research/Breeze-7B-Base-v0.1](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v0.1)
- **Model type:** Causal decoder-only transformer language model
- **Language:** English and Traditional Chinese (zh-tw)

## Performance


| **Traditional Chinese Benchmarks:**                                                             | TMMLU+ (ACC) | DRCD (EM) | MT-Bench-tw (Score) |
|-------------------------------------------------------------------------------------------------------|--------|------|-------------|
| Breeze-7B-Base-v0.1                                                                                   |        |      |             |
| Breeze-7B-Instruct-v0.1                                                                               |        |      |             |
| mistralai/Mistral-7B-v0.1                                                                                       |        |      |             |
| mistralai/Mistral-7B-Instruct-v0.1                                                                              |        |      |             |
| yentinglin/Taiwan-LLM-7B-v2.1-base                                                                    |        |      |             |
| yentinglin/Taiwan-LLM-7B-v2.1-chat                                                                    |        |      |             |
| yentinglin/Taiwan-LLM-13B-v2.0-base                                                                   |        |      |             |
| yentinglin/Taiwan-LLM-13B-v2.0-chat                                                                   |        |      |             |
| 01-ai/Yi-6B-Base                                                                                       |        |      |             |
| 01-ai/Yi-6B-Chat                                                                                       |        |      |             |
| 01-ai/Yi-34B-Base                                                                                       |        |      |             |
| 01-ai/Yi-34B-Chat                                                                                       |        |      |             |
| Qwen/Qwen-7B                                                                               |        |      |             |
| Qwen/Qwen-7B-Chat                                                                          |        |      |             |
| Qwen/Qwen-14B                                                                               |        |      |             |
| Qwen/Qwen-14B-Chat                                                                          |        |      |             |
| gpt-3.5-turbo-0613                                                                                |        |      |             |




| **English Benchmarks:**                                                                                                 | MMLU (ACC) | MT-Bench (Score) |
|-------------------------------------------------------------------------------------------------------|--------|------|
| Breeze-7B-Base-v0.1                                                                                   |        |      |
| Breeze-7B-Instruct-v0.1                                                                               |        |      |
| mistralai/Mistral-7B-v0.1                                                                                       |        |      |
| mistralai/Mistral-7B-Instruct-v0.1                                                                              |        |      |
| yentinglin/Taiwan-LLM-7B-v2.1-base                                                                    |        |      |
| yentinglin/Taiwan-LLM-7B-v2.1-chat                                                                    |        |      |
| yentinglin/Taiwan-LLM-13B-v2.0-base                                                                   |        |      |
| yentinglin/Taiwan-LLM-13B-v2.0-chat                                                                   |        |      |
| 01-ai/Yi-6B-Base                                                                                       |        |      |
| 01-ai/Yi-6B-Chat                                                                                       |        |      |
| 01-ai/Yi-34B-Base                                                                                       |        |      |
| 01-ai/Yi-34B-Chat                                                                                       |        |      |
| Qwen/Qwen-7B                                                                               |        |      |
| Qwen/Qwen-7B-Chat                                                                          |        |      |
| Qwen/Qwen-14B                                                                               |        |      |
| Qwen/Qwen-14B-Chat                                                                          |        |      |
| gpt-3.5-turbo-0613                                                                                |        |      |



| **Inference Speed Test:**                                                                                                 | Speed (char/sec) |
|-------------------------------------------------------------------------------------------------------|--------|
| Breeze-7B-Instruct-v0.1                                                                               |        |
| mistralai/Mistral-7B-Instruct-v0.1                                                                              |        |
| yentinglin/Taiwan-LLM-7B-v2.1-chat                                                                    |        |
| yentinglin/Taiwan-LLM-13B-v2.0-chat                                                                   |        |
| 01-ai/Yi-6B-Chat                                                                                       |        |
| 01-ai/Yi-34B-Chat                                                                                       |        |
| Qwen/Qwen-7B-Chat                                                                          |        |
| Qwen/Qwen-14B-Chat                                                                          |        |


## Use in Transformers

First, install the direct dependencies:
```bash
pip install transformers torch accelerate
```
If you want faster inference using FlashAttention-2, install these additional dependencies:
```bash
pip install packaging ninja
pip install flash-attn
```
Then load the tokenizer and model in `transformers`:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "MediaTek-Research/Breeze-7B-Instruct-v0.1",  # the model id is the first positional argument
    device_map="auto",
    torch_dtype=torch.bfloat16,
    use_flash_attention_2=True,  # optional; on newer transformers use attn_implementation="flash_attention_2"
)
```

The structure of the query template follows that of Mistral-7B-Instruct, as shown below.
```txt
<s> SYS_PROMPT   [INST] QUERY1 [/INST] RESPONSE1 [INST] QUERY2 [/INST]
```
where `SYS_PROMPT`, `QUERY1`, `RESPONSE1`, and `QUERY2` can be provided by the user.

The suggested default `SYS_PROMPT` is 
```txt
You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.
```
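
Putting the pieces together, here is a minimal generation sketch that follows the template above, reusing the `tokenizer` and `model` loaded earlier; the sample query and `max_new_tokens` value are illustrative choices, not official recommendations. Note that the tokenizer prepends `<s>` automatically, so the prompt string starts directly with the system prompt.

```python
sys_prompt = ("You are a helpful AI assistant built by MediaTek Research. "
              "The user you are helping speaks Traditional Chinese and comes from Taiwan.")
query = "請用三句話介紹台灣。"  # "Introduce Taiwan in three sentences."
prompt = f"{sys_prompt}  [INST] {query} [/INST]"

# Tokenize, generate, then decode only the newly generated tokens.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

For multi-turn dialogue, append each previous response and the next query in the `[INST] ... [/INST]` pattern shown in the template.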