Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ resulting in a doubling of the original tokenizer's inference speed.
 To the best of our knowledge, this is the first work on vocabulary expansion in TC.
 This model uses 250GB of TC data for continued pre-training and further uses over 1M instances for fine-tuning.
 Breeze-7B-Instruct-v0.1 performs well on both EN and TC benchmarks.
-This model outperforms Taiwan-LLM-7B-v2.1-chat, Taiwan-LLM-13B-v2.0-chat, and Yi-6B-Chat on
+This model outperforms Taiwan-LLM-7B-v2.1-chat, Taiwan-LLM-13B-v2.0-chat, and Yi-6B-Chat on all TC benchmarks
 and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.

 *A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*
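The hunk's context line attributes the claimed doubling of inference speed to the expanded TC vocabulary: TC text segments into fewer tokens, so generation takes fewer decoding steps. A minimal sketch of how one might check the token-count difference with Hugging Face `transformers` is below; the Breeze repository id (`MediaTek-Research/Breeze-7B-Instruct-v0_1`) and the sample sentence are assumptions, not taken from this diff.

```python
from transformers import AutoTokenizer

# Assumed Hugging Face repo ids -- adjust to the actual model paths.
BREEZE_ID = "MediaTek-Research/Breeze-7B-Instruct-v0_1"   # tokenizer with expanded TC vocabulary (assumed id)
MISTRAL_ID = "mistralai/Mistral-7B-Instruct-v0.1"         # original base tokenizer

# Any short Traditional Chinese sample works for the comparison.
text = "繁體中文的分詞效率測試。"

breeze_tok = AutoTokenizer.from_pretrained(BREEZE_ID)
mistral_tok = AutoTokenizer.from_pretrained(MISTRAL_ID)

# Fewer tokens per TC sentence means fewer decoding steps at inference time,
# which is where the roughly 2x speed-up described in the README would come from.
print("Breeze tokens: ", len(breeze_tok.encode(text)))
print("Mistral tokens:", len(mistral_tok.encode(text)))
```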