Update README.md
README.md
CHANGED
@@ -23,8 +23,39 @@ This model outperforms Taiwan-LLM-7B-v2.1-chat, Taiwan-LLM-13B-v2.0-chat, and Yi

- **Model type:** Causal decoder-only transformer language model
- **Language:** English and Traditional Chinese (zh-tw)

## Performance

## Use in Transformers

First, install the direct dependencies:
```bash
pip install transformers torch accelerate
```

If you want faster inference using flash-attention 2, you also need to install these dependencies:
```bash
pip install packaging ninja
pip install flash-attn
```
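
Optionally, you can sanity-check that flash-attn built and imports correctly before loading the model (this check is an illustrative suggestion):
```bash
python -c "import flash_attn; print(flash_attn.__version__)"
```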

Then load the model in transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model in bfloat16 and let accelerate place it on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    "MediaTek-Research/Breeze-7B-Instruct-v0.1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # optional, requires flash-attn
)
```
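
The tokenizer for the same checkpoint is loaded analogously; a minimal sketch, reusing the `AutoTokenizer` import above:
```python
# Load the tokenizer that matches the checkpoint loaded above.
tokenizer = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
```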

The structure of the prompt template follows that of Mistral-7B-Instruct, as shown below.
```txt
<s> SYS_PROMPT [INST] QUERY1 [/INST] RESPONSE1 [INST] QUERY2 [/INST] RESPONSE2</s>
```

The suggested default `SYS_PROMPT` is:
```txt
You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.
```
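
Putting the pieces together, the sketch below assembles a single-turn prompt from the template and the default `SYS_PROMPT`, then runs generation with the `model` and `tokenizer` loaded above. The example query, `max_new_tokens`, and the assumption that the tokenizer prepends the `<s>` (BOS) token itself are illustrative choices, not recommendations from the model card.
```python
# Default system prompt from the card; the query is an illustrative placeholder.
sys_prompt = ("You are a helpful AI assistant built by MediaTek Research. "
              "The user you are helping speaks Traditional Chinese and comes from Taiwan.")
query = "請簡單介紹一下台灣的夜市文化。"  # "Please briefly introduce Taiwan's night-market culture."

# Single-turn prompt following the template above; the leading <s> is omitted
# here on the assumption that the tokenizer adds the BOS token automatically.
prompt = f"{sys_prompt} [INST] {query} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```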