SR committed
Commit d6444c6 • 1 Parent(s): ea62bb9
update1
README.md CHANGED
@@ -3,11 +3,9 @@ language:
 - th
 - en
 license: llama3
-datasets:
-- airesearch/concat_six_dataset_th_en
 ---
 
-# LLaMa3-8b-WangchanX-sft-
+# LLaMa3-8b-WangchanX-sft-Full
 
 Built with Meta Llama 3 (Fine tuning with Qlora)
 
@@ -22,6 +20,7 @@ Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright ©
 ## Train Example
 
 Train WangchanX pipeline: [Colab](https://colab.research.google.com/github/vistec-AI/WangchanX/blob/main/notebooks/Train_WangchanX_pipeline.ipynb)
+This model was trained on around 400 GB of public datasets.
 
 ## Inference Example
 
@@ -34,7 +33,7 @@ import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
 # Model path
-path = "airesearch/LLaMa3-8b-WangchanX-sft-
+path = "airesearch/LLaMa3-8b-WangchanX-sft-Full"
 
 # Device
 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
@@ -48,7 +47,10 @@ model = AutoModelForCausalLM.from_pretrained(path, device_map="auto")
 
 ```python
 messages = [
-    {"role": "
+    {"role": "system", "content": "You are a helpful, respectful and honest assistant. Your answers should not include any harmful,
+    unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially
+    unbiased and positive in nature. If you don’t know the answer to a question, please don’t share false information. please answer the question in Thai."},
+    {"role": "user", "content": "บอกเทคนิคการเรียนที่มีประสิทธิภาพ"},
 ]
 ```
 
@@ -64,7 +66,7 @@ print(tokenizer.decode(tokenized_chat[0]))
 <br>
 <pre lang="markdown">
 <|user|>
-
+บอกเทคนิคการเรียนที่มีประสิทธิภาพ<|end_of_text|>
 <|assistant|></pre>
 </details>
 
@@ -80,7 +82,7 @@ print(tokenizer.decode(outputs[0]))
 <br>
 <pre lang="markdown">
 <|user|>
-
+บอกเทคนิคการเรียนที่มีประสิทธิภาพ<|end_of_text|>
 <|assistant|>
-
-</details>
+ใช้เวลาให้เป็นประโยชน์ ลองทำสมาธิก่อนค่อยอ่านหนังสือ ดูว่าวิธีการศึกษาทำได้ดีที่สุดแค่ไหน ให้จดโน้ตขณะฟังเลกเชอร์ในห้องเรียน ทำข้อสอบฝึกตัวเอง ถ้าตอบผิดให้ค้นหาเพิ่มเติมเอง อย่าลืมอ่านทบทวนก่อนสอบ!
+</details>
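The `<pre>` blocks added in this commit show how the `messages` list is rendered into the model's prompt format (the Thai user turn "บอกเทคนิคการเรียนที่มีประสิทธิภาพ" means roughly "suggest effective study techniques"). As a rough illustration of that layout only — the authoritative rendering comes from `tokenizer.apply_chat_template`, and the `render_chat` helper below is invented for this sketch — the mapping can be reproduced with plain string handling:

```python
def render_chat(messages):
    """Hypothetical re-creation of the prompt layout shown in the README's
    <pre> blocks; the real formatting comes from tokenizer.apply_chat_template.
    """
    parts = []
    for message in messages:
        if message["role"] == "user":
            # Each user turn becomes a <|user|> block ending in <|end_of_text|>.
            parts.append("<|user|>\n" + message["content"] + "<|end_of_text|>")
    # A trailing <|assistant|> header cues the model to generate its reply.
    parts.append("<|assistant|>")
    return "\n".join(parts)

messages = [
    {"role": "user", "content": "บอกเทคนิคการเรียนที่มีประสิทธิภาพ"},
]
print(render_chat(messages))
# <|user|>
# บอกเทคนิคการเรียนที่มีประสิทธิภาพ<|end_of_text|>
# <|assistant|>
```

Note that the rendered prompts in the diff show only the user turn; how the system message is folded into the template is not visible here, so the sketch ignores non-user roles.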