Update README.md
@@ -15,10 +15,25 @@ license_name: llama3
license_link: LICENSE
---

> Update @ 2024.04.24: Release Llama-3-Open-Ko-8B model & [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)

## Model Details

**Llama-3-Open-Ko-8B**

Llama-3-Open-Ko-8B is a continued-pretrained language model based on Llama-3-8B.

This model is trained entirely on publicly available resources, using 60GB+ of deduplicated texts.

With the new Llama-3 tokenizer, pretraining used 17.7B+ tokens, slightly more than the same corpus yields under the Korean tokenizer (the Llama-2-Ko tokenizer).
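
To see the tokenizer difference concretely, the sketch below counts tokens for one Korean sentence under both tokenizers. The Llama-2-Ko repo ID and the sample sentence are illustrative assumptions, not part of this card:

```python
# Rough token-count comparison; exact counts vary with the text.
from transformers import AutoTokenizer

text = "안녕하세요, 오늘 날씨가 정말 좋네요."

llama3_tok = AutoTokenizer.from_pretrained("beomi/Llama-3-Open-Ko-8B")
llama2ko_tok = AutoTokenizer.from_pretrained("beomi/llama-2-ko-7b")  # assumed repo ID

print("Llama-3 tokenizer   :", len(llama3_tok(text)["input_ids"]))
print("Llama-2-Ko tokenizer:", len(llama2ko_tok(text)["input_ids"]))
```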

Training was done on a TPUv5e-256, with the warm support of Google's TRC program.
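
The checkpoint loads like any other Llama-3-family model in 🤗 Transformers. A minimal usage sketch follows; the prompt and generation settings are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beomi/Llama-3-Open-Ko-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16GB of accelerator memory for 8B params in bf16
    device_map="auto",
)

# This is a plain continued-pretrained base model, so use raw text completion
# rather than a chat template.
inputs = tokenizer("대한민국의 수도는", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```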

**Note for [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)**

Applying the idea from the [Chat Vector paper](https://arxiv.org/abs/2310.04799), I released an instruction model named [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview).

It is NOT fine-tuned on any Korean instruction set (hence the `preview` tag), but it makes a great starting point for creating new Chat/Instruct models.
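
For reference, the Chat Vector recipe is roughly: compute the weight delta between an instruct model and its base, then add that delta to the continued-pretrained model. Below is a minimal sketch of that idea only; it is NOT the exact procedure behind the preview release, and the repo IDs, dtype, and output path are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the three checkpoints (roughly 50GB of CPU RAM in bf16).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16)
inst = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16)
ko = AutoModelForCausalLM.from_pretrained("beomi/Llama-3-Open-Ko-8B", torch_dtype=torch.bfloat16)

# chat vector = instruct - base, added parameter-by-parameter to the Korean model.
# Llama-3-Open-Ko keeps the Llama-3 vocabulary, so embedding shapes line up.
with torch.no_grad():
    base_sd, inst_sd = base.state_dict(), inst.state_dict()
    for name, tensor in ko.state_dict().items():
        tensor += inst_sd[name] - base_sd[name]  # in-place, updates the model weights

ko.save_pretrained("./Llama-3-Open-Ko-8B-with-chat-vector")  # arbitrary output path
```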

**Meta Llama-3**

@@ -54,7 +69,7 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
<tr>
<td rowspan="2" >Llama-3-Open-Ko
</td>
<td rowspan="2" >Same as *Open-Solar-Ko Dataset
</td>
<td>8B
</td>
@@ -62,19 +77,21 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
</td>
<td>Yes
</td>
<td rowspan="2" >17.7B+
</td>
<td>Jun, 2023
</td>
</tr>
</table>

*You can find the dataset list here: https://huggingface.co/beomi/OPEN-SOLAR-KO-10.7B/tree/main/corpus

**Model Release Date** 2024.04.24.

**Status** This is a static model trained on an offline dataset.

**License** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)

## Intended Use

@@ -122,20 +139,22 @@ Please see the Responsible Use Guide available at [http://llama.meta.com/respons

**Llama-3-Open-Ko**

```
@article{llama3openko,
  title={Llama-3-Open-Ko},
  author={L, Junbum},
  year={2024},
  url={https://huggingface.co/beomi/Llama-3-Open-Ko-8B}
}
```

**Original Llama-3**

```
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```