Update README.md
@@ -15,10 +15,25 @@ license_name: llama3
license_link: LICENSE
---

> Update @ 2024.04.24: Release Llama-3-Open-Ko-8B model & [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)

## Model Details

**Llama-3-Open-Ko-8B**

Llama-3-Open-Ko-8B is a continued-pretrained language model based on Llama-3-8B.

This model is trained entirely on publicly available resources, using 60GB+ of deduplicated texts.

With the new Llama-3 tokenizer, pretraining used 17.7B+ tokens, slightly more than the same corpus yields under the Korean tokenizer (the Llama-2-Ko tokenizer).
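
To see the tokenizer difference concretely, the sketch below counts tokens for one Korean sentence under both tokenizers. The Llama-2-Ko repo ID and the sample sentence are illustrative assumptions, not part of this card:

```python
# Rough token-count comparison; exact counts vary with the text.
from transformers import AutoTokenizer

text = "안녕하세요, 오늘 날씨가 정말 좋네요."

llama3_tok = AutoTokenizer.from_pretrained("beomi/Llama-3-Open-Ko-8B")
llama2ko_tok = AutoTokenizer.from_pretrained("beomi/llama-2-ko-7b")  # assumed repo ID

print("Llama-3 tokenizer   :", len(llama3_tok(text)["input_ids"]))
print("Llama-2-Ko tokenizer:", len(llama2ko_tok(text)["input_ids"]))
```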

Training was done on a TPUv5e-256, with the warm support of Google's TRC program.
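
The checkpoint loads like any other Llama-3-family model in 🤗 Transformers. A minimal usage sketch follows; the prompt and generation settings are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beomi/Llama-3-Open-Ko-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16GB of accelerator memory for 8B params in bf16
    device_map="auto",
)

# This is a plain continued-pretrained base model, so use raw text completion
# rather than a chat template.
inputs = tokenizer("대한민국의 수도는", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```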

**Note for [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)**

Applying the idea from the [Chat Vector paper](https://arxiv.org/abs/2310.04799), I released an instruction model named [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview).

It is NOT fine-tuned on any Korean instruction set (hence the `preview` tag), but it makes a great starting point for creating new Chat/Instruct models.
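
For reference, the Chat Vector recipe is roughly: compute the weight delta between an instruct model and its base, then add that delta to the continued-pretrained model. Below is a minimal sketch of that idea only; it is NOT the exact procedure behind the preview release, and the repo IDs, dtype, and output path are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the three checkpoints (roughly 50GB of CPU RAM in bf16).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16)
inst = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16)
ko = AutoModelForCausalLM.from_pretrained("beomi/Llama-3-Open-Ko-8B", torch_dtype=torch.bfloat16)

# chat vector = instruct - base, added parameter-by-parameter to the Korean model.
# Llama-3-Open-Ko keeps the Llama-3 vocabulary, so embedding shapes line up.
with torch.no_grad():
    base_sd, inst_sd = base.state_dict(), inst.state_dict()
    for name, tensor in ko.state_dict().items():
        tensor += inst_sd[name] - base_sd[name]  # in-place, updates the model weights

ko.save_pretrained("./Llama-3-Open-Ko-8B-with-chat-vector")  # arbitrary output path
```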

**Meta Llama-3**

@@ -54,7 +69,7 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
<tr>
<td rowspan="2" >Llama-3-Open-Ko
</td>
<td rowspan="2" >Same as *Open-Solar-Ko Dataset
</td>
<td>8B
</td>
@@ -62,19 +77,21 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
</td>
<td>Yes
</td>
<td rowspan="2" >17.7B+
</td>
<td>Jun, 2023
</td>
</tr>
</table>

*You can find the dataset list here: https://huggingface.co/beomi/OPEN-SOLAR-KO-10.7B/tree/main/corpus

**Model Release Date** 2024.04.24.

**Status** This is a static model trained on an offline dataset.

**License** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)

## Intended Use

@@ -122,20 +139,22 @@ Please see the Responsible Use Guide available at [http://llama.meta.com/respons

**Llama-3-Open-Ko**

```
@article{llama3openko,
  title={Llama-3-Open-Ko},
  author={L, Junbum},
  year={2024},
  url={https://huggingface.co/beomi/Llama-3-Open-Ko-8B}
}
```

**Original Llama-3**

```
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```