---
library_name: transformers
license: apache-2.0
datasets:
- bigscience-data/roots_zh-tw_wikipedia
- bigscience-data/roots_en_wikipedia
language:
- zh
---

# Model Card for Chinese-OpenELM-270M

Fine-tuned from [apple/OpenELM-270M](https://huggingface.co/apple/OpenELM-270M):

* Extended the vocabulary from 32,000 to 75,873 tokens with a SentencePiece BPE tokenizer trained on [bigscience-data/roots_zh-tw_wikipedia](https://huggingface.co/datasets/bigscience-data/roots_zh-tw_wikipedia), initializing the new token embeddings with the average of the original embeddings.

* Continually pre-trained on a mix of [bigscience-data/roots_zh-tw_wikipedia](https://huggingface.co/datasets/bigscience-data/roots_zh-tw_wikipedia) and [bigscience-data/roots_en_wikipedia](https://huggingface.co/datasets/bigscience-data/roots_en_wikipedia).

* Evaluation perplexity: 1.6645 (3% of the training data was held out as the evaluation set).
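
The average-embedding initialization from the first bullet can be sketched as follows. This is a minimal, dependency-free illustration of the idea, not the actual training code (which is not part of this card); in practice the same step is typically applied to the model's embedding matrix after calling `resize_token_embeddings` in transformers.

```python
# Illustrative sketch of average-embedding initialization for an extended
# vocabulary. The embedding table is a list of rows (one per token); every
# newly added token starts from the element-wise mean of the existing rows.

def extend_embeddings(old_embeddings, new_vocab_size):
    """Grow an embedding table to new_vocab_size rows.

    old_embeddings: list of rows, each a list of floats.
    New rows are initialized to the average of the old rows.
    """
    old_vocab_size = len(old_embeddings)
    dim = len(old_embeddings[0])
    avg = [sum(row[j] for row in old_embeddings) / old_vocab_size
           for j in range(dim)]
    # Copy avg per new row so each new embedding can be trained independently.
    return old_embeddings + [avg[:] for _ in range(new_vocab_size - old_vocab_size)]

# Toy example: vocabulary grows from 3 to 5 tokens.
table = extend_embeddings([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], 5)
# table[3] == table[4] == [3.0, 4.0], the mean of the original rows
```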
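
For reference, the reported perplexity is the exponential of the mean per-token cross-entropy loss on the held-out set. The snippet below only illustrates that relationship (the actual evaluation script is not part of this card):

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A mean per-token loss of about 0.5095 nats corresponds to a perplexity
# close to the value reported above.
print(round(perplexity([0.5095, 0.5095]), 3))  # → 1.664
```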