---
library_name: transformers
license: apache-2.0
datasets:
- bigscience-data/roots_zh-tw_wikipedia
- bigscience-data/roots_en_wikipedia
language:
- zh
---

# Model Card for Chinese-OpenELM-270M

Fine-tuned from apple/OpenELM-270M:

- Extended the vocabulary from 32,000 to 75,873 tokens with a SentencePiece BPE model trained on bigscience-data/roots_zh-tw_wikipedia; the new token embeddings were initialized with the average of the pre-trained embeddings (see the sketches below).
- Continually pre-trained on a mix of bigscience-data/roots_zh-tw_wikipedia and bigscience-data/roots_en_wikipedia.
- Evaluation perplexity ≈ 1.664, measured on a held-out 3% split of the training data (see the perplexity sketch below).
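
The vocabulary extension could be reproduced along the following lines. This is a minimal sketch, not the exact script used for this model: the corpus file name, `model_prefix`, and `vocab_size` below are assumptions, and the step that merges the new pieces into the base 32,000-token tokenizer is omitted.

```python
import sentencepiece as spm

# A minimal sketch: train a BPE model on a plain-text dump of the
# zh-tw Wikipedia corpus. The file name and vocab_size are assumptions;
# the resulting pieces still have to be merged into the base 32,000-token
# OpenELM tokenizer to reach the final 75,873 tokens.
spm.SentencePieceTrainer.train(
    input="roots_zh_tw_wikipedia.txt",  # hypothetical corpus dump
    model_prefix="zh_tw_bpe",
    model_type="bpe",
    vocab_size=48000,           # illustrative; overlap with the base vocab shrinks the final count
    character_coverage=0.9995,  # a common setting for CJK corpora
)
```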
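Average-embedding initialization of the new rows can be expressed with the standard transformers API. A minimal sketch, assuming the extended tokenizer has already been saved (the tokenizer path is hypothetical):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("apple/OpenELM-270M", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("path/to/extended-tokenizer")  # hypothetical path

old_size = model.get_input_embeddings().weight.shape[0]  # 32,000 in the base model
model.resize_token_embeddings(len(tokenizer))  # grows the input (and, if tied, output) embeddings

with torch.no_grad():
    emb = model.get_input_embeddings().weight
    # Initialize every new row with the mean of the pre-trained embeddings,
    # so new tokens start near the centre of the existing embedding space.
    emb[old_size:] = emb[:old_size].mean(dim=0)
```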
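The reported perplexity is the exponential of the mean cross-entropy loss on the held-out split. A sketch of the per-sequence computation (the loop over the full 3% split and the averaging across batches are omitted):

```python
import math
import torch

@torch.no_grad()
def sequence_perplexity(model, tokenizer, text: str) -> float:
    # With labels equal to input_ids, transformers computes the shifted
    # next-token cross-entropy internally; perplexity = exp(mean loss).
    enc = tokenizer(text, return_tensors="pt")
    loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())
```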