LASSL bert-ko-base
How to use
# Load the pretrained encoder and its matching tokenizer from the Hugging Face Hub.
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("lassl/bert-ko-base")
tokenizer = AutoTokenizer.from_pretrained("lassl/bert-ko-base")
Evaluation
Evaluation results will be released soon.
Corpora
This model was trained on 702,437 examples (3,596,465,664 tokens in total), extracted from the corpora listed below. For details of the training setup, see config.json.
corpora/
├── [707M] kowiki_latest.txt
├── [ 26M] modu_dialogue_v1.2.txt
├── [1.3G] modu_news_v1.1.txt
├── [9.7G] modu_news_v2.0.txt
├── [ 15M] modu_np_v1.1.txt
├── [1008M] modu_spoken_v1.2.txt
├── [6.5G] modu_written_v1.0.txt
└── [413M] petition.txt
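To put the figures above in perspective, the following is a minimal sketch that totals the listed file sizes and computes the average number of tokens per example. It assumes the bracketed sizes use binary units (1G = 1024M), as tools such as `tree -h` typically print them; the sizes are rounded, so the total is only approximate. The helper name `parse_sizes` is hypothetical and not part of this repository.

```python
import re

# File sizes copied from the corpora listing above.
LISTING = """\
[707M] kowiki_latest.txt
[ 26M] modu_dialogue_v1.2.txt
[1.3G] modu_news_v1.1.txt
[9.7G] modu_news_v2.0.txt
[ 15M] modu_np_v1.1.txt
[1008M] modu_spoken_v1.2.txt
[6.5G] modu_written_v1.0.txt
[413M] petition.txt
"""

# Assumption: binary units, so 1G = 1024M. The listing does not state this.
UNIT_TO_MIB = {"M": 1.0, "G": 1024.0}


def parse_sizes(listing):
    """Return a {filename: size in MiB} mapping parsed from the listing text."""
    sizes = {}
    for match in re.finditer(r"\[\s*([\d.]+)([MG])\]\s+(\S+)", listing):
        value, unit, name = match.groups()
        sizes[name] = float(value) * UNIT_TO_MIB[unit]
    return sizes


sizes_mib = parse_sizes(LISTING)
total_gib = sum(sizes_mib.values()) / 1024

# Counts as stated in the text above.
num_examples = 702_437
num_tokens = 3_596_465_664
tokens_per_example = num_tokens / num_examples

print(f"files: {len(sizes_mib)}")
print(f"approximate corpus size: {total_gib:.1f} GiB")
print(f"average tokens per example: {tokens_per_example:.1f}")
```

Under these assumptions the eight files add up to a little under 20 GiB of raw text, and the stated counts work out to roughly 5,120 tokens per example on average.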