lots-o/ko-albert-base-v1


	---
	license: apache-2.0
	language:
	- ko
	---

	# Korean ALBERT

	# Dataset
	- [AI-HUB](https://www.aihub.or.kr/)
	- [National Institute of Korean Language: Modu Corpus (모두의 말뭉치)](https://kli.korean.go.kr/corpus/main/requestMain.do?lang=ko)
	- [Korean News Comments](https://www.kaggle.com/junbumlee/kcbert-pretraining-corpus-korean-news-comments)


	# Evaluation results
	- The fine-tuning code is available at [KcBERT-Finetune](https://github.com/Beomi/KcBERT-finetune).

	| Model | Size | Average Score | NSMC<br/>(acc) | Naver NER<br/>(F1) | PAWS<br/>(acc) | KorNLI<br/>(acc) | KorSTS<br/>(spearman) | Question Pair<br/>(acc) | KorQuAD (Dev)<br/>(EM/F1) |
	|:---------------------|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-------------:|
	| KcELECTRA-base | 475M | 84.84 | 91.71 | 86.90 | 74.80 | 81.65 | 82.65 | 95.78 | 70.60 / 90.11 |
	| KcELECTRA-base-v2022 | 475M | 85.20 | 91.97 | 87.35 | 76.50 | 82.12 | 83.67 | 95.12 | 69.00 / 90.40 |
	| KcBERT-Base | 417M | 79.65 | 89.62 | 84.34 | 66.95 | 74.85 | 75.57 | 93.93 | 60.25 / 84.39 |
	| KcBERT-Large | 1.2G | 81.33 | 90.68 | 85.53 | 70.15 | 76.99 | 77.49 | 94.06 | 62.16 / 86.64 |
	| KoBERT | 351M | 82.21 | 89.63 | 86.11 | 80.65 | 79.00 | 79.64 | 93.93 | 52.81 / 80.27 |
	| XLM-Roberta-Base | 1.03G | 84.01 | 89.49 | 86.26 | 82.95 | 79.92 | 79.09 | 93.53 | 64.70 / 88.94 |
	| HanBERT | 614M | 86.24 | 90.16 | 87.31 | 82.40 | 80.89 | 83.33 | 94.19 | 78.74 / 92.02 |
	| KoELECTRA-Base | 423M | 84.66 | 90.21 | 86.87 | 81.90 | 80.85 | 83.21 | 94.20 | 61.10 / 89.59 |
	| KoELECTRA-Base-v2 | 423M | 86.96 | 89.70 | 87.02 | 83.90 | 80.61 | 84.30 | 94.72 | 84.34 / 92.58 |
	| DistilKoBERT | 108M | 76.76 | 88.41 | 84.13 | 62.55 | 70.55 | 73.21 | 92.48 | 54.12 / 77.80 |
	| ko-albert-base-v1 | 51M | 80.46 | 86.83 | 82.26 | 69.95 | 74.17 | 74.48 | 94.06 | 76.08 / 86.82 |
	| ko-albert-large-v1 | 75M | 82.39 | 86.91 | 83.12 | 76.10 | 76.01 | 77.46 | 94.33 | 77.64 / 87.99 |

	*The size of HanBERT is the sum of the BERT model and the tokenizer DB.

	*These results were obtained using the default configuration settings. Better performance may be achieved with additional hyperparameter tuning.
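
The card does not state how the Average Score column is computed. The listed values for the rows checked are consistent with a mean over the seven tasks in which KorQuAD contributes the mean of its EM and F1. The sketch below reproduces the two ALBERT rows under that assumption; the helper name `average_score` is hypothetical and not part of any library.

```python
def average_score(single_metric_scores, korquad_em, korquad_f1):
    """Mean over tasks, with KorQuAD contributing the mean of its EM and F1.

    Assumption: this is how the Average Score column was derived; the card
    itself does not document the formula.
    """
    korquad = (korquad_em + korquad_f1) / 2
    task_scores = list(single_metric_scores) + [korquad]
    return sum(task_scores) / len(task_scores)


# Scores copied from the table, in column order:
# NSMC, Naver NER, PAWS, KorNLI, KorSTS, Question Pair
base = average_score([86.83, 82.26, 69.95, 74.17, 74.48, 94.06], 76.08, 86.82)
large = average_score([86.91, 83.12, 76.10, 76.01, 77.46, 94.33], 77.64, 87.99)

print(f"ko-albert-base-v1:  {base:.2f}")   # table lists 80.46
print(f"ko-albert-large-v1: {large:.2f}")  # table lists 82.39
```

Because the two KorQuAD numbers are folded into one task score, a model with a large EM/F1 gap is not penalized twice in the average.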


	# How to use

	```python
	from transformers import AutoTokenizer, AutoModel

	# Pick one checkpoint; loading both in sequence overwrites `tokenizer` and `model`.
	# Base model (51M)
	model_name = "lots-o/ko-albert-base-v1"
	# Large model (75M)
	# model_name = "lots-o/ko-albert-large-v1"

	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModel.from_pretrained(model_name)
	```
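
Once a checkpoint is loaded, one common way to get a sentence vector is to average the final hidden states over non-padding tokens. The sketch below is illustrative only: mean pooling is an assumption chosen for the example, not the method used for the evaluation results above, and `mean_pool` and `embed` are hypothetical helper names.

```python
import torch
from transformers import AutoModel, AutoTokenizer


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sentence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


def embed(texts, model_name="lots-o/ko-albert-base-v1"):
    """Return one mean-pooled vector per input text.

    Downloads the checkpoint from the Hugging Face Hub on first use.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for inference
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)
    return mean_pool(outputs.last_hidden_state, batch["attention_mask"])
```

Calling `embed(["안녕하세요", "한국어 문장입니다"])` should return a tensor with one row per input sentence. For the benchmark tasks listed above, the model would instead be fine-tuned with a task-specific head, as in the KcBERT-Finetune code.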

	# Acknowledgement
	- The GCP/TPU environment used for training the ALBERT Model was supported by the [TRC](https://sites.research.google/trc/about/) program.

	# Reference
	## Paper
	- [ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942)

	## GitHub Repos
	- [google-albert](https://github.com/google-research/albert)
	- [albert-zh](https://github.com/brightmart/albert_zh)
	- [KcBERT](https://github.com/Beomi/KcBERT)
	- [KcBERT-Finetune](https://github.com/Beomi/KcBERT-finetune)