# CroSloEngual BERT
CroSloEngual BERT is a trilingual model, based on the bert-base architecture, trained on Croatian, Slovenian, and English corpora. By focusing on just three languages, the model performs better than multilingual BERT, while still offering cross-lingual knowledge transfer, which a monolingual model cannot.
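As a minimal usage sketch, the model can be loaded with the Hugging Face `transformers` library and used for masked-token prediction; the model ID `EMBEDDIA/crosloengual-bert` is assumed here, so check the hub for the exact name:

```python
from transformers import pipeline

# Model ID assumed to be EMBEDDIA/crosloengual-bert; verify on the hub.
fill_mask = pipeline("fill-mask", model="EMBEDDIA/crosloengual-bert")

# Fill a masked token in a Slovenian sentence
# ("Ljubljana is the capital of ___.")
for pred in fill_mask("Ljubljana je glavno mesto [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
```

The same checkpoint can be fine-tuned on downstream tasks (e.g. NER or classification) in any of the three languages, which is where the cross-lingual transfer option comes into play.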
The evaluation is presented in our article:
    @InProceedings{ulcar-robnik2020finest,
      author    = "Ulčar, M. and Robnik-Šikonja, M.",
      year      = 2020,
      title     = "{FinEst BERT} and {CroSloEngual BERT}: less is more in multilingual models",
      editor    = "Sojka, P. and Kopeček, I. and Pala, K. and Horák, A.",
      booktitle = "Text, Speech, and Dialogue {TSD 2020}",
      series    = "Lecture Notes in Computer Science",
      volume    = 12284,
      publisher = "Springer",
      url       = "https://doi.org/10.1007/978-3-030-58323-1_11",
    }
The preprint is available at arxiv.org/abs/2006.07890.