vesteinn committed
Commit cf8c0ad (parent: b9a4fd5)

Update README.md

Files changed (1): README.md (+21 −1)
README.md CHANGED
@@ -12,6 +12,26 @@ license: agpl-3.0
 
 # DanskBERT
 
-This is DanskBERT, a Danish language model. Note that you should not prepend the mask with a space when using directly!
+This is DanskBERT, a Danish language model. Note that you should not prepend the mask with a space when using it directly!
 
+The model is the best-performing base-size model on the [ScandEval benchmark for Danish](https://scandeval.github.io/nlu-benchmark/).
 
+DanskBERT was trained on the Danish Gigaword Corpus (Strømberg-Derczynski et al., 2021).
+
+DanskBERT was trained with fairseq using the RoBERTa-base configuration. The model was trained with a batch size of 2k, to convergence over 500k steps, using 16 V100 cards for approximately two weeks.
+
+If you find this model useful, please cite:
+
+```
+@inproceedings{snaebjarnarson-etal-2023-transfer,
+    title = "{T}ransfer to a Low-Resource Language via Close Relatives: The Case Study on Faroese",
+    author = "Snæbjarnarson, Vésteinn and
+      Simonsen, Annika and
+      Glavaš, Goran and
+      Vulić, Ivan",
+    booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)",
+    month = "may 22--24",
+    year = "2023",
+    address = "Tórshavn, Faroe Islands",
+    publisher = {Link{\"o}ping University Electronic Press, Sweden},
+}
+```
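The README's warning about not prepending a space to the mask can be illustrated with a short sketch. This is a hedged example, not part of the commit: it assumes the model is published on the Hugging Face Hub under an ID such as `vesteinn/DanskBERT` and uses the RoBERTa-style `<mask>` token; the helper names are hypothetical.

```python
def make_prompt(prefix: str, mask_token: str = "<mask>") -> str:
    """Build a fill-mask prompt: the mask token is attached directly,
    with no space in front of it, per the README's note."""
    return prefix + mask_token

def top_predictions(prefix: str):
    """Query the model (downloads it on first use); hypothetical model ID."""
    # Imported here so make_prompt works without transformers installed.
    from transformers import pipeline
    fill = pipeline("fill-mask", model="vesteinn/DanskBERT")
    return fill(make_prompt(prefix))

print(make_prompt("København er Danmarks"))  # København er Danmarks<mask>
```

Keeping the heavyweight `pipeline` import inside the helper means the prompt-building logic stays importable and testable without the model being available.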
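The training paragraph in the diff could correspond to a fairseq pretraining command roughly like the one below. This is a sketch under stated assumptions: only "fairseq, RoBERTa-base configuration, batch size 2k, 500k steps, 16 V100s" come from the README; the data path, learning rate, warmup, and the per-GPU batch / `--update-freq` split are illustrative guesses, not taken from the source.

```shell
# Hypothetical sketch of RoBERTa-base masked-LM pretraining with fairseq.
# 16 GPUs x per-GPU batch 16 x update-freq 8 = effective batch ~2k sequences.
fairseq-train /path/to/danish-gigaword-bin \
    --task masked_lm --criterion masked_lm \
    --arch roberta_base \
    --tokens-per-sample 512 --sample-break-mode complete \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 \
    --lr-scheduler polynomial_decay --lr 0.0005 \
    --warmup-updates 10000 --total-num-update 500000 --max-update 500000 \
    --batch-size 16 --update-freq 8 \
    --distributed-world-size 16
```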