UGARIT committed on
Commit d071c0b · 1 Parent(s): 631c674

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -2,4 +2,10 @@
2
  license: cc-by-4.0
3
  ---
4
  # Automatic Translation Alignment of Ancient Greek Texts
5
- GRC-ALIGNMENT model is an XLM-RoBERTa-based model, trained on 12 million monolingual ancient Greek tokens. and 45k parallel sentences mainly in ancient Greek-English, ancient Greek-Latin, and ancient Greek-Georgian.
+ The GRC-ALIGNMENT model is an XLM-RoBERTa-based model trained on 12 million monolingual Ancient Greek tokens with a masked language modeling (MLM) objective. It is then fine-tuned on 45k parallel sentences, mainly Ancient Greek-English, Ancient Greek-Latin, and Ancient Greek-Georgian.
6
+
7
+ ### Multilingual Training Dataset
8
+ | Languages | # Sentences | Source |
9
+ |:---------:|:-----------:|:--------------------------------------------------------------------------------:|
10
+ | GRC-ENG | 32,500 | Perseus Digital Library (Iliad, Odyssey, Xenophon, New Testament) |
11
+ | GRC-LAT | 8,200 | Digital Fragmenta Historicorum Graecorum project (https://www.dfhg-project.org/) |
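
The README text in this diff names a masked language modeling (MLM) objective but does not describe it. As a rough illustration only, the following Python sketch shows the basic idea: some input tokens are hidden and the model is trained to recover them. Everything here is an assumption made for illustration and is not taken from the GRC-ALIGNMENT training code: the `mask_tokens` helper is hypothetical, the 15% masking rate is the common BERT/RoBERTa default rather than a documented setting of this model, and whitespace-split words stand in for the subword tokens a real XLM-RoBERTa tokenizer would produce.

```python
import random

# "<mask>" is the mask token string used by XLM-RoBERTa tokenizers.
MASK_TOKEN = "<mask>"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hypothetical helper: hide a random subset of tokens for MLM training.

    Returns (masked_tokens, labels). labels[i] holds the original token
    where position i was masked, and None elsewhere (no loss is computed
    on unmasked positions). Simplification: real MLM recipes also replace
    some selected tokens with random tokens or leave them unchanged.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Opening words of the Iliad, split on whitespace (assumption: a real
# pipeline would use the model's subword tokenizer instead).
tokens = "μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος".split()
masked, labels = mask_tokens(tokens, mask_prob=0.15, seed=1)
print(masked)
print(labels)
```

In practice this step is usually handled by a library data collator rather than hand-written code; the sketch is only meant to make the objective concrete.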