Update README.md
README.md
CHANGED
@@ -1,3 +1,36 @@
 ---
 license: cc-by-4.0
+language:
+- nso
+- en
+pipeline_tag: text2text-generation
+tags:
+- m2m100
+- translation
+- africanlp
+- african
+- siswati
 ---
+
+# [nso-en] Northern Sotho [Sepedi] to English Translation Model based on M2M100 and The South African Gov-ZA multilingual corpus
+
+Model created from Northern Sotho [Sepedi] to English aligned sentences from [The South African Gov-ZA multilingual corpus](https://github.com/dsfsi/gov-za-multilingual).
+
+The dataset contains cabinet statements from the South African government, maintained by the Government Communication and Information System (GCIS). The data was scraped from the government's website: https://www.gov.za/cabinet-statements
+
+## Authors
+- Vukosi Marivate - [@vukosi](https://twitter.com/vukosi)
+- Matimba Shingange
+- Richard Lastrucci
+- Isheanesu Joseph Dzingirai
+- Jenalea Rajab
+
+## BibTeX entry and citation info
+```
+@article{lastrucci2023preparing,
+title = {Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora},
+author = {Richard Lastrucci and Isheanesu Dzingirai and Jenalea Rajab and Andani Madodonga and Matimba Shingange and Daniel Njini and Vukosi Marivate},
+year = {2023},
+journal = {arXiv preprint arXiv:2303.03750}
+}
+```
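The README describes an M2M100-based Northern Sotho to English model but does not show how to call it. The following is a minimal sketch of one way it might be used with the Hugging Face `transformers` library, not the authors' documented usage. Several things here are assumptions: `MODEL_ID` is a hypothetical placeholder (the README does not state the Hub repository ID), and the language codes `ns` and `en` are the ones used by the base M2M100 tokenizer, which this fine-tuned checkpoint may or may not retain.

```python
from typing import Any, Tuple

# Hypothetical placeholder: the README does not state the Hub repository ID.
MODEL_ID = "your-namespace/your-nso-en-m2m100-checkpoint"

# Assumption: the checkpoint keeps the base M2M100 language codes,
# where "ns" is Northern Sotho and "en" is English.
SRC_LANG = "ns"
TGT_LANG = "en"


def load(model_id: str = MODEL_ID) -> Tuple[Any, Any]:
    """Load the tokenizer and model (transformers is imported lazily)."""
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def translate(text: str, tokenizer: Any, model: Any) -> str:
    """Translate one Northern Sotho sentence into English."""
    # Tell the tokenizer which language the input is written in.
    tokenizer.src_lang = SRC_LANG
    encoded = tokenizer(text, return_tensors="pt")
    # Force the decoder to start with the target-language token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(TGT_LANG),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

To try it, replace `MODEL_ID` with the actual repository ID, then call `tokenizer, model = load()` followed by `translate("...", tokenizer, model)` with a Northern Sotho sentence. Loading is kept separate from translation so the model is downloaded once and reused across calls.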