vukosi committed on
Commit 118534c
1 Parent(s): d8c3f12

Update README.md

  1. README.md +33 -0
README.md CHANGED
@@ -1,3 +1,36 @@
  ---
  license: cc-by-4.0
+ language:
+ - nso
+ - en
+ pipeline_tag: text2text-generation
+ tags:
+ - m2m100
+ - translation
+ - africanlp
+ - african
+ - siswati
  ---
+
+ # [nso-en] Northern Sotho [Sepedi] to English Translation Model based on M2M100 and The South African Gov-ZA multilingual corpus
+
+ Model created from Northern Sotho [Sepedi] to English aligned sentences from [The South African Gov-ZA multilingual corpus](https://github.com/dsfsi/gov-za-multilingual).
+
+ The data set contains cabinet statements from the South African government, maintained by the Government Communication and Information System (GCIS). Data was scraped from the government's website: https://www.gov.za/cabinet-statements
+
+ ## Authors
+ - Vukosi Marivate - [@vukosi](https://twitter.com/vukosi)
+ - Matimba Shingange
+ - Richard Lastrucci
+ - Isheanesu Joseph Dzingirai
+ - Jenalea Rajab
+
+ ## BibTeX entry and citation info
+ ```
+ @article{lastrucci2023preparing,
+ title = {Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora},
+ author = {Richard Lastrucci and Isheanesu Dzingirai and Jenalea Rajab and Andani Madodonga and Matimba Shingange and Daniel Njini and Vukosi Marivate},
+ year = {2023},
+ journal = {arXiv preprint arXiv:2303.03750}
+ }
+ ```