nso-en-m2m100-gov / README.md
metadata
license: cc-by-4.0
language:
  - nso
  - en
pipeline_tag: text2text-generation
tags:
  - m2m100
  - translation
  - africanlp
  - african
  - sepedi
  - northern-sotho

[nso-en] Northern Sotho [Sepedi] to English Translation Model based on M2M100 and The South African Gov-ZA multilingual corpus

This model was created from aligned Northern Sotho [Sepedi] to English sentences from The South African Gov-ZA multilingual corpus.

The data set contains cabinet statements from the South African government, maintained by the Government Communication and Information System (GCIS). Data was scraped from the government's website: https://www.gov.za/cabinet-statements
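
Since the model is based on M2M100, it can presumably be loaded with the standard M2M100 classes from the transformers library. The sketch below is a minimal example under several assumptions that this card does not confirm: the Hub repository id `dsfsi/nso-en-m2m100-gov` (the card only gives the name `nso-en-m2m100-gov`), the M2M100 language codes `ns` (Northern Sotho) and `en` (English) being unchanged by fine-tuning, and the sample sentence, which is only a placeholder input.

```python
# Assumed Hub repo id; the card only names "nso-en-m2m100-gov".
MODEL_ID = "dsfsi/nso-en-m2m100-gov"
# M2M100 language codes, assumed to carry over to this model.
SRC_LANG = "ns"  # Northern Sotho (Sepedi)
TGT_LANG = "en"  # English


def load_model(model_id=MODEL_ID):
    """Download the tokenizer and model from the Hugging Face Hub."""
    # Imported here so the translate() helper can be used without
    # transformers being imported at module load time.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model, max_new_tokens=128):
    """Translate one Northern Sotho sentence into English."""
    # Tell the tokenizer which source-language tag to prepend.
    tokenizer.src_lang = SRC_LANG
    encoded = tokenizer(text, return_tensors="pt")
    # Force the decoder to start with the English language token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


def main():
    tokenizer, model = load_model()
    # Placeholder input sentence, not taken from the training corpus.
    print(translate("Ke a leboga.", tokenizer, model))
```

Calling `main()` downloads the weights on first use, so it needs network access plus the `transformers`, `sentencepiece`, and `torch` packages. If the repository id or language codes differ for this model, adjust the constants at the top.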

Authors

  • Vukosi Marivate - @vukosi
  • Matimba Shingange
  • Richard Lastrucci
  • Isheanesu Joseph Dzingirai
  • Jenalea Rajab

BibTeX entry and citation info

@article{lastrucci2023preparing,
  title   = {Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora},
  author  = {Richard Lastrucci and Isheanesu Dzingirai and Jenalea Rajab and Andani Madodonga and Matimba Shingange and Daniel Njini and Vukosi Marivate},
  year    = {2023},
  journal = {arXiv preprint arXiv:2303.03750}
}

Paper - Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora