VITS Base sw-KE-OpenBible is an end-to-end text-to-speech model based on the VITS architecture. This model was trained from scratch on a real audio dataset. The list of real speakers includes:
The model's vocabulary contains the different IPA phonemes found in gruut.
This model was trained using the VITS framework. All training was done on a Scaleway L40S VM with an NVIDIA L40S GPU. All scripts used for training can be found in the Files and versions tab, along with the training metrics logged via Tensorboard.
| Model | SR (Hz) | Mel range (Hz) | FFT / Hop / Win | #epochs |
|---|---|---|---|---|
| VITS Base sw-KE-OpenBible | 44.1K | 0-null | 2048 / 512 / 2048 | 12000 |
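
To put the table's numbers in more familiar units, the following sketch derives the frame timing from them. It uses only the values listed above and assumes that the `null` upper mel bound falls back to the Nyquist frequency (half the sampling rate), which is how librosa's mel filterbank treats `fmax=None`. The variable names are illustrative and not necessarily those of the training config.

```python
# Values copied from the table above.
sampling_rate = 44_100   # Hz
n_fft = 2048             # FFT size, in samples
hop_length = 512         # hop between consecutive frames, in samples
win_length = 2048        # analysis window, in samples
mel_fmin = 0.0           # Hz
mel_fmax = None          # "null" in the table

# Assumption: an unset upper bound means the Nyquist frequency.
effective_fmax = mel_fmax if mel_fmax is not None else sampling_rate / 2

hop_ms = 1000 * hop_length / sampling_rate
win_ms = 1000 * win_length / sampling_rate
frames_per_second = sampling_rate / hop_length
n_freq_bins = n_fft // 2 + 1  # bins in a one-sided linear spectrum

print(f"hop: {hop_ms:.2f} ms, window: {win_ms:.2f} ms")
print(f"frames per second: {frames_per_second:.2f}")
print(f"linear frequency bins: {n_freq_bins}")
print(f"mel range: {mel_fmin:.0f}-{effective_fmax:.0f} Hz")
```

Under these assumptions, each spectrogram frame advances by roughly 12 ms of audio and covers a window of roughly 46 ms.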
```sh
# Clean the transcripts in the training and validation filelists
python preprocess.py \
  --text_index 1 \
  --filelists filelists/sw-KE-OpenBible_text_train_filelist.txt filelists/sw-KE-OpenBible_text_val_filelist.txt \
  --text_cleaners swahili_cleaners

# Train the model
python train.py -c configs/sw_ke_openbible_base.json -m sw_ke_openbible_base
```
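
To make the role of `--text_index` concrete, here is a minimal sketch of what the preprocessing step does to a filelist, assuming the upstream VITS convention of pipe-delimited lines (`audio_path|text`). The function `clean_filelist_lines`, the `toy_cleaner` stand-in, and the sample entries are hypothetical. The real `swahili_cleaners` is not reproduced here and likely does more than normalize whitespace and case, since the model's vocabulary consists of IPA phonemes.

```python
from typing import Callable, List


def toy_cleaner(text: str) -> str:
    """Hypothetical stand-in for a text cleaner: collapse whitespace, lowercase."""
    return " ".join(text.split()).lower()


def clean_filelist_lines(
    lines: List[str], text_index: int, cleaner: Callable[[str], str]
) -> List[str]:
    """Apply `cleaner` to the column at `text_index` of each pipe-delimited line."""
    cleaned = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        fields[text_index] = cleaner(fields[text_index])
        cleaned.append("|".join(fields))
    return cleaned


# Made-up example entries: column 0 is the audio path, column 1 the transcript.
sample = [
    "wavs/GEN_001_001.wav|Hapo  mwanzo Mungu aliumba mbingu na nchi.",
    "wavs/GEN_001_002.wav|Mungu akasema, Iwe nuru.",
]

for row in clean_filelist_lines(sample, text_index=1, cleaner=toy_cleaner):
    print(row)
```

With `text_index=1`, only the transcript column is rewritten and the audio path in column 0 is left untouched. In the upstream VITS repository the cleaned lines are written to a new file alongside the original filelist; whether this model's scripts follow the same naming is an assumption worth checking against the files in the repository.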