How to prepare the dataset for training
- Download the Ukrainian dataset from https://github.com/egorsmkv/speech-recognition-uk.
- Delete the Common Voice folder from the dataset.
- Download import_ukrainian.py and put it into the DeepSpeech/bin folder.
- Run the import script.
- Download the Common Voice 6.1 Ukrainian dataset.
- Convert it to the DeepSpeech format.
- Merge the train.csv from the dataset and the train.csv from DeepSpeech into one file.
- Put the Common Voice (CV) files into the dataset files folder.
- Put dev.csv and test.csv into the folder.
You have a reproducible dataset!
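The merge step can be sketched in Python roughly as follows. This is a minimal sketch, not the script used for this dataset: it assumes both train.csv files use the usual DeepSpeech columns (wav_filename, wav_filesize, transcript), and the function name merge_train_csvs and the paths in the usage comment are hypothetical. Dropping rows with a repeated wav_filename is also an assumption made here to keep the result stable if a file is merged twice.

```python
import csv

# Column layout used by DeepSpeech training CSVs.
FIELDS = ["wav_filename", "wav_filesize", "transcript"]


def merge_train_csvs(inputs, output):
    """Concatenate DeepSpeech-style CSVs into one file.

    Rows whose wav_filename was already written are skipped.
    Returns the number of rows written (header excluded).
    """
    seen = set()
    written = 0
    with open(output, "w", newline="", encoding="utf-8") as out_f:
        writer = csv.DictWriter(out_f, fieldnames=FIELDS)
        writer.writeheader()
        for path in inputs:
            with open(path, newline="", encoding="utf-8") as in_f:
                for row in csv.DictReader(in_f):
                    name = row["wav_filename"]
                    if name in seen:
                        continue  # same audio file listed twice
                    seen.add(name)
                    writer.writerow({key: row[key] for key in FIELDS})
                    written += 1
    return written


# Hypothetical usage; adjust the paths to your own layout:
# merge_train_csvs(["dataset/train.csv", "cv-uk/train.csv"], "merged/train.csv")
```

The wav_filename values are copied unchanged, so any relative paths in the inputs need to stay valid from wherever the merged file ends up.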