metadata
language:
- ug
license: apache-2.0
datasets:
- THUGY20
metrics:
- cer
model-index:
- name: Whisper Small Fine-tuned with THUYG20 Uyghur Dataset
results:
- task:
name: Automatic Speech Recognition
type: automatic-speech-recognition
dataset:
name: 'THUYG20: A free Uyghur speech database'
type: THUGY20
metrics:
- name: CER
type: cer
value: 7.21
Uyghur Automatic Speech Recognition
A Uyghur ASR model trained with CTC loss on the THUYG20 dataset. It achieves the following result on the evaluation set:
- Best CER: 7.21%
Reference: https://github.com/gheyret/uyghur-asr-ctc
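For readers unfamiliar with the metric, character error rate (CER) is the character-level edit distance between the reference and the hypothesis, divided by the number of reference characters. The sketch below is a minimal pure-Python illustration, not the evaluation script used for this model. The function names `edit_distance` and `cer` are hypothetical, and it assumes corpus-level aggregation with spaces counted as characters and no text normalization, none of which is confirmed by this card.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumption: total edits / total ref chars)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars
```

Under this definition, one substituted character in a four-character reference gives a CER of 25%. Averaging per-utterance CER instead of pooling edits over the corpus can give a different number, so the two should not be compared directly.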
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 20
- eval_batch_size: 20
- seed: 42
- optimizer: Adam with weight_decay=0.000001
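The values above could be collected in a small configuration object and passed to an optimizer roughly as follows. This is a sketch that assumes a PyTorch training loop, which this card does not state. `TrainConfig` and `build_optimizer` are hypothetical names, and `torch` is imported lazily so the configuration itself can be inspected without it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in this model card."""
    learning_rate: float = 1e-4
    train_batch_size: int = 20
    eval_batch_size: int = 20
    seed: int = 42
    weight_decay: float = 1e-6


def build_optimizer(model, cfg: TrainConfig):
    """Create an Adam optimizer for `model` (assumption: a torch.nn.Module)."""
    import torch  # imported here so TrainConfig is usable without PyTorch

    torch.manual_seed(cfg.seed)  # seed the RNG for reproducibility
    return torch.optim.Adam(
        model.parameters(),
        lr=cfg.learning_rate,
        weight_decay=cfg.weight_decay,
    )
```

Note that `torch.optim.Adam` applies `weight_decay` as an L2 penalty added to the gradient. Whether the original training used this form or a decoupled variant is not specified here.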
Training results
- Best CER: 7.21%
- Trained: 473 epochs
- Trainable parameters: 26,389,282
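A trainable-parameter count like the one reported here is usually obtained by summing the element counts of all parameters that require gradients. The helper below is a sketch of that idiom. `count_trainable_parameters` is a hypothetical name, and the function assumes `model` exposes `parameters()` whose items have `numel()` and `requires_grad`, as a PyTorch `torch.nn.Module` does.

```python
def count_trainable_parameters(model) -> int:
    """Sum the element counts of all parameters that receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```

Frozen parameters (those with `requires_grad` set to `False`) are excluded, so the result can be smaller than the model's total parameter count.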