
distil-whisper-large-v3-tr

Model Description

distil-whisper-large-v3-tr is a distilled version of the Whisper model, fine-tuned for Turkish speech recognition. It was trained and evaluated on the Common Voice 17.0 Turkish pseudo-labelled dataset (see Run History below), reaching a word error rate of roughly 14.4% on the evaluation split.
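
A minimal usage sketch with the Hugging Face transformers automatic-speech-recognition pipeline (the audio path, chunk length, and device selection are illustrative assumptions, not part of this card):

import torch
from transformers import pipeline

# Load the model through the standard ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="Sercan/distil-whisper-large-v3-tr",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)

# Transcribe a local audio file (path is hypothetical); long inputs are chunked.
result = asr("ornek_kayit.wav", chunk_length_s=30)
print(result["text"])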

Training and Evaluation Metrics

Training and evaluation were tracked with Weights & Biases (wandb); the final logged results are listed below.

Evaluation Metrics

  • Cross-Entropy Loss (eval/ce_loss): 0.53218
  • Epoch (eval/epoch): 28
  • KL Loss (eval/kl_loss): 0.34883
  • Total Loss (eval/loss): 0.77457
  • Evaluation Time (eval/time): 397.1784 seconds
  • Word Error Rate (eval/wer): 14.43288%
  • Orthographic Word Error Rate (eval/wer_ortho): 21.55298%
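
For context, word error rate can be computed from reference and predicted transcripts with, for example, the jiwer library (a minimal sketch with made-up Turkish strings; the exact text normalization behind the orthographic vs. normalized WER figures above is not specified in this card):

import jiwer

reference = "merhaba dünya bugün hava çok güzel"
hypothesis = "merhaba dünya bugün hava güzel"

# WER = (substitutions + deletions + insertions) / words in the reference.
print(jiwer.wer(reference, hypothesis))  # 1 deletion over 6 words -> ~0.1667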

Training Metrics

  • Cross-Entropy Loss (train/ce_loss): 0.04695
  • Epoch (train/epoch): 28
  • KL Loss (train/kl_loss): 0.24143
  • Learning Rate (train/learning_rate): 0.0001
  • Total Loss (train/loss): 0.27899
  • Training Time (train/time): 12426.92106 seconds
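
The reported totals are consistent with a distillation objective that combines the two terms as a weighted sum, apparently total = 0.8 · CE + 1.0 · KL. This weighting is inferred from the logged values, not stated in the card; a quick check:

# Inferred weighting (assumption): total = 0.8 * ce + 1.0 * kl
eval_total = 0.8 * 0.53218 + 0.34883   # = 0.774574, matches eval/loss 0.77457
train_total = 0.8 * 0.04695 + 0.24143  # = 0.278990, matches train/loss 0.27899
print(eval_total, train_total)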

Run History

Overall Metrics

  • Real-Time Factor (all/rtf): 392.23396
  • Word Error Rate (all/wer): 14.33829%

Common Voice 17.0 Turkish Pseudo-Labelled Dataset

  • Real-Time Factor (common_voice_17_0_tr_pseudo_labelled/test/rtf): 392.23396
  • Word Error Rate (common_voice_17_0_tr_pseudo_labelled/test/wer): 14.33829%
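
A note on the real-time factor: the classic definition (processing time divided by audio duration) would be well below 1 for a fast model, so a value of about 392 most likely reports the inverse, i.e. seconds of audio transcribed per second of compute. That reading is an assumption based on the magnitude, not something the card states. A sketch of both conventions with made-up timings:

# Hypothetical timings to illustrate the two RTF conventions.
audio_seconds = 3600.0       # one hour of audio (made-up)
processing_seconds = 9.18    # decoding time (made-up)

rtf_classic = processing_seconds / audio_seconds  # < 1 means faster than real time
rtf_inverse = audio_seconds / processing_seconds  # higher is faster; ~392 here
print(rtf_classic, rtf_inverse)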

Author

Sercan Çepni
Email: turkelf@gmail.com


For any questions or further information, please feel free to contact the author.

Model Details

Model size: 756M parameters (F32 tensors, stored in Safetensors format)
