whisper-md-el-intlv-xs

This model is a fine-tuned version of openai/whisper-medium on the interleaved mozilla-foundation/common_voice_11_0 (el) and google/fleurs (el_gr) datasets. It achieves the following results on the mozilla-foundation/common_voice_11_0 (el) test split:

  • Loss: 0.4168
  • WER: 11.3670

Model description

This model was trained on the two datasets interleaved, both in Greek. Evaluation used only the common_voice_11_0 (el) test split.
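
The exact data-mixing script is not included in this card; a minimal sketch of interleaving the two corpora with `datasets.interleave_datasets`, assuming the standard column names of each dataset (`sentence` for Common Voice, `transcription` for FLEURS), could look like this:

```python
from datasets import Audio, interleave_datasets, load_dataset

# Load the Greek training splits of both corpora
# (common_voice_11_0 is gated and may require use_auth_token=True).
cv = load_dataset("mozilla-foundation/common_voice_11_0", "el", split="train")
fleurs = load_dataset("google/fleurs", "el_gr", split="train")

# Align column names and keep only what a Whisper fine-tuning loop needs.
fleurs = fleurs.rename_column("transcription", "sentence")
cv = cv.remove_columns([c for c in cv.column_names if c not in ("audio", "sentence")])
fleurs = fleurs.remove_columns([c for c in fleurs.column_names if c not in ("audio", "sentence")])

# Whisper expects 16 kHz audio; resample both sources on the fly.
cv = cv.cast_column("audio", Audio(sampling_rate=16000))
fleurs = fleurs.cast_column("audio", Audio(sampling_rate=16000))

# Alternate examples from the two sources until the shorter one is exhausted.
train_ds = interleave_datasets([cv, fleurs], stopping_strategy="first_exhausted")
```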

Intended uses & limitations

The model is intended for automatic speech recognition (transcription) in Greek.
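
A minimal inference sketch using the transformers ASR pipeline (the audio file name is a placeholder) might look like this:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint through the ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="farsipal/whisper-md-el-intlv-xs",
    chunk_length_s=30,  # split long recordings into 30 s windows
)

# Force Greek transcription (rather than translation into English).
asr.model.config.forced_decoder_ids = asr.tokenizer.get_decoder_prompt_ids(
    language="greek", task="transcribe"
)

# "sample_greek_audio.wav" is a placeholder path.
print(asr("sample_greek_audio.wav")["text"])
```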

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 10000
  • mixed_precision_training: Native AMP
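
A hedged sketch of how these values map onto `transformers.Seq2SeqTrainingArguments` (the output directory, save/eval cadence, and generation setting are illustrative assumptions, not taken from this card):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-md-el-intlv-xs",  # assumed output directory
    learning_rate=8e-6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=10000,
    fp16=True,  # "Native AMP" mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
    evaluation_strategy="steps",
    eval_steps=1000,   # assumed; matches the 1000-step cadence in the results table
    save_steps=1000,   # assumed
    predict_with_generate=True,
)
```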

Training results

| Training Loss | Epoch | Step  | Validation Loss | WER     |
|---------------|-------|-------|-----------------|---------|
| 0.0251        | 2.49  | 1000  | 0.2216          | 12.5836 |
| 0.0051        | 4.98  | 2000  | 0.2874          | 12.2957 |
| 0.0015        | 7.46  | 3000  | 0.3281          | 11.9056 |
| 0.0017        | 9.95  | 4000  | 0.3178          | 12.5929 |
| 0.0008        | 12.44 | 5000  | 0.3449          | 11.9799 |
| 0.0001        | 14.93 | 6000  | 0.3638          | 11.7106 |
| 0.0001        | 17.41 | 7000  | 0.3910          | 11.4970 |
| 0.0           | 19.9  | 8000  | 0.4042          | 11.3949 |
| 0.0           | 22.39 | 9000  | 0.4129          | 11.4134 |
| 0.0           | 24.88 | 10000 | 0.4168          | 11.3670 |
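
The WER values above are percentages. A minimal sketch of computing WER with the `evaluate` library (the prediction and reference lists are placeholders, not real model output):

```python
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder strings; in practice, predictions come from model generation on the
# common_voice_11_0 (el) test split and references are its transcriptions.
predictions = ["καλημέρα σας", "ευχαριστώ πολύ"]
references = ["καλημέρα σε όλους", "ευχαριστώ πολύ"]

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```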

Framework versions

  • Transformers 4.26.0.dev0
  • Pytorch 1.13.0+cu117
  • Datasets 2.7.1.dev0
  • Tokenizers 0.13.2