Whisper Medium Mnong

This model is a fine-tuned version of openai/whisper-medium on the MnongAudio-v2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0471
  • WER: 7.2593
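
For context, word error rate (WER) counts substitutions S, deletions D, and insertions I in the hypotheses against reference transcripts totalling N words; the values reported here follow the common convention of scaling the fraction to a percentage:

\mathrm{WER} = \frac{S + D + I}{N} \times 100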

Model description

This checkpoint adapts openai/whisper-medium, a 764M-parameter encoder-decoder speech recognition model, to Mnong (a South Bahnaric language spoken in Vietnam and Cambodia) by fine-tuning on the MnongAudio-v2 dataset. No further architectural details are provided.

Intended uses & limitations

The model is intended for automatic speech recognition of Mnong speech. Its behavior on other languages, noisy audio, or domains outside MnongAudio-v2 has not been documented; a minimal usage sketch follows.
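
A minimal transcription sketch using the transformers pipeline API. The repository id legendary2910/Mnong-ASR-v1-enhanced is taken from the model page; audio.wav is a placeholder path:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; chunking handles audio longer than
# Whisper's 30-second input window.
asr = pipeline(
    "automatic-speech-recognition",
    model="legendary2910/Mnong-ASR-v1-enhanced",
    chunk_length_s=30,
)

# "audio.wav" is a placeholder; the pipeline resamples input audio
# to the 16 kHz rate Whisper expects.
result = asr("audio.wav")
print(result["text"])
```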

Training and evaluation data

The model was fine-tuned and evaluated on the MnongAudio-v2 dataset. Details about the dataset's size, splits, and collection process are not provided.
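
If the dataset is hosted on the Hugging Face Hub, it could be loaded with the datasets library. The hub id below is hypothetical, since the card only gives the name MnongAudio-v2:

```python
from datasets import load_dataset, Audio

# Hypothetical hub id -- the card only names the dataset "MnongAudio-v2";
# substitute the actual repository path.
ds = load_dataset("legendary2910/MnongAudio-v2")

# Whisper's feature extractor expects 16 kHz audio.
ds = ds.cast_column("audio", Audio(sampling_rate=16_000))
```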

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
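
A minimal sketch of how these values map onto transformers' Seq2SeqTrainingArguments. The output_dir, the fp16 flag (the usual way to enable native AMP), and the eval settings are assumptions; everything else mirrors the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-medium-mnong",  # assumed path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",          # Adam betas/epsilon above are the defaults
    warmup_steps=500,
    max_steps=4000,
    fp16=True,                           # native AMP mixed-precision training
    evaluation_strategy="steps",
    eval_steps=200,                      # matches the 200-step cadence in the results table
    predict_with_generate=True,          # WER must be computed on generated text
)
```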

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER      |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 2.3012        | 0.1421 | 200  | 2.0190          | 142.3332 |
| 1.4133        | 0.2843 | 400  | 1.3463          | 99.4906  |
| 0.9797        | 0.4264 | 600  | 0.9503          | 80.8456  |
| 0.7402        | 0.5686 | 800  | 0.6821          | 62.0479  |
| 0.4908        | 0.7107 | 1000 | 0.4992          | 47.8349  |
| 0.3865        | 0.8529 | 1200 | 0.4090          | 42.6133  |
| 0.3031        | 0.9950 | 1400 | 0.3108          | 34.7937  |
| 0.203         | 1.1372 | 1600 | 0.2632          | 39.2511  |
| 0.1846        | 1.2793 | 1800 | 0.2209          | 28.3749  |
| 0.1313        | 1.4215 | 2000 | 0.1776          | 18.2119  |
| 0.0984        | 1.5636 | 2200 | 0.1525          | 18.8487  |
| 0.1009        | 1.7058 | 2400 | 0.1276          | 14.8242  |
| 0.0803        | 1.8479 | 2600 | 0.1034          | 12.1498  |
| 0.061         | 1.9900 | 2800 | 0.0910          | 11.7422  |
| 0.0327        | 2.1322 | 3000 | 0.0808          | 12.3535  |
| 0.026         | 2.2743 | 3200 | 0.0716          | 9.2970   |
| 0.024         | 2.4165 | 3400 | 0.0612          | 10.2649  |
| 0.0282        | 2.5586 | 3600 | 0.0552          | 8.0234   |
| 0.016         | 2.7008 | 3800 | 0.0488          | 7.8961   |
| 0.0237        | 2.8429 | 4000 | 0.0471          | 7.2593   |
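
For completeness, a sketch of how the WER column is typically computed during such a run, using the evaluate library (which delegates to jiwer); the predictions and references lists here are placeholders:

```python
import evaluate

wer_metric = evaluate.load("wer")  # requires the jiwer package

# Placeholder decoded model outputs and reference transcripts.
predictions = ["..."]
references = ["..."]

# evaluate returns WER as a fraction; scale by 100 to match the table.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```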

Framework versions

  • Transformers 4.42.4
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1