Whisper Base Mnong

This model is a fine-tuned version of openai/whisper-base on the MnongAudio-v2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5864
  • Wer: 73.4845
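
The Wer figure is the word error rate. As a rough illustration of how such a number arises, the sketch below computes it as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, scaled to a percentage. It assumes the values on this card are percentages, which the card does not state explicitly. The function name and the example sentences are invented placeholders, not Mnong data and not the evaluation script used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: 100 * (S + D + I) / len(reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Placeholder sentences, purely illustrative.
exact = word_error_rate("a b c d", "a b c d")
partial = word_error_rate("a b c d", "a x c")   # one substitution, one deletion
runaway = word_error_rate("a b", "a b c d e")   # three insertions
print(exact, partial, runaway)
```

Because insertions count as errors, a hypothesis that is much longer than the reference pushes the value above 100, which is consistent with the early checkpoints in the training results.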

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
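
As a minimal sketch of how these settings might be reproduced, the block below restates them as keyword arguments for `Seq2SeqTrainingArguments` and computes the learning rate that a linear schedule with warmup would give at a chosen step. Several points are assumptions rather than facts from the card: the mapping of `train_batch_size` and `eval_batch_size` to per-device sizes, the reading of "Native AMP" as `fp16=True` (it could equally be bf16), and the helper names `linear_lr` and `build_training_arguments`. The optimizer is left at the library default.

```python
# Hyperparameters from the card, keyed by the Seq2SeqTrainingArguments
# argument names they most plausibly correspond to (mapping is an assumption).
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "max_steps": 4000,
    "fp16": True,  # assumed meaning of "Native AMP"
}


def linear_lr(step: int,
              base_lr: float = TRAINING_KWARGS["learning_rate"],
              warmup_steps: int = TRAINING_KWARGS["warmup_steps"],
              total_steps: int = TRAINING_KWARGS["max_steps"]) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_arguments(output_dir: str):
    """Create the arguments object; imports transformers only when called."""
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


for step in (0, 250, 500, 2250, 4000):
    print(step, linear_lr(step))
```

Under this schedule the rate climbs from zero to 1e-05 over the first 500 steps and then falls linearly, reaching zero at step 4000.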

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 2.7525 | 0.1421 | 200 | 2.6537 | 416.5818 |
| 2.2459 | 0.2843 | 400 | 2.2237 | 158.5838 |
| 1.8682 | 0.4264 | 600 | 1.8896 | 237.7483 |
| 1.7212 | 0.5686 | 800 | 1.6295 | 110.0866 |
| 1.4164 | 0.7107 | 1000 | 1.4443 | 108.9913 |
| 1.2698 | 0.8529 | 1200 | 1.3000 | 91.3653 |
| 1.1479 | 0.9950 | 1400 | 1.1657 | 102.3688 |
| 1.0034 | 1.1372 | 1600 | 1.0799 | 84.6918 |
| 0.945 | 1.2793 | 1800 | 0.9844 | 85.7106 |
| 0.8249 | 1.4215 | 2000 | 0.8974 | 87.1880 |
| 0.726 | 1.5636 | 2200 | 0.8412 | 92.9699 |
| 0.7561 | 1.7058 | 2400 | 0.7859 | 80.8202 |
| 0.6884 | 1.8479 | 2600 | 0.7328 | 85.3031 |
| 0.6329 | 1.9900 | 2800 | 0.6872 | 80.9985 |
| 0.5129 | 2.1322 | 3000 | 0.6672 | 76.4901 |
| 0.5361 | 2.2743 | 3200 | 0.6369 | 78.2985 |
| 0.482 | 2.4165 | 3400 | 0.6178 | 75.9042 |
| 0.5211 | 2.5586 | 3600 | 0.6030 | 79.3938 |
| 0.4749 | 2.7008 | 3800 | 0.5905 | 76.0316 |
| 0.4648 | 2.8429 | 4000 | 0.5864 | 73.4845 |
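
The Epoch and Step columns together imply how many optimizer steps make up one epoch, and from that a rough size for the training set. The sketch below works this out from a few rows of the results. It is an estimate only: it assumes a single device and no gradient accumulation (neither is stated on the card), and the final batch of an epoch may be partial, so the example count is an upper bound. The names `CHECKPOINTS` and `steps_per_epoch` are illustrative.

```python
# (step, epoch) pairs copied from the training results.
CHECKPOINTS = [(200, 0.1421), (1400, 0.9950), (2800, 1.9900), (4000, 2.8429)]
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameters


def steps_per_epoch(checkpoints):
    """Estimate steps per epoch from the last row, where rounding error is smallest."""
    step, epoch = checkpoints[-1]
    return round(step / epoch)


spe = steps_per_epoch(CHECKPOINTS)

# Upper-bound estimate: every step is assumed to consume a full batch.
approx_train_examples = spe * TRAIN_BATCH_SIZE

print(f"steps per epoch ~ {spe}")
print(f"training examples <= ~{approx_train_examples}")
```

Under those assumptions, 4000 steps correspond to just under three passes over roughly 22,500 training examples.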

Framework versions

  • Transformers 4.43.4
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1