Whisper Base Mnong

This model is a fine-tuned version of openai/whisper-base on the MnongAudio-v2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7611
  • WER: 77.7127 (word error rate, in percent)

Model description

More information needed

Intended uses & limitations

More information needed
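
The intended uses have not been documented yet. As a starting point, here is a minimal inference sketch, assuming the checkpoint is published as legendary2910/Mnong-ASR-v3 (the repository this card belongs to) and using the transformers pipeline API; the audio path is a placeholder:

```python
from transformers import pipeline

# Load the fine-tuned Whisper checkpoint for speech recognition.
# "legendary2910/Mnong-ASR-v3" is this card's repository ID.
asr = pipeline(
    "automatic-speech-recognition",
    model="legendary2910/Mnong-ASR-v3",
)

# "sample.wav" is a hypothetical path to a Mnong speech recording;
# the pipeline resamples the audio to Whisper's expected 16 kHz input.
result = asr("sample.wav")
print(result["text"])
```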

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
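
These settings map roughly onto transformers' Seq2SeqTrainingArguments as sketched below. This is not the authors' actual training script; output_dir, the evaluation cadence, and predict_with_generate are assumptions (the 200-step cadence matches the results table below):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-base-mnong",  # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    # The default AdamW optimizer already uses betas=(0.9, 0.999)
    # and epsilon=1e-8, matching the reported optimizer settings.
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=4000,
    fp16=True,  # Native AMP mixed-precision training
    eval_strategy="steps",       # assumed; matches the 200-step eval logs
    eval_steps=200,              # assumed from the results table
    predict_with_generate=True,  # assumed; needed to compute WER
)
```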

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER (%)  |
|---------------|--------|------|-----------------|----------|
| 2.7389        | 0.2915 | 200  | 2.6665          | 373.6628 |
| 2.2233        | 0.5831 | 400  | 2.2426          | 189.4549 |
| 1.8164        | 0.8746 | 600  | 1.8990          | 131.6353 |
| 1.5731        | 1.1662 | 800  | 1.6678          | 124.3760 |
| 1.4459        | 1.4577 | 1000 | 1.4828          | 95.8227  |
| 1.3009        | 1.7493 | 1200 | 1.3453          | 96.9689  |
| 1.0242        | 2.0408 | 1400 | 1.2264          | 89.9898  |
| 0.9227        | 2.3324 | 1600 | 1.1492          | 80.0815  |
| 0.9111        | 2.6239 | 1800 | 1.0539          | 83.2399  |
| 0.8831        | 2.9155 | 2000 | 0.9899          | 88.1814  |
| 0.5906        | 3.2070 | 2200 | 0.9452          | 84.5899  |
| 0.5400        | 3.4985 | 2400 | 0.9017          | 79.6740  |
| 0.5420        | 3.7901 | 2600 | 0.8713          | 72.2364  |
| 0.4606        | 4.0816 | 2800 | 0.8320          | 72.9241  |
| 0.4879        | 4.3732 | 3000 | 0.8172          | 75.4712  |
| 0.4033        | 4.6647 | 3200 | 0.7940          | 75.9552  |
| 0.4235        | 4.9563 | 3400 | 0.7737          | 73.2552  |
| 0.3638        | 5.2478 | 3600 | 0.7704          | 79.2155  |
| 0.3830        | 5.5394 | 3800 | 0.7641          | 77.7382  |
| 0.3714        | 5.8309 | 4000 | 0.7611          | 77.7127  |
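
WER values are percentages; values above 100 in the early epochs are possible because WER also counts insertions, so hypotheses much longer than the references can push the error rate past 100%. For reference, a minimal sketch of computing WER on the same percentage scale with the evaluate library (the example strings are placeholders):

```python
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder predictions/references; in practice these come from
# decoded model.generate(...) outputs and the evaluation transcripts.
predictions = ["hello world"]
references = ["hello there world"]

# evaluate returns a fraction; multiply by 100 to match the
# percentage scale used in the table above.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```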

Framework versions

  • Transformers 4.42.4
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1