Mnong-ASR-v3 / README.md
metadata
base_model: openai/whisper-base
language:
  - vi
license: apache-2.0
metrics:
  - wer
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
model-index:
  - name: Whisper Base Mnong
    results: []

Whisper Base Mnong

This model is a fine-tuned version of openai/whisper-base on the MnongAudio-v2 dataset. At the final training step (4000), it achieves the following results on the evaluation set:

  • Loss: 0.7611
  • WER: 77.7127
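For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) is usually computed: the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. It assumes the values in this card are percentages and uses plain whitespace tokenization; the card does not state which text normalization the actual evaluation applied, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100.

    Assumption: words are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Because inserted words count as errors while the denominator is the reference length, a hypothesis much longer than its reference yields a WER above 100, which is consistent with the early-training values reported in the training results.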

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
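To make the schedule concrete, here is a small sketch of the learning rate implied by the values above, assuming the `linear` scheduler has the usual shape of a linear warmup from 0 to the peak rate followed by a linear decay to 0 at the final step. The function name `linear_warmup_lr` is hypothetical and the code is an illustration, not the training script used for this model.

```python
def linear_warmup_lr(
    step: int,
    peak_lr: float = 1e-5,     # learning_rate
    warmup_steps: int = 500,   # lr_scheduler_warmup_steps
    total_steps: int = 4000,   # training_steps
) -> float:
    """Learning rate at a given optimizer step (assumed warmup + linear decay)."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 250, 500, 2250, 4000):
        print(s, linear_warmup_lr(s))
```

Under this assumption the rate peaks at step 500 and is back to half of its peak value at step 2250, midway through the decay phase.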

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER      |
|---------------|--------|------|-----------------|----------|
| 2.7389        | 0.2915 | 200  | 2.6665          | 373.6628 |
| 2.2233        | 0.5831 | 400  | 2.2426          | 189.4549 |
| 1.8164        | 0.8746 | 600  | 1.8990          | 131.6353 |
| 1.5731        | 1.1662 | 800  | 1.6678          | 124.3760 |
| 1.4459        | 1.4577 | 1000 | 1.4828          | 95.8227  |
| 1.3009        | 1.7493 | 1200 | 1.3453          | 96.9689  |
| 1.0242        | 2.0408 | 1400 | 1.2264          | 89.9898  |
| 0.9227        | 2.3324 | 1600 | 1.1492          | 80.0815  |
| 0.9111        | 2.6239 | 1800 | 1.0539          | 83.2399  |
| 0.8831        | 2.9155 | 2000 | 0.9899          | 88.1814  |
| 0.5906        | 3.2070 | 2200 | 0.9452          | 84.5899  |
| 0.54          | 3.4985 | 2400 | 0.9017          | 79.6740  |
| 0.542         | 3.7901 | 2600 | 0.8713          | 72.2364  |
| 0.4606        | 4.0816 | 2800 | 0.8320          | 72.9241  |
| 0.4879        | 4.3732 | 3000 | 0.8172          | 75.4712  |
| 0.4033        | 4.6647 | 3200 | 0.7940          | 75.9552  |
| 0.4235        | 4.9563 | 3400 | 0.7737          | 73.2552  |
| 0.3638        | 5.2478 | 3600 | 0.7704          | 79.2155  |
| 0.383         | 5.5394 | 3800 | 0.7641          | 77.7382  |
| 0.3714        | 5.8309 | 4000 | 0.7611          | 77.7127  |
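The evaluation log can also be read programmatically. The sketch below copies the step, epoch, validation loss, and WER columns from the training results and picks out the checkpoint with the lowest WER, which is not the final one. It also estimates the training set size from the epoch/step ratio; that estimate assumes one optimizer step consumes exactly `train_batch_size` = 16 samples (a single device and no gradient accumulation), which the card does not state explicitly. All names in the code are hypothetical.

```python
from collections import namedtuple

EvalRow = namedtuple("EvalRow", "step epoch val_loss wer")

# Values copied from the training results (training loss column omitted).
ROWS = [
    EvalRow(200, 0.2915, 2.6665, 373.6628),
    EvalRow(400, 0.5831, 2.2426, 189.4549),
    EvalRow(600, 0.8746, 1.8990, 131.6353),
    EvalRow(800, 1.1662, 1.6678, 124.3760),
    EvalRow(1000, 1.4577, 1.4828, 95.8227),
    EvalRow(1200, 1.7493, 1.3453, 96.9689),
    EvalRow(1400, 2.0408, 1.2264, 89.9898),
    EvalRow(1600, 2.3324, 1.1492, 80.0815),
    EvalRow(1800, 2.6239, 1.0539, 83.2399),
    EvalRow(2000, 2.9155, 0.9899, 88.1814),
    EvalRow(2200, 3.2070, 0.9452, 84.5899),
    EvalRow(2400, 3.4985, 0.9017, 79.6740),
    EvalRow(2600, 3.7901, 0.8713, 72.2364),
    EvalRow(2800, 4.0816, 0.8320, 72.9241),
    EvalRow(3000, 4.3732, 0.8172, 75.4712),
    EvalRow(3200, 4.6647, 0.7940, 75.9552),
    EvalRow(3400, 4.9563, 0.7737, 73.2552),
    EvalRow(3600, 5.2478, 0.7704, 79.2155),
    EvalRow(3800, 5.5394, 0.7641, 77.7382),
    EvalRow(4000, 5.8309, 0.7611, 77.7127),
]

TRAIN_BATCH_SIZE = 16  # from the hyperparameter list

best_by_wer = min(ROWS, key=lambda r: r.wer)
best_by_loss = min(ROWS, key=lambda r: r.val_loss)
final = ROWS[-1]

# Optimizer steps per epoch, inferred from the last logged row.
steps_per_epoch = final.step / final.epoch
# Rough training set size under the single-device, no-accumulation assumption.
approx_train_samples = round(steps_per_epoch) * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("lowest WER:", best_by_wer)
    print("lowest validation loss:", best_by_loss)
    print("steps per epoch (approx.):", round(steps_per_epoch))
    print("training samples (approx.):", approx_train_samples)
```

In this log the validation loss keeps falling through step 4000 while WER reaches its minimum at step 2600 and fluctuates afterwards, so a checkpoint chosen by WER would differ from one chosen by loss.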

Framework versions

  • Transformers 4.42.4
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1