---
language:
  - smj
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: salmon-whisper-large-smj-lr7e-5
    results: []
---

# salmon-whisper-large-smj-lr7e-5

This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on the NbAiLab/salmon-asr-smj dataset.
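
The snippet below is a minimal usage sketch with the Transformers ASR pipeline. The Hub repository id and the audio path are assumptions (the card only states the model name); adjust both to your setup.

```python
# Minimal usage sketch: transcribe a local audio file with the Transformers
# ASR pipeline. "NbAiLab/salmon-whisper-large-smj-lr7e-5" is an assumed Hub
# repository id and "audio.wav" a placeholder path; adjust both as needed.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/salmon-whisper-large-smj-lr7e-5",  # assumed repo id
)

result = asr("audio.wav", chunk_length_s=30)  # chunk long recordings into 30 s windows
print(result["text"])
```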

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 7e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 6
- total_train_batch_size_per_node: 48
- total_train_batch_size: 48
- total_optimization_steps: 100,000
- starting_optimization_step: None
- finishing_optimization_step: 100,000
- num_train_dataset_workers: 32
- num_hosts: 1
- total_num_training_examples: 4,800,000
- steps_per_epoch: 385
- num_beams: None
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.98
- adam_epsilon: 1e-06
- dropout: True
- bpe_dropout_probability: 0.2
- activation_dropout_probability: 0.1
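
As a rough illustration only, the optimizer- and schedule-related values above would map onto something like the following `Seq2SeqTrainingArguments`; the original run was not necessarily produced with this exact API, so treat this as an approximation rather than the training script.

```python
# Approximate mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# Illustration only, not the configuration this model was actually trained with.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="salmon-whisper-large-smj-lr7e-5",
    learning_rate=7e-5,
    lr_scheduler_type="linear",
    per_device_train_batch_size=6,
    max_steps=100_000,
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
)
```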

### Training results

| step  | validation_loss | train_loss | validation_wer | validation_cer | validation_exact_wer | validation_exact_cer |
|------:|----------------:|-----------:|---------------:|---------------:|---------------------:|---------------------:|
| 0     | 4.2254          | 4.7016     | 112.7660       | 59.8700        | 108.1117             | 62.0594              |
| 10000 | 0.8352          | 0.3801     | 18.0851        | 5.5037         | 22.2074              | 6.0546               |
| 20000 | 1.1701          | 0.2580     | 15.2926        | 5.0569         | 17.8191              | 5.4770               |
| 30000 | 1.0861          | 0.2306     | 14.4947        | 4.6304         | 17.2872              | 4.9393               |
| 40000 | 1.0137          | 0.2168     | 14.2287        | 4.3461         | 16.7553              | 4.6206               |
| 50000 | 1.1032          | 0.2313     | 13.6968        | 3.8993         | 16.0904              | 4.1625               |
| 60000 | 1.2158          | 0.2121     | 13.6968        | 4.2648         | 16.0904              | 4.5808               |
| 70000 | 1.1595          | 0.2567     | 12.3670        | 3.7977         | 14.7606              | 4.1028               |
| 80000 | 1.1889          | 0.2420     | 12.5           | 4.0008         | 14.7606              | 4.2621               |
| 90000 | 1.1761          | 0.2039     | 12.6330        | 4.0211         | 14.8936              | 4.3218               |
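
The WER/CER columns appear to be percentages. A minimal sketch of how such scores can be computed for a set of transcriptions, using the `evaluate` library (an assumption; it is not among the framework versions listed below):

```python
# Minimal sketch: word and character error rates as percentages, computed with
# the evaluate library. The predictions/references lists are placeholders.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["example transcription"]  # model outputs (placeholder)
references = ["example reference"]       # ground-truth transcriptions (placeholder)

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
cer = 100 * cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```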

### Framework versions

- Transformers 4.35.0
- Datasets 2.14.6
- Tokenizers 0.14.1