
./4585

This model is a fine-tuned version of openai/whisper-large-v3 (1.61B parameters, stored as FP16 safetensors) on the 4585 FULL-2024-09-26 dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.4883
  • Wer Ortho: 27.5525
  • Wer: 19.6598
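A minimal inference sketch using the Transformers ASR pipeline, assuming the checkpoint is loaded from this repository; the audio path is a placeholder:

```python
# Sketch: transcribe one audio file with this checkpoint.
# "sample.wav" is a placeholder, not a file shipped with this card.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Makkoen/whisper-large-v3-cit-do01-wd0-lr3e-06-FULL4c",
    torch_dtype="auto",  # picks up the FP16 weights on GPU
    device=0,            # set to -1 (or omit) to run on CPU
)
print(asr("sample.wav")["text"])
```

For audio longer than Whisper's 30-second window, pass chunk_length_s=30 to the pipeline so it transcribes in chunks.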

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-06
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 1200
  • mixed_precision_training: Native AMP
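
A hedged reconstruction of these settings as Transformers Seq2SeqTrainingArguments; output_dir is taken from the card title, and flags not listed above (evaluation, saving, and logging strategy) are omitted rather than guessed:

```python
# Sketch of the training configuration implied by the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./4585",            # from the card title
    learning_rate=3e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # 4 per device x 4 steps = 16 total
    lr_scheduler_type="linear",
    warmup_steps=300,
    max_steps=1200,
    fp16=True,                      # "Native AMP" mixed precision
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```

The multi-GPU distribution noted above is handled by the launcher (e.g. torchrun or accelerate), not by these arguments.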

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer Ortho | Wer     |
|:-------------:|:------:|:----:|:---------------:|:---------:|:-------:|
| 0.7769        | 0.7752 | 200  | 0.5669          | 31.2018   | 22.9917 |
| 0.5359        | 1.5504 | 400  | 0.5151          | 29.0481   | 20.9467 |
| 0.4524        | 2.3256 | 600  | 0.4949          | 28.1973   | 20.0166 |
| 0.3889        | 3.1008 | 800  | 0.4895          | 27.6788   | 19.6471 |
| 0.3431        | 3.8760 | 1000 | 0.4841          | 27.4063   | 19.4368 |
| 0.3196        | 4.6512 | 1200 | 0.4883          | 27.5525   | 19.6598 |
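
The WER columns are percentages (word error rate × 100). A sketch of how "Wer Ortho" and "Wer" are typically computed for Whisper fine-tunes, assuming the evaluate library and Whisper's BasicTextNormalizer; the exact normalization used for this card is not stated:

```python
# Sketch: orthographic WER vs. WER after basic text normalization.
import evaluate
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

wer_metric = evaluate.load("wer")
normalizer = BasicTextNormalizer()

references = ["Hello, world!"]  # toy example, not card data
predictions = ["hello world"]

wer_ortho = 100 * wer_metric.compute(
    references=references, predictions=predictions
)
wer = 100 * wer_metric.compute(
    references=[normalizer(r) for r in references],
    predictions=[normalizer(p) for p in predictions],
)
print(f"WER Ortho: {wer_ortho:.4f}  WER: {wer:.4f}")
```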

Framework versions

  • Transformers 4.45.1
  • Pytorch 1.13.1+cu117
  • Datasets 3.0.1
  • Tokenizers 0.20.0
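
To check that a local environment matches these pins, a quick runtime sketch (package names as imported in Python):

```python
# Print installed versions next to the ones reported on this card.
import datasets
import tokenizers
import torch
import transformers

pins = {
    "transformers": (transformers.__version__, "4.45.1"),
    "torch": (torch.__version__, "1.13.1+cu117"),
    "datasets": (datasets.__version__, "3.0.1"),
    "tokenizers": (tokenizers.__version__, "0.20.0"),
}
for name, (installed, pinned) in pins.items():
    status = "OK" if installed == pinned else f"card pins {pinned}"
    print(f"{name}: {installed} ({status})")
```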
