
./7326

This model is a fine-tuned version of openai/whisper-large-v3 on the 7326 FULL-2024-10-24 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3911
  • WER Ortho: 22.6474
  • WER: 15.5576
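
For a quick check of the checkpoint, it can be loaded through the transformers pipeline API. A minimal sketch, assuming a single CUDA device and half-precision inference; the model id is the one this card is published under, and sample.wav is a placeholder file:

```python
import torch
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub.
# device and torch_dtype are assumptions; adjust for your hardware.
asr = pipeline(
    "automatic-speech-recognition",
    model="Makkoen/whisper-large-v3-cit-do01-wd0-lr3e-06-FULL5",
    torch_dtype=torch.float16,
    device="cuda:0",
)

# Whisper operates on 30-second windows; chunking handles longer audio.
result = asr("sample.wav", chunk_length_s=30)
print(result["text"])
```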

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-06
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 1600
  • mixed_precision_training: Native AMP
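
These settings map directly onto Seq2SeqTrainingArguments from transformers. A sketch of the corresponding configuration; output_dir matches this card's title, the Adam betas and epsilon are the library defaults, and any argument not listed above is assumed to stay at its default:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./7326",               # matches the card title; an assumption
    learning_rate=3e-06,
    per_device_train_batch_size=4,     # 4 x 4 accumulation steps = 16 total
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,
    lr_scheduler_type="linear",
    warmup_steps=300,
    max_steps=1600,
    fp16=True,                         # "Native AMP" mixed precision
)
```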

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER Ortho | WER     |
|:-------------:|:------:|:----:|:---------------:|:---------:|:-------:|
| 0.686         | 0.4851 | 200  | 0.4602          | 26.0150   | 18.7885 |
| 0.5255        | 0.9703 | 400  | 0.4216          | 24.3312   | 17.1358 |
| 0.4328        | 1.4554 | 600  | 0.4028          | 23.2291   | 15.9895 |
| 0.4064        | 1.9406 | 800  | 0.3945          | 23.2291   | 16.1897 |
| 0.3579        | 2.4257 | 1000 | 0.3945          | 22.8195   | 15.7618 |
| 0.3409        | 2.9109 | 1200 | 0.3894          | 22.6884   | 15.5812 |
| 0.3131        | 3.3960 | 1400 | 0.3909          | 22.6556   | 15.6008 |
| 0.3021        | 3.8811 | 1600 | 0.3911          | 22.6474   | 15.5576 |
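
The WER columns appear to be percentages. A hedged sketch of how such scores are typically computed with the evaluate library; the reference and prediction strings are illustrative, and whether this card normalizes text before the non-orthographic WER is an assumption about the training script:

```python
import evaluate

wer_metric = evaluate.load("wer")

references = ["the quick brown fox jumps over the lazy dog"]    # illustrative ground truth
predictions = ["the quick brown fox jumped over the lazy dog"]  # illustrative model output

# compute() returns a fraction; the table above reports it scaled to percent.
wer = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.4f}")
```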

Framework versions

  • Transformers 4.45.1
  • Pytorch 1.13.1+cu117
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Model size

  • 1.61B parameters, stored as FP16 safetensors

Model tree

  • Makkoen/whisper-large-v3-cit-do01-wd0-lr3e-06-FULL5 (this model), fine-tuned from openai/whisper-large-v3