Whisper Medium GA-EN Speech Translation, 13k steps (~1.42 epochs)

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. It achieves the following results on the evaluation set:

  • Loss: 1.3521
  • Bleu: 34.31
  • Chrf: 52.5
  • Wer: 59.7028
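WER is word-level edit distance divided by reference length, which is why values above 100 appear early in the training log below (hypotheses much longer than the references). A minimal sketch of the metric, not necessarily the exact scorer used for this card:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length, as a percentage."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, sub)
    return 100 * d[-1][-1] / len(r)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # ≈ 16.67 (1 error / 6 reference words)
print(wer("a", "x y z"))  # 300.0 — WER can exceed 100 when the hypothesis is much longer
```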

Model description

This model translates Irish (GA) speech into English (EN) text. It is a fine-tuned version of openai/whisper-medium (764M parameters).

Intended uses & limitations

The model is intended for Irish-to-English speech translation. WER on the evaluation set remains relatively high (59.70), so outputs may need review before downstream use.
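As an illustrative sketch (not an official snippet from the authors), inference could use the Transformers ASR pipeline; the model id is taken from this card, while the audio filename and generation settings are assumptions:

```python
# Hypothetical usage sketch; assumes transformers is installed and a 16 kHz audio file exists.
MODEL_ID = "ymoslem/whisper-medium-ga2en-v5.2.2-r"

def build_translator():
    # Imported lazily so the module can be inspected without downloading weights.
    from transformers import pipeline
    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        generate_kwargs={"task": "translate"},  # GA speech -> EN text
    )

if __name__ == "__main__":
    translator = build_translator()
    result = translator("sample_irish_audio.wav")  # hypothetical file
    print(result["text"])
```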

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.02
  • training_steps: 13000
  • mixed_precision_training: Native AMP
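With lr_scheduler_warmup_ratio 0.02 over 13,000 steps, the linear schedule warms up for 260 steps to the 1e-4 peak, then decays linearly to zero. A small sketch of that schedule (an illustration of the stated settings, not the trainer's internals):

```python
PEAK_LR = 1e-4
TOTAL_STEPS = 13_000
WARMUP_STEPS = int(0.02 * TOTAL_STEPS)  # 260

def linear_lr(step: int) -> float:
    """Linear warmup then linear decay, per the hyperparameters above."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(WARMUP_STEPS)  # 260
print(linear_lr(TOTAL_STEPS))  # 0.0 at the final step
```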

Training results

Training Loss Epoch Step Bleu Chrf Validation Loss Wer
2.6291 0.0109 100 2.33 16.34 2.1971 175.5516
2.6591 0.0219 200 5.57 22.49 2.0357 122.2873
2.5637 0.0328 300 7.67 26.29 1.8690 133.0032
2.2954 0.0438 400 11.2 30.03 1.8062 114.2278
2.3292 0.0547 500 9.85 29.28 1.7421 117.2895
2.1223 0.0657 600 14.56 32.56 1.6739 84.2864
2.2398 0.0766 700 13.86 34.74 1.7187 98.9644
2.002 0.0876 800 15.53 36.64 1.6392 96.7582
1.8611 0.0985 900 15.8 36.32 1.6283 94.3719
1.8498 0.1095 1000 17.58 36.0 1.6102 85.5921
1.7585 0.1204 1100 15.91 36.61 1.6337 100.2251
1.6115 0.1314 1200 22.21 39.94 1.5381 76.8122
1.4415 0.1423 1300 20.36 37.87 1.5864 79.1986
1.5103 0.1533 1400 23.2 41.26 1.4925 75.2364
1.6576 0.1642 1500 18.12 40.49 1.4508 102.9266
1.3429 0.1752 1600 27.88 43.74 1.4399 69.7884
1.2522 0.1861 1700 23.04 43.31 1.4256 77.1724
1.2018 0.1970 1800 21.06 40.39 1.4072 78.6583
1.1945 0.2080 1900 23.0 42.71 1.4222 76.7222
1.1869 0.2189 2000 22.54 42.02 1.3992 75.8667
1.1752 0.2299 2100 20.81 41.07 1.3926 79.5137
1.0281 0.2408 2200 27.24 45.55 1.3633 69.6083
0.894 0.2518 2300 28.6 45.58 1.3287 65.8712
0.9788 0.2627 2400 27.75 46.21 1.3138 69.2931
0.8418 0.2737 2500 27.85 46.17 1.3064 68.3026
0.7559 0.2846 2600 28.44 48.52 1.2903 68.3476
0.8632 0.2956 2700 27.87 46.86 1.2834 68.3476
0.7501 0.3065 2800 28.63 49.25 1.2669 68.5277
0.6953 0.3175 2900 30.46 48.83 1.2615 64.4304
0.7195 0.3284 3000 27.49 47.94 1.2514 71.0941
0.6155 0.3394 3100 30.06 49.64 1.2428 66.5916
0.605 0.3503 3200 31.64 50.27 1.2040 63.8451
0.6349 0.3612 3300 28.96 49.35 1.2077 65.3760
0.4669 0.3722 3400 31.17 48.95 1.2219 64.2503
0.5196 0.3831 3500 30.97 50.13 1.2124 63.8001
0.5141 0.3941 3600 31.97 50.8 1.2026 63.0347
0.4221 0.4050 3700 31.76 51.35 1.1893 63.4399
0.2951 0.4160 3800 32.4 51.08 1.2049 63.1247
0.3898 0.4269 3900 32.15 51.09 1.1906 63.5299
0.4071 0.4379 4000 33.1 51.85 1.1873 62.4043
0.3975 0.4488 4100 29.58 49.33 1.2117 70.3287
0.4206 0.4598 4200 31.69 50.8 1.2150 65.0158
0.2935 0.4707 4300 32.9 50.01 1.2484 62.8546
0.3718 0.4817 4400 31.64 50.55 1.2055 63.8451
0.3722 0.4926 4500 28.16 49.28 1.2200 70.4638
0.2986 0.5036 4600 28.76 49.9 1.2240 68.7528
0.3327 0.5145 4700 29.34 49.67 1.2052 67.5822
0.2489 0.5255 4800 32.52 51.77 1.2083 62.4493
0.3653 0.5364 4900 31.48 51.16 1.2166 63.8451
0.3326 0.5473 5000 33.04 51.71 1.2169 62.4493
0.3045 0.5583 5100 27.45 48.22 1.2460 68.9779
0.3444 0.5692 5200 33.14 50.76 1.2829 62.2692
0.3236 0.5802 5300 28.89 49.37 1.2499 70.3737
0.3004 0.5911 5400 29.89 49.29 1.3165 68.7078
0.3019 0.6021 5500 32.8 49.78 1.2782 62.8095
0.2923 0.6130 5600 31.75 50.26 1.2468 63.3498
0.3237 0.6240 5700 34.4 52.59 1.2511 61.0986
0.2226 0.6349 5800 30.51 50.38 1.2479 63.3498
0.2207 0.6459 5900 32.68 51.97 1.2641 62.1342
0.2017 0.6568 6000 32.47 51.36 1.2640 62.6745
0.201 0.6678 6100 33.6 52.29 1.2774 61.4588
0.203 0.6787 6200 30.27 50.84 1.2670 65.6461
0.1456 0.6897 6300 31.2 51.05 1.2656 63.3048
0.1607 0.7006 6400 30.39 51.04 1.2611 65.8262
0.1933 0.7115 6500 31.78 50.92 1.2545 63.0797
0.1537 0.7225 6600 30.18 50.18 1.2500 64.7006
0.1279 0.7334 6700 33.23 51.0 1.2548 59.8379
0.1189 0.7444 6800 33.51 50.67 1.2594 61.1887
0.1056 0.7553 6900 32.97 51.02 1.2578 61.9991
0.1105 0.7663 7000 32.74 50.83 1.2569 62.0441
0.1183 0.7772 7100 34.07 52.2 1.2590 60.4232
0.1373 0.7882 7200 33.55 50.6 1.2430 61.2787
0.1325 0.7991 7300 32.36 50.39 1.2548 62.3143
0.0907 0.8101 7400 32.28 50.99 1.2578 61.2787
0.0919 0.8210 7500 33.01 51.81 1.2791 60.4683
0.0852 0.8320 7600 32.97 51.56 1.2782 61.5489
0.1223 0.8429 7700 33.57 52.33 1.2638 59.9280
0.0826 0.8539 7800 33.83 52.7 1.2634 60.1531
0.0783 0.8648 7900 33.79 52.31 1.2595 60.1081
0.0986 0.8758 8000 34.33 52.54 1.2608 59.4327
0.1148 0.8867 8100 34.03 52.52 1.2736 59.8829
0.1134 0.8976 8200 34.14 51.64 1.3073 61.5038
0.1166 0.9086 8300 30.51 49.26 1.3385 65.5561
0.0871 0.9195 8400 32.31 51.06 1.3313 62.5394
0.0927 0.9305 8500 28.64 48.43 1.3898 69.3832
0.1012 0.9414 8600 33.12 52.02 1.3144 61.4138
0.0742 0.9524 8700 33.68 51.38 1.3284 61.7740
0.0802 0.9633 8800 34.33 51.38 1.3300 61.4138
0.0799 0.9743 8900 33.72 50.77 1.3328 60.1981
0.0936 0.9852 9000 34.76 51.4 1.3181 60.0630
0.1091 0.9962 9100 35.13 52.6 1.3096 59.9730
0.0427 1.0071 9200 35.49 53.12 1.2905 59.8379
0.0338 1.0181 9300 35.33 52.62 1.3097 60.5133
0.0363 1.0290 9400 35.51 53.06 1.3172 59.6128
0.0319 1.0400 9500 36.82 53.6 1.3166 58.3971
0.0434 1.0509 9600 35.62 53.28 1.3050 59.6578
0.0218 1.0619 9700 35.57 53.28 1.3096 59.5227
0.0316 1.0728 9800 36.14 53.87 1.3162 58.3971
0.0315 1.0837 9900 36.26 54.16 1.3121 58.3521
0.0229 1.0947 10000 36.12 53.74 1.3134 58.3071
0.0561 1.1056 10100 34.27 53.3 1.3263 61.0086
0.0485 1.1166 10200 34.26 53.1 1.3319 60.6934
0.0582 1.1275 10300 30.37 51.24 1.3893 70.2837
0.0559 1.1385 10400 31.61 49.4 1.4005 66.0513
0.055 1.1494 10500 31.93 50.99 1.3793 65.0608
0.0612 1.1604 10600 33.31 51.91 1.3749 62.9896
0.0599 1.1713 10700 33.87 52.96 1.3679 61.7740
0.0536 1.1823 10800 32.54 51.57 1.3313 62.2692
0.0531 1.1932 10900 33.83 52.11 1.3883 61.9991
0.0582 1.2042 11000 33.18 51.63 1.3894 61.5038
0.0506 1.2151 11100 32.51 51.24 1.3338 63.5299
0.0489 1.2261 11200 32.95 51.53 1.3625 64.2053
0.0387 1.2370 11300 34.5 52.47 1.3496 60.4232
0.0512 1.2479 11400 34.5 52.72 1.3731 60.6934
0.0459 1.2589 11500 33.27 51.89 1.3655 62.8996
0.0457 1.2698 11600 30.26 49.96 1.3824 67.7623
0.0407 1.2808 11700 31.56 51.37 1.3775 62.9446
0.0396 1.2917 11800 34.06 51.91 1.3677 59.6128
0.0419 1.3027 11900 34.18 52.77 1.3648 60.1081
0.0291 1.3136 12000 33.9 51.61 1.3697 60.6934
0.0351 1.3246 12100 34.66 53.1 1.3565 60.5133
0.0329 1.3355 12200 33.59 53.0 1.3592 61.8190
0.0409 1.3465 12300 34.41 52.96 1.3690 59.6578
0.0386 1.3574 12400 34.68 53.26 1.3440 59.1175
0.0221 1.3684 12500 33.35 51.9 1.3450 60.3332
0.032 1.3793 12600 33.09 52.07 1.3514 62.3143
0.0364 1.3903 12700 34.08 52.49 1.3538 60.0630
0.024 1.4012 12800 34.75 53.14 1.3451 58.8474
0.0245 1.4122 12900 34.09 52.38 1.3544 59.7479
0.0271 1.4231 13000 34.31 52.5 1.3521 59.7028
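The final checkpoint is not the strongest row above: BLEU peaks at step 9,500 (36.82) and WER bottoms out at step 10,000 (58.31), both ahead of the final step-13,000 scores. A small sketch comparing a few rows copied from the table:

```python
# (step, BLEU, chrF, validation loss, WER) rows copied from the results table above.
rows = [
    (9500, 36.82, 53.6, 1.3166, 58.3971),
    (10000, 36.12, 53.74, 1.3134, 58.3071),
    (13000, 34.31, 52.5, 1.3521, 59.7028),
]

best_bleu = max(rows, key=lambda r: r[1])  # highest BLEU
best_wer = min(rows, key=lambda r: r[4])   # lowest WER
print(best_bleu[0], best_wer[0])  # 9500 10000
```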

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model tree

This model (ymoslem/whisper-medium-ga2en-v5.2.2-r, 764M parameters, F32 safetensors) is fine-tuned from openai/whisper-medium.

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
    self-reported
    34.310
  • Wer on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
    self-reported
    59.703