wav2vec2-large-xls-r-300m-Arabic-phoneme-based

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m. The training dataset is not specified in this card. It achieves the following results on the evaluation set (values from the final epoch):

  • Loss: 0.7493
  • Per: 0.1979
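
The card does not define "Per". Given the model name, it is most likely the phoneme error rate: the edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The sketch below illustrates that definition under this assumption; the function names and the phoneme symbols in the example are hypothetical and are not taken from the model's vocabulary or from the author's evaluation script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Total edits over total reference length (assumed PER definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_ref = sum(len(r) for r in references)
    return total_edits / total_ref


# Hypothetical phoneme sequences, for illustration only.
refs = [["k", "i", "t", "aa", "b"], ["b", "aa", "b"]]
hyps = [["k", "i", "t", "a", "b"], ["b", "aa"]]
per = phoneme_error_rate(refs, hyps)
print(f"PER: {per:.4f}")
```

Under this definition, a value of 0.1979 would mean roughly one edit for every five reference phonemes.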

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 2
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 250
  • num_epochs: 30.0
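
As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and evaluates a linear schedule with warmup at a few steps. It assumes a single training device and takes the total step count (65610) from the final row of the training results; the function names are hypothetical, and the schedule formula is a plain-Python restatement of the usual linear warmup and decay rule, not code from the training run.

```python
BASE_LR = 0.0005          # learning_rate
WARMUP_STEPS = 250        # lr_scheduler_warmup_steps
TOTAL_STEPS = 65610       # last optimizer step reported in the training results
PER_DEVICE_BATCH = 2      # train_batch_size
GRAD_ACCUM = 4            # gradient_accumulation_steps
NUM_DEVICES = 1           # assumption: the card does not state the device count


def effective_batch_size(per_device=PER_DEVICE_BATCH,
                         grad_accum=GRAD_ACCUM,
                         devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * devices


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


print("effective batch size:", effective_batch_size())
for step in (0, 125, 250, 32805, 65610):
    print(step, f"{linear_lr(step):.6f}")
```

With 2 examples per step and 4 accumulation steps on one device, the product matches the reported total_train_batch_size of 8.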

Training results

Training Loss   Epoch   Step    Validation Loss   Per
1.9601          1.0     2187    1.7221            0.9190
1.307           2.0     4374    1.0964            0.4532
0.9363          3.0     6561    0.9163            0.3469
0.7942          4.0     8748    0.8432            0.3037
0.7             5.0     10935   0.7827            0.2881
0.6274          6.0     13122   0.7456            0.2713
0.5692          7.0     15309   0.6924            0.2572
0.5203          8.0     17496   0.6521            0.2491
0.4853          9.0     19683   0.6583            0.2420
0.4448          10.0    21870   0.6580            0.2312
0.4134          11.0    24057   0.6313            0.2380
0.389           12.0    26244   0.6099            0.2225
0.3644          13.0    28431   0.6238            0.2239
0.3432          14.0    30618   0.6369            0.2195
0.3191          15.0    32805   0.6391            0.2164
0.2992          16.0    34992   0.6314            0.2164
0.2827          17.0    37179   0.6385            0.2143
0.2666          18.0    39366   0.6330            0.2159
0.2479          19.0    41553   0.6653            0.2125
0.2341          20.0    43740   0.6692            0.2165
0.2209          21.0    45927   0.6656            0.2199
0.2075          22.0    48114   0.6669            0.2104
0.1955          23.0    50301   0.6830            0.2044
0.1825          24.0    52488   0.6973            0.2065
0.1758          25.0    54675   0.7265            0.2013
0.1644          26.0    56862   0.7416            0.2040
0.1571          27.0    59049   0.7202            0.2007
0.1489          28.0    61236   0.7224            0.2019
0.1432          29.0    63423   0.7357            0.1988
0.1373          30.0    65610   0.7493            0.1979
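
The validation loss and Per do not bottom out at the same epoch, which matters when choosing a checkpoint. The sketch below copies the epoch, validation loss, and Per columns from the results above and finds the best epoch under each criterion; the variable names are illustrative, and nothing here indicates which checkpoint the author actually published beyond the final-epoch figures reported at the top of the card.

```python
# (epoch, validation_loss, per) copied from the training results
results = [
    (1, 1.7221, 0.9190), (2, 1.0964, 0.4532), (3, 0.9163, 0.3469),
    (4, 0.8432, 0.3037), (5, 0.7827, 0.2881), (6, 0.7456, 0.2713),
    (7, 0.6924, 0.2572), (8, 0.6521, 0.2491), (9, 0.6583, 0.2420),
    (10, 0.6580, 0.2312), (11, 0.6313, 0.2380), (12, 0.6099, 0.2225),
    (13, 0.6238, 0.2239), (14, 0.6369, 0.2195), (15, 0.6391, 0.2164),
    (16, 0.6314, 0.2164), (17, 0.6385, 0.2143), (18, 0.6330, 0.2159),
    (19, 0.6653, 0.2125), (20, 0.6692, 0.2165), (21, 0.6656, 0.2199),
    (22, 0.6669, 0.2104), (23, 0.6830, 0.2044), (24, 0.6973, 0.2065),
    (25, 0.7265, 0.2013), (26, 0.7416, 0.2040), (27, 0.7202, 0.2007),
    (28, 0.7224, 0.2019), (29, 0.7357, 0.1988), (30, 0.7493, 0.1979),
]

# Best epoch by each criterion (lower is better for both).
best_by_loss = min(results, key=lambda row: row[1])
best_by_per = min(results, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("lowest Per:", best_by_per)
```

The lowest validation loss (0.6099) occurs at epoch 12, while the lowest Per (0.1979) occurs at epoch 30. The training loss keeps falling throughout, so the rising validation loss after epoch 12 is consistent with overfitting in the loss sense even though the decoded error rate still improves slightly.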

Framework versions

  • Transformers 4.31.0
  • Pytorch 2.0.1+cu118
  • Datasets 1.18.3
  • Tokenizers 0.13.3

Model repository: nrshoudi/wav2vec2-large-xls-r-300m-Arabic-phoneme-based