
wav2vec2-large-xls-r-300m-spanish-custom

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4426
  • WER (word error rate): 0.2117
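
For reference, below is a minimal transcription sketch using the transformers Wav2Vec2 API. The audio file name is a placeholder, and the 16 kHz sampling rate is an assumption based on the standard input rate of XLS-R checkpoints, not a detail stated in this card.

```python
import torch
import librosa
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "tomascufaro/wav2vec2-large-xls-r-300m-spanish-custom"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# "sample.wav" is a placeholder path; XLS-R checkpoints expect 16 kHz mono audio.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: take the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```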

Model description

This checkpoint is facebook/wav2vec2-xls-r-300m, a 300M-parameter XLS-R speech encoder, fine-tuned with a CTC head for Spanish automatic speech recognition on the common_voice dataset. No further description is provided.

Intended uses & limitations

The model is intended for transcribing Spanish speech sampled at 16 kHz. Its limitations are not documented; accuracy on audio that differs from Common Voice read speech (e.g. noisy or conversational recordings) has not been reported.

Training and evaluation data

The model was fine-tuned and evaluated on the common_voice dataset. The exact language configuration and splits are not recorded in this card; a hedged loading sketch follows.
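
The snippet below shows how the data could be loaded with the datasets library. The "es" config and the split names are assumptions inferred from the model name; the card itself only says "common_voice".

```python
from datasets import load_dataset

# Config "es" and these splits are assumptions, not documented in the card.
common_voice_train = load_dataset("common_voice", "es", split="train+validation")
common_voice_eval = load_dataset("common_voice", "es", split="test")
```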

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
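
As a sketch, these settings map directly onto transformers.TrainingArguments. The output_dir is an illustrative assumption; everything else mirrors the list above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-large-xls-r-300m-spanish-custom",  # assumed path
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 8 * 2 = 16
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=30,
    fp16=True,                       # mixed precision via native AMP
)
```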

Training results

Training Loss Epoch Step Validation Loss WER
4.2307 0.4 400 1.4431 0.9299
0.7066 0.79 800 0.5928 0.4836
0.4397 1.19 1200 0.4341 0.3730
0.3889 1.58 1600 0.4063 0.3499
0.3607 1.98 2000 0.3834 0.3235
0.2866 2.37 2400 0.3885 0.3163
0.2833 2.77 2800 0.3765 0.3140
0.2692 3.17 3200 0.3849 0.3132
0.2435 3.56 3600 0.3779 0.2984
0.2404 3.96 4000 0.3756 0.2934
0.2153 4.35 4400 0.3770 0.3075
0.2087 4.75 4800 0.3819 0.3022
0.1999 5.14 5200 0.3756 0.2959
0.1838 5.54 5600 0.3827 0.2858
0.1892 5.93 6000 0.3714 0.2999
0.1655 6.33 6400 0.3814 0.2812
0.1649 6.73 6800 0.3685 0.2727
0.1668 7.12 7200 0.3832 0.2825
0.1487 7.52 7600 0.3848 0.2788
0.152 7.91 8000 0.3810 0.2787
0.143 8.31 8400 0.3885 0.2856
0.1353 8.7 8800 0.4103 0.2827
0.1386 9.1 9200 0.4142 0.2874
0.1222 9.5 9600 0.3983 0.2830
0.1288 9.89 10000 0.4179 0.2781
0.1199 10.29 10400 0.4035 0.2789
0.1196 10.68 10800 0.4043 0.2746
0.1169 11.08 11200 0.4105 0.2753
0.1076 11.47 11600 0.4298 0.2686
0.1124 11.87 12000 0.4025 0.2704
0.1043 12.26 12400 0.4209 0.2659
0.0976 12.66 12800 0.4070 0.2672
0.1012 13.06 13200 0.4161 0.2720
0.0872 13.45 13600 0.4245 0.2697
0.0933 13.85 14000 0.4295 0.2684
0.0881 14.24 14400 0.4011 0.2650
0.0848 14.64 14800 0.3991 0.2675
0.0852 15.03 15200 0.4166 0.2617
0.0825 15.43 15600 0.4188 0.2639
0.081 15.83 16000 0.4181 0.2547
0.0753 16.22 16400 0.4103 0.2560
0.0747 16.62 16800 0.4017 0.2498
0.0761 17.01 17200 0.4159 0.2563
0.0711 17.41 17600 0.4112 0.2603
0.0698 17.8 18000 0.4335 0.2529
0.073 18.2 18400 0.4120 0.2512
0.0665 18.6 18800 0.4335 0.2496
0.0657 18.99 19200 0.4143 0.2468
0.0617 19.39 19600 0.4339 0.2435
0.06 19.78 20000 0.4179 0.2438
0.0613 20.18 20400 0.4251 0.2393
0.0583 20.57 20800 0.4347 0.2422
0.0562 20.97 21200 0.4246 0.2377
0.053 21.36 21600 0.4198 0.2338
0.0525 21.76 22000 0.4511 0.2427
0.0499 22.16 22400 0.4482 0.2353
0.0475 22.55 22800 0.4449 0.2329
0.0465 22.95 23200 0.4364 0.2320
0.0443 23.34 23600 0.4481 0.2304
0.0458 23.74 24000 0.4442 0.2267
0.0453 24.13 24400 0.4402 0.2261
0.0426 24.53 24800 0.4262 0.2232
0.0431 24.93 25200 0.4251 0.2210
0.0389 25.32 25600 0.4455 0.2232
0.039 25.72 26000 0.4372 0.2236
0.0378 26.11 26400 0.4236 0.2212
0.0348 26.51 26800 0.4359 0.2204
0.0361 26.9 27200 0.4248 0.2192
0.0356 27.3 27600 0.4397 0.2184
0.0325 27.7 28000 0.4367 0.2181
0.0313 28.09 28400 0.4477 0.2136
0.0306 28.49 28800 0.4533 0.2135
0.0314 28.88 29200 0.4410 0.2136
0.0307 29.28 29600 0.4457 0.2113
0.0309 29.67 30000 0.4426 0.2117
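
The WER column above is the word error rate on the validation set. As a small illustration of the metric, here is one common implementation, the jiwer package; the card does not say which implementation was used during evaluation.

```python
from jiwer import wer

reference = "hola como estas"
hypothesis = "hola como esta"
# WER = (substitutions + deletions + insertions) / words in the reference
print(wer(reference, hypothesis))  # 1 substitution over 3 words -> ~0.333
```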

Framework versions

  • Transformers 4.16.0.dev0
  • Pytorch 1.10.1+cu102
  • Datasets 1.17.1.dev0
  • Tokenizers 0.11.0