Automatic Speech Recognition
TensorBoard
Safetensors
Welsh
wav2vec2
Generated from Trainer
---
license: apache-2.0
base_model: facebook/wav2vec2-large-xlsr-53
tags:
  - automatic-speech-recognition
  - ./data-configs/btb-cv.json
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: wav2vec2-xlsr-53-ft-btb-cv-cy-cand
    results: []
---

wav2vec2-xlsr-53-ft-btb-cv-cy-cand

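This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53. The training dataset is not named in this card; its tags reference the data configuration ./data-configs/btb-cv.json. At the final training step (8000), it achieves the following results on the evaluation set: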

  • Loss: inf
  • Wer: 0.3598
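For readers unfamiliar with the metric, word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below is a minimal pure-Python illustration and not necessarily the implementation used to produce the figure above; the function name `word_error_rate` and the example sentence are made up for illustration, and the card does not say what text normalization was applied before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of five reference words is missing.
print(word_error_rate("mae hi yn braf heddiw", "mae hi braf heddiw"))
```

Read this way, a WER of 0.3598 corresponds to roughly 36 word-level errors per 100 reference words.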

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 800
  • training_steps: 8000
  • mixed_precision_training: Native AMP
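The learning-rate settings above can be read as a schedule over the 8000 training steps. The sketch below assumes the usual shape of a linear scheduler with warmup: a linear ramp from 0 to the peak rate over the first 800 steps, then a linear decay to 0 at step 8000. The function name `linear_schedule_lr` is illustrative and not part of any library.

```python
def linear_schedule_lr(step: int,
                       base_lr: float = 3e-4,
                       warmup_steps: int = 800,
                       total_steps: int = 8000) -> float:
    """Learning rate at a given step, assuming linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 400, 800, 4400, 8000):
    print(step, linear_schedule_lr(step))
```

Under this assumption the peak rate of 0.0003 is reached at step 800 and has halved again by step 4400, midway through the decay phase.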

Training results

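| Training Loss | Epoch  | Step | Validation Loss | Wer    |
|---------------|--------|------|-----------------|--------|
| 4.6973        | 0.0714 | 500  | inf             | 1.0    |
| 1.448         | 0.1428 | 1000 | inf             | 0.7574 |
| 1.053         | 0.2142 | 1500 | inf             | 0.6584 |
| 0.9304        | 0.2856 | 2000 | inf             | 0.5963 |
| 0.8755        | 0.3569 | 2500 | inf             | 0.5946 |
| 0.8238        | 0.4283 | 3000 | inf             | 0.5392 |
| 0.7819        | 0.4997 | 3500 | inf             | 0.4967 |
| 0.729         | 0.5711 | 4000 | inf             | 0.4834 |
| 0.6923        | 0.6425 | 4500 | inf             | 0.4564 |
| 0.7052        | 0.7139 | 5000 | inf             | 0.4346 |
| 0.6675        | 0.7853 | 5500 | inf             | 0.4163 |
| 0.6217        | 0.8567 | 6000 | inf             | 0.3962 |
| 0.5954        | 0.9280 | 6500 | inf             | 0.3883 |
| 0.5687        | 0.9994 | 7000 | inf             | 0.3746 |
| 0.477         | 1.0708 | 7500 | inf             | 0.3647 |
| 0.4804        | 1.1422 | 8000 | inf             | 0.3598 |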
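The Epoch and Step columns together give a rough sense of the training set size, which the card does not state. The sketch below divides step by epoch for a few logged rows to estimate steps per epoch, then multiplies by the train batch size of 8. That last step assumes a single device and no gradient accumulation, neither of which the card confirms, so the example count is an estimate only.

```python
# (step, epoch) pairs copied from the training results.
logged = [(500, 0.0714), (7000, 0.9994), (8000, 1.1422)]

# From the hyperparameter list; assumed here to be the effective batch
# size per optimizer step (single device, no gradient accumulation).
train_batch_size = 8

# Each row gives an independent estimate of optimizer steps per epoch.
estimates = [step / epoch for step, epoch in logged]
steps_per_epoch = sum(estimates) / len(estimates)

# Approximate number of training examples seen in one epoch.
approx_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_examples:.0f}")
```

On these numbers one epoch is about 7,000 steps, so the 8000-step run covers slightly more than one pass over the data, consistent with the final epoch value of 1.1422.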

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1