metadata
license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
  - generated_from_trainer
  - code
metrics:
  - wer
model-index:
  - name: wav2vec2-large-xlsr-300m-hi-kagglex
    results: []
datasets:
  - mozilla-foundation/common_voice_15_0
  - mozilla-foundation/common_voice_13_0
language:
  - hi
library_name: transformers
pipeline_tag: automatic-speech-recognition

datasets:
  - mozilla-foundation/common_voice_15_0
  - mozilla-foundation/common_voice_13_0
language:
  - hi
metrics:
  - cer
  - wer
library_name: transformers
pipeline_tag: automatic-speech-recognition
model-index:
  - name: whisper-small-hi-cv
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 15
          type: mozilla-foundation/common_voice_15_0
          args: hi
        metrics:
          - name: Test WER
            type: wer
            value: 13.9913
          - name: Test CER
            type: cer
            value: 5.8844
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 13
          type: mozilla-foundation/common_voice_13_0
          args: hi
        metrics:
          - name: Test WER
            type: wer
            value: 23.3824
          - name: Test CER
            type: cer
            value: 10.5288

Model Details

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on this dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3691
  • Wer: 0.3285
  • Cer: 0.0875
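For readers unfamiliar with the two metrics, here is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length. This is an illustration only, not the evaluation code used for this model; the helper names and the example sentences are hypothetical, and CER is assumed here to count Unicode code points.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.

    Assumption: characters are Unicode code points, with no normalization.
    """
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong word out of four reference words.
print(wer("मेरा नाम राम है", "मेरा नाम श्याम है"))  # 0.25
```

Under this definition the values above are fractions, so a Wer of 0.3285 corresponds to 32.85% of reference words in error.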

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 100
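The arithmetic behind two of these settings can be sketched in plain Python: how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and what a linear schedule with 300 warmup steps does to the learning rate. The function names are hypothetical, a single device is assumed, and the total step count of 1575 is only an estimate inferred from the logged steps per epoch, not a value stated in this card.

```python
def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Samples contributing to each optimizer update (single device assumed)."""
    return per_device * accumulation_steps * num_devices


def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 1e-4       # learning_rate
WARMUP_STEPS = 300   # lr_scheduler_warmup_steps
TOTAL_STEPS = 1575   # assumption: estimated, not reported in the card

print(effective_batch_size(32, 4))  # 128, matching total_train_batch_size

for step in (0, 150, 300, 900, 1500):
    lr = linear_schedule_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

With these assumptions the learning rate peaks at step 300 and has decayed to a small fraction of its peak by step 1500.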

Training results

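| Training Loss | Epoch | Step | Validation Loss | Wer    | Cer    |
|---------------|-------|------|-----------------|--------|--------|
| 7.314         | 19.05 | 300  | 3.4661          | 1.0    | 1.0    |
| 2.5698        | 38.1  | 600  | 0.6577          | 0.5203 | 0.1466 |
| 0.6112        | 57.14 | 900  | 0.4048          | 0.3723 | 0.1005 |
| 0.3826        | 76.19 | 1200 | 0.3778          | 0.3386 | 0.0901 |
| 0.3168        | 95.24 | 1500 | 0.3691          | 0.3285 | 0.0875 |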

Framework versions

  • Transformers 4.33.0
  • PyTorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.13.3
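A minimal inference sketch with the Transformers `pipeline` API is given below. The repository id is an assumption pieced together from the model name in the metadata and may not match the actual Hub path; `sample.wav` is a hypothetical file, and decoding a file path requires ffmpeg to be installed.

```python
# Assumption: the Hub repository id below is a guess, not confirmed by this card.
MODEL_ID = "SakshiRathi77/wav2vec2-large-xlsr-300m-hi-kagglex"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the automatic-speech-recognition pipeline."""
    # Imported lazily so the module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For CTC models such as wav2vec2 the pipeline returns a dict with a "text" key.
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder; replace it with a Hindi speech recording.
    print(transcribe("sample.wav"))
```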

SPACE

Automatic Speech Recognition in Hindi