---
license: apache-2.0
tags:
- generated_from_trainer
base_model: facebook/wav2vec2-xls-r-300m
datasets:
- common_voice_15_0
metrics:
- wer
model-index:
- name: wav2vec2-xls-r-300m-br
results:
- task:
type: automatic-speech-recognition
name: Automatic Speech Recognition
dataset:
name: common_voice_15_0
type: common_voice_15_0
config: br
split: None
args: br
metrics:
- type: wer
value: 41
name: WER
- type: cer
value: 14.7
name: CER
language:
- br
pipeline_tag: automatic-speech-recognition
---
# wav2vec2-xls-r-300m-br
This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Mozilla Common Voice 15 Breton dataset and the [Roadennoù](https://github.com/gweltou/roadennou) dataset. It achieves the following results on the MCV15-br test set:
- WER: 41.0
- CER: 14.7
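As a rough illustration, WER and CER of this kind are typically computed with the 🤗 Evaluate library; the sketch below is an assumption about the metric calls, not the exact evaluation script behind the numbers above, and the predictions/references are placeholders.

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["demat dit"]      # model transcriptions (placeholder)
references = ["demat deoc'h"]    # ground-truth transcripts (placeholder)

# Both metrics return a ratio; multiply by 100 to match the percentages above.
print("WER:", 100 * wer_metric.compute(predictions=predictions, references=references))
print("CER:", 100 * cer_metric.compute(predictions=predictions, references=references))
```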
## Model description
This model was trained to assess the performance of wav2vec2-xls-r-300m when fine-tuned for Breton ASR.
## Intended uses & limitations
This model is a research model. Usage for production is not recommended.
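For experimentation, the checkpoint can be loaded with the standard `automatic-speech-recognition` pipeline. This is a minimal sketch: the model id and audio file path are placeholders, and the input is assumed to be 16 kHz mono audio.

```python
from transformers import pipeline

# Replace the model id with the actual Hub path of this repository.
asr = pipeline("automatic-speech-recognition", model="wav2vec2-xls-r-300m-br")

# Transcribe a local Breton audio file (placeholder path).
result = asr("example_br.wav")
print(result["text"])
```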
## Training and evaluation data
The training dataset consists of the MCV15-br train split and 90% of the Roadennoù dataset.
The validation dataset consists of the MCV15-br validation split and the remaining 10% of the Roadennoù dataset.
The final test dataset is the MCV15-br test split.
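A possible way to build these splits with 🤗 Datasets is sketched below. The dataset ids, the local `audiofolder` layout for Roadennoù, and the column alignment step are assumptions; the original preprocessing scripts are not part of this card.

```python
from datasets import load_dataset, concatenate_datasets, Audio

# MCV 15 Breton (requires accepting the dataset terms on the Hub).
mcv = load_dataset("mozilla-foundation/common_voice_15_0", "br")

# Hypothetical local copy of Roadennoù as an audiofolder (audio files + metadata.csv).
roadennou = load_dataset("audiofolder", data_dir="roadennou/")["train"]
road = roadennou.train_test_split(test_size=0.1, seed=42)

# Column names must be aligned between the two corpora (e.g. both reduced to
# "audio" / "sentence") before concatenation; that step is omitted here.
train_ds = concatenate_datasets([mcv["train"], road["train"]])
valid_ds = concatenate_datasets([mcv["validation"], road["test"]])
test_ds = mcv["test"]

# wav2vec2-xls-r-300m expects 16 kHz input.
train_ds = train_ds.cast_column("audio", Audio(sampling_rate=16_000))
valid_ds = valid_ds.cast_column("audio", Audio(sampling_rate=16_000))
```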
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 40
- mixed_precision_training: Native AMP
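The hyperparameters above map onto `transformers.TrainingArguments` roughly as in the sketch below. The output directory and the evaluation/save strategies are assumptions not stated in this card; the Adam betas and epsilon listed above are the `TrainingArguments` defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-xls-r-300m-br",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=40,
    fp16=True,                       # native AMP mixed precision
    evaluation_strategy="epoch",     # assumption; not stated in the card
    save_strategy="epoch",           # assumption; not stated in the card
)
```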
### Framework versions
- Transformers 4.39.1
- Pytorch 2.0.1+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2