---
license: apache-2.0
base_model: openai/whisper-large-v3
tags:
- generated_from_trainer
- whisper
datasets:
- techiaith/commonvoice_18_0_cy
metrics:
- wer
model-index:
- name: whisper-large-v3-ft-cv-cy-train-all-plus-other-with-excluded
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: DewiBrynJones/commonvoice_18_0_cy default
      type: DewiBrynJones/commonvoice_18_0_cy
      args: default
    metrics:
    - name: Wer
      type: wer
      value: 0.185
language:
- cy
pipeline_tag: automatic-speech-recognition
---
# whisper-large-v3-ft-cv-cy
This model is a version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) fine-tuned on the
`train_all` and `other_with_excluded` custom splits from [techiaith/commonvoice_18_0_cy](https://huggingface.co/datasets/techiaith/commonvoice_18_0_cy).
It achieves the following results on the standard test set of Common Voice for Welsh, release 18:
- WER: 18.50%
- CER: 5.32%
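
For readers unfamiliar with these metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between a reference transcript and the model output, divided by the reference length and expressed as a percentage. This is only an illustration. The evaluation script, text normalisation, and tooling used to produce the figures above are not described in this card, and the helper names `edit_distance`, `wer` and `cer`, as well as the misspelled hypothesis, are made up for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of the reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of the reference character count."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the hypothesis misspells "annwyl" as "anwyl".
reference = "Mae hen wlad fy nhadau yn annwyl i mi."
hypothesis = "Mae hen wlad fy nhadau yn anwyl i mi."
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

A single misspelled word counts as one full word error but only one character error, which is why CER is typically much lower than WER for the same output.
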
N.B. This model performs considerably worse on English-language speech, but better on Welsh, than a [bilingual model](https://huggingface.co/techiaith/whisper-large-v3-ft-cv-cy-en).
## Usage
```python
from transformers import pipeline

transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-cv-cy")
# Replace the placeholder with the path or URL of your own sound file
audio = "<path or url to soundfile>"
result = transcriber(audio)
print(result)
```
`{'text': 'Mae hen wlad fy nhadau yn annwyl i mi.'}`