Automatic Speech Recognition
Welsh
whispercpp

whisper-base-ft-btb-cv-cy-cpp

This model is a version of the openai/whisper-base model, fine-tuned on transcriptions of spontaneous Welsh speech from the Banc Trawsgrifiadau Bangor (btb) dataset, with read speech from Welsh Common Voice version 18 (cv) as additional training data, and then converted for use in whisper.cpp.

Whispercpp is a C/C++ port of Whisper that provides high-performance inference on hardware such as desktops, laptops and mobile devices, which makes offline transcription possible.

The model is smaller than the corresponding model intended for hosting on cloud GPU-based infrastructure, techiaith/whisper-large-v3-ft-btb-cv-cy, and is therefore not as accurate.

It achieves the following word error rate (WER) and character error rate (CER) results when transcribing spontaneous Welsh speech:

  • WER: 62.76
  • CER: 27.70
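
For readers unfamiliar with these metrics, the following is a minimal Python sketch of how WER and CER are commonly computed: the edit distance between a reference transcript and the model's output, divided by the reference length, over words and characters respectively. It assumes the scores above are percentages and applies no text normalisation; the evaluation behind the reported figures may differ. The function names and the example sentences are illustrative only and are not taken from the test data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of the reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of the reference length."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative sentences only (hypothetical model output).
reference = "mae hi'n braf heddiw"
hypothesis = "mae hin braf heddi"
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

In this example two of the four reference words are wrong, yet only two of the twenty reference characters differ, which illustrates why CER is typically much lower than WER for the same output.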

Usage

whispercpp makes it easy to use models on many platforms and in many applications. See the 'examples' folder in the whispercpp GitHub repo for more information and example code.

To get started quickly with whispercpp's basic usage, follow its 'Quick Start' instructions, but download this model with the following command:

$ wget https://huggingface.co/techiaith/whisper-base-ft-btb-cv-cy-cpp/resolve/main/ggml-model.bin
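
Once the model file is downloaded, a transcription run might be scripted along the following lines. This is a sketch under assumptions rather than the project's documented workflow: the binary path `./build/bin/whisper-cli` depends on how and when whisper.cpp was built (older builds produce `./main`), `sample.wav` is a hypothetical audio file, and `build_whisper_command` is a helper invented for this example. The `-m`, `-f` and `-l` options select the model, the input file and the spoken language (`cy` for Welsh).

```python
import shlex


def build_whisper_command(binary, model_path, audio_path, language="cy"):
    """Return the argument list for a whisper.cpp transcription run.

    The binary path is an assumption; adjust it to match your build.
    """
    return [binary, "-m", model_path, "-f", audio_path, "-l", language]


# Hypothetical paths: a local whisper.cpp build and a sample recording.
cmd = build_whisper_command("./build/bin/whisper-cli", "ggml-model.bin", "sample.wav")
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute the command, provided the binary and both files exist at those paths.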
