---
language:
- km
license: apache-2.0
tags:
- hf-asr-leaderboard
- generated_from_trainer
datasets:
- openslr
- google/fleurs
- seanghay/km-speech-corpus
metrics:
- wer
model-index:
- name: Whisper Small Khmer Spaced - Seanghay Yath
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Google FLEURS
      type: google/fleurs
      config: km_kh
      split: test
    metrics:
    - name: Wer
      type: wer
      value: 0.6165
---

# whisper-small-khmer-v2

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for Khmer speech recognition. It achieves the following results on the evaluation set (the metadata above lists the Google FLEURS `km_kh` test split):

- Loss: 0.26
- Wer: 0.6165

## Model description

This model was fine-tuned on the Google FLEURS and OpenSLR (SLR42) datasets.

The repository also provides the weights in other formats:

- [ggml-model.bin](https://huggingface.co/seanghay/whisper-small-khmer/blob/main/ggml-model.bin)
- [model.onnx](https://huggingface.co/seanghay/whisper-small-khmer/blob/main/model.onnx)

Usage with the `transformers` pipeline:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline.
pipe = pipeline(
    task="automatic-speech-recognition",
    model="seanghay/whisper-small-khmer",
)

# Force Khmer transcription; generate_kwargs is forwarded to generation.
result = pipe(
    "audio.wav",
    generate_kwargs={"language": "<|km|>", "task": "transcribe"},
    batch_size=16,
)

print(result["text"])
```

## whisper.cpp

### 1. Transcode the input audio to 16 kHz PCM

```shell
ffmpeg -i audio.ogg -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```

### 2. Transcribe with whisper.cpp

```shell
./main -m ggml-model.bin -f output.wav --print-colors --language km
```
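
The two whisper.cpp steps above can also be scripted. The following is a minimal sketch, not part of the original workflow: it assumes `ffmpeg` is on the `PATH`, that the whisper.cpp binary is at `./main` with `ggml-model.bin` beside it, and the helper names (`ffmpeg_command`, `whisper_command`, `transcribe`) are hypothetical. It adds ffmpeg's `-y` flag so an existing output file is overwritten without prompting, and omits `--print-colors` because the output is captured rather than shown in a terminal.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build the step 1 command: 16 kHz, mono, 16-bit PCM WAV."""
    # -y (overwrite without asking) is an addition to the original command.
    return ["ffmpeg", "-y", "-i", src,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def whisper_command(wav: str,
                    model: str = "ggml-model.bin",
                    binary: str = "./main") -> list[str]:
    """Build the step 2 command; binary and model paths are assumptions."""
    return [binary, "-m", model, "-f", wav, "--language", "km"]


def transcribe(src: str) -> str:
    """Transcode src, run whisper.cpp on the result, and return its stdout."""
    source = Path(src)
    wav = str(source.with_name(source.stem + "_16k.wav"))
    # check=True raises CalledProcessError if either tool exits non-zero.
    subprocess.run(ffmpeg_command(src, wav), check=True)
    done = subprocess.run(whisper_command(wav), check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()
```

Calling `transcribe("audio.ogg")` would write `audio_16k.wav` next to the input and return whatever whisper.cpp prints to standard output, which typically includes segment timestamps alongside the text.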