metadata
language:
  - km
license: apache-2.0
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
datasets:
  - openslr
  - google/fleurs
  - seanghay/km-speech-corpus
metrics:
  - wer
model-index:
  - name: Whisper Small Khmer Spaced - Seanghay Yath
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Google FLEURS
          type: google/fleurs
          config: km_kh
          split: test
        metrics:
          - name: Wer
            type: wer
            value: 0.6165

whisper-small-khmer-v2

This model is a fine-tuned version of openai/whisper-small for Khmer automatic speech recognition. It achieves the following results on the evaluation set (the metadata lists the Google FLEURS km_kh test split for the WER figure):

  • Loss: 0.26
  • WER: 0.6165
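For readers unfamiliar with the metric, word error rate is the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The sketch below is a minimal illustration of that definition, not the evaluation script used for this model; the function name `word_error_rate` and the English sample sentences are assumptions made for the example. It splits on whitespace and returns a fraction. How the reference text was segmented and how the reported value was scaled are not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative English sentences: one substitution and one deletion over 6 words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(score, 4))
```

Because the result depends on where word boundaries fall, scores for Khmer text are only comparable when the reference and the hypothesis are segmented the same way.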

Model description

This model was fine-tuned on the Google FLEURS and OpenSLR (SLR42) datasets. Example usage with the transformers pipeline:

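from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline(
    task="automatic-speech-recognition",
    model="seanghay/whisper-small-khmer",
)

# Force Khmer transcription instead of relying on language detection.
result = pipe(
    "audio.wav",
    generate_kwargs={"language": "<|km|>", "task": "transcribe"},
    batch_size=16,
)

print(result["text"])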

whisper.cpp

1. Transcode the input audio to 16 kHz mono 16-bit PCM WAV

ffmpeg -i audio.ogg -ar 16000 -ac 1 -c:a pcm_s16le output.wav

2. Transcribe with whisper.cpp

./main -m ggml-model.bin -f output.wav --print-colors --language km
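The two steps above can also be chained from a script. The following is a rough sketch under stated assumptions, not part of whisper.cpp: the helper names (`ffmpeg_command`, `whisper_command`, `is_16khz_mono_pcm16`, `transcribe`) are hypothetical, `ffmpeg` is assumed to be on the PATH, and `./main` and `ggml-model.bin` are assumed to sit in the working directory as in the commands above. It adds ffmpeg's `-y` flag so an existing `output.wav` is overwritten without prompting, and it drops `--print-colors`, which only affects terminal display.

```python
import subprocess
import wave


def ffmpeg_command(src: str, dst: str) -> list[str]:
    # Same conversion as step 1: 16 kHz, mono, signed 16-bit little-endian PCM.
    # "-y" is an addition here so ffmpeg does not stop to ask before overwriting dst.
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1",
            "-c:a", "pcm_s16le", dst]


def whisper_command(wav: str, model: str = "ggml-model.bin",
                    binary: str = "./main") -> list[str]:
    # Same invocation as step 2, without --print-colors.
    return [binary, "-m", model, "-f", wav, "--language", "km"]


def is_16khz_mono_pcm16(path: str) -> bool:
    # Check the WAV header before handing the file to whisper.cpp.
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == 16000
                and wav.getnchannels() == 1
                and wav.getsampwidth() == 2)


def transcribe(src: str, wav: str = "output.wav") -> str:
    # Step 1: transcode; check=True raises if ffmpeg exits with an error.
    subprocess.run(ffmpeg_command(src, wav), check=True)
    if not is_16khz_mono_pcm16(wav):
        raise ValueError(f"{wav} is not 16 kHz mono 16-bit PCM")
    # Step 2: run whisper.cpp and return whatever it writes to stdout.
    done = subprocess.run(whisper_command(wav), check=True,
                          capture_output=True, text=True)
    return done.stdout
```

Calling `transcribe("audio.ogg")` would then run both commands in order and return the text that whisper.cpp prints to standard output.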