---
language:
- km
license: apache-2.0
tags:
- hf-asr-leaderboard
- generated_from_trainer
datasets:
- openslr
- google/fleurs
- seanghay/km-speech-corpus
metrics:
- wer
model-index:
- name: Whisper Small Khmer Spaced - Seanghay Yath
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Google FLEURS
      type: google/fleurs
      config: km_kh
      split: test
    metrics:
    - name: Wer
      type: wer
      value: 0.6165
---
# whisper-small-khmer-v2

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Google FLEURS and OpenSLR (SLR42) datasets.
It achieves the following results on the evaluation set:
- Loss: 0.26
- Wer: 0.6165

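The WER above is word error rate: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. A minimal pure-Python sketch of the metric, for illustration only; the 0.6165 figure was produced by the training pipeline's own scorer, not by this function:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution across three reference words
print(wer("hello world example", "hello world sample"))
```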
## Model description

This model was fine-tuned on the Google FLEURS and OpenSLR (SLR42) datasets. Converted weights are also available:

- [ggml-model.bin](https://huggingface.co/seanghay/whisper-small-khmer/blob/main/ggml-model.bin)
- [model.onnx](https://huggingface.co/seanghay/whisper-small-khmer/blob/main/model.onnx)

```python
from transformers import pipeline

pipe = pipeline(
    task="automatic-speech-recognition",
    model="seanghay/whisper-small-khmer",
)

result = pipe(
    "audio.wav",
    generate_kwargs={
        "language": "<|km|>",
        "task": "transcribe",
    },
    batch_size=16,
)

print(result["text"])
```

## whisper.cpp

### 1. Transcode the input audio to 16kHz PCM

```shell
ffmpeg -i audio.ogg -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```

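whisper.cpp expects 16 kHz, mono, 16-bit PCM input, which is exactly what the ffmpeg flags above produce (`-ar 16000`, `-ac 1`, `-c:a pcm_s16le`). If transcription fails, a quick stdlib sketch to verify the transcoded file (the `output.wav` path matches step 1; this helper is illustrative, not part of whisper.cpp):

```python
import wave

def check_whisper_input(path: str) -> bool:
    """Return True if the WAV file matches whisper.cpp's expected format."""
    with wave.open(path, "rb") as w:
        return (
            w.getframerate() == 16000  # 16 kHz sample rate
            and w.getnchannels() == 1  # mono
            and w.getsampwidth() == 2  # 16-bit PCM (2 bytes per sample)
        )
```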
### 2. Transcribe with whisper.cpp

```shell
./main -m ggml-model.bin -f output.wav --print-colors --language km
```