---
language:
- km
license: apache-2.0
tags:
- hf-asr-leaderboard
- generated_from_trainer
datasets:
- openslr
- google/fleurs
- seanghay/km-speech-corpus
metrics:
- wer
model-index:
- name: Whisper Small Khmer Spaced - Seanghay Yath
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Google FLEURS
      type: google/fleurs
      config: km_kh
      split: test
    metrics:
    - name: Wer
      type: wer
      value: 0.6165
---
# whisper-small-khmer-v2

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Google FLEURS and OpenSLR (SLR42) datasets.
It achieves the following results on the evaluation set:
- Loss: 0.26
- Wer: 0.6165

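The WER above is word error rate: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. A minimal pure-Python sketch of the metric, for illustration only; the 0.6165 figure was produced by the training pipeline's own scorer, not by this function:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution across three reference words
print(wer("hello world example", "hello world sample"))
```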
## Model description

This model was fine-tuned on the Google FLEURS and OpenSLR (SLR42) datasets. Converted weights are also available:

- [ggml-model.bin](https://huggingface.co/seanghay/whisper-small-khmer/blob/main/ggml-model.bin)
- [model.onnx](https://huggingface.co/seanghay/whisper-small-khmer/blob/main/model.onnx)

```python
from transformers import pipeline

pipe = pipeline(
    task="automatic-speech-recognition",
    model="seanghay/whisper-small-khmer",
)

result = pipe(
    "audio.wav",
    generate_kwargs={
        "language": "<|km|>",
        "task": "transcribe",
    },
    batch_size=16,
)

print(result["text"])
```

## whisper.cpp

### 1. Transcode the input audio to 16kHz PCM

```shell
ffmpeg -i audio.ogg -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```

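whisper.cpp expects 16 kHz, mono, 16-bit PCM input, which is exactly what the ffmpeg flags above produce (`-ar 16000`, `-ac 1`, `-c:a pcm_s16le`). If transcription fails, a quick stdlib sketch to verify the transcoded file (the `output.wav` path matches step 1; this helper is illustrative, not part of whisper.cpp):

```python
import wave

def check_whisper_input(path: str) -> bool:
    """Return True if the WAV file matches whisper.cpp's expected format."""
    with wave.open(path, "rb") as w:
        return (
            w.getframerate() == 16000  # 16 kHz sample rate
            and w.getnchannels() == 1  # mono
            and w.getsampwidth() == 2  # 16-bit PCM (2 bytes per sample)
        )
```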
### 2. Transcribe with whisper.cpp

```shell
./main -m ggml-model.bin -f output.wav --print-colors --language km
```