update model

Files changed (4)

README.md CHANGED

@@ -5,7 +5,7 @@ license: mit
 ## Distilled Medium Whisper ASR Model for Thai
 ### Model Description
-This is a distilled Automatic Speech Recognition (ASR) model, based on the Whisper architecture. It has been specifically tailored for Thai language speech recognition. The model features 2 decoder layers (vs 24 in teacher model) and has been distilled from a larger teacher model, focusing on enhancing performance and efficiency.
 #### Distillation Details
 - **Teacher Model**: Medium Whisper ASR model
@@ -18,10 +18,10 @@ This is a distilled Automatic Speech Recognition (ASR) model, based on the Whisp
 ### Model Performance
 - **DeepCut Tokenized WER on Common Voice 13 Test Set**:
-  - Distilled Model: **17.2%**
   - Teacher Model: **8.92%**
-Reducing the decoder layers to just 2 layers hurts WER significantly for Thai speech. Additional datasets for distillation or more decoder layers might improve the WER. More to come soon!
 ### Intended Use
 This model is intended for use in applications requiring Thai language speech recognition.

 ## Distilled Medium Whisper ASR Model for Thai
 ### Model Description
+This is a distilled Automatic Speech Recognition (ASR) model, based on the Whisper architecture. It has been specifically tailored for Thai language speech recognition. The model features 4 decoder layers (vs 24 in teacher model) and has been distilled from a larger teacher model, focusing on enhancing performance and efficiency.
 #### Distillation Details
 - **Teacher Model**: Medium Whisper ASR model
 ### Model Performance
 - **DeepCut Tokenized WER on Common Voice 13 Test Set**:
+  - Distilled Model: **10.49%**
   - Teacher Model: **8.92%**
+Additional datasets for distillation or more decoder layers might improve the WER. More to come soon!
 ### Intended Use
 This model is intended for use in applications requiring Thai language speech recognition.
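
Reviewer note: the WER figures in this hunk are easier to compare as a gap relative to the teacher. The snippet below is a minimal sketch of that arithmetic using only the three percentages quoted in the README diff. The helper name `relative_wer_gap` is hypothetical and not part of the repository.

```python
def relative_wer_gap(student_wer: float, teacher_wer: float) -> float:
    """Return the student's WER increase as a fraction of the teacher's WER."""
    return (student_wer - teacher_wer) / teacher_wer


# WER values (in percent) copied from the README diff above.
TEACHER_WER = 8.92
STUDENT_WER_2_LAYERS = 17.2   # previous revision, 2 decoder layers
STUDENT_WER_4_LAYERS = 10.49  # this revision, 4 decoder layers

gap_2_layers = relative_wer_gap(STUDENT_WER_2_LAYERS, TEACHER_WER)
gap_4_layers = relative_wer_gap(STUDENT_WER_4_LAYERS, TEACHER_WER)

print(f"2 decoder layers: {gap_2_layers:.1%} above teacher WER")
print(f"4 decoder layers: {gap_4_layers:.1%} above teacher WER")
```

Under these numbers the 2-layer student sits roughly 93% above the teacher's WER and the 4-layer student roughly 18% above it, which is consistent with the README dropping the sentence about the 2-layer decoder hurting WER significantly.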

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "../../distil-whisper-medium-aug-th-init",
   "activation_dropout": 0.0,
   "activation_function": "gelu",
   "apply_spec_augment": true,
@@ -17,7 +17,7 @@
   "decoder_attention_heads": 16,
   "decoder_ffn_dim": 4096,
   "decoder_layerdrop": 0.0,
-  "decoder_layers": 2,
   "decoder_start_token_id": 50258,
   "dropout": 0.0,
   "encoder_attention_heads": 16,

 {
+  "_name_or_path": "./distil-whisper-medium-init",
   "activation_dropout": 0.0,
   "activation_function": "gelu",
   "apply_spec_augment": true,
   "decoder_attention_heads": 16,
   "decoder_ffn_dim": 4096,
   "decoder_layerdrop": 0.0,
+  "decoder_layers": 4,
   "decoder_start_token_id": 50258,
   "dropout": 0.0,
   "encoder_attention_heads": 16,

distil-whisper/events.out.tfevents.1705422649.c175d7640af5.1725615.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:59683a04b97cfdaebbab875018dc06dd25b96e82190613b672dee6676d497fce
+size 6288

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fa97226df5323c516124e99226d2519e2d369c2b0dcabf78233e19e9e97d7409
-size 1577553712

 version https://git-lfs.github.com/spec/v1
+oid sha256:280d23deb47a68a4173948b4759cc2d02884df78ef198594560c95a7e48fac10
+size 1711916448
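
Reviewer note: the size change in the `model.safetensors` pointer can be sanity-checked against the `decoder_layers` change from 2 to 4 in `config.json`. The sketch below parses the two Git LFS pointer bodies shown in this diff and estimates the bytes added per decoder layer. The helper `parse_lfs_pointer` is hypothetical, and the 4 bytes per parameter (float32 weights) is an assumption, since the diff does not state the checkpoint dtype.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer body ("key value" per line) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is stored in bytes
    return fields


# Pointer bodies copied from the model.safetensors diff above.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:fa97226df5323c516124e99226d2519e2d369c2b0dcabf78233e19e9e97d7409
size 1577553712
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:280d23deb47a68a4173948b4759cc2d02884df78ef198594560c95a7e48fac10
size 1711916448
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

delta_bytes = new["size"] - old["size"]
added_layers = 4 - 2  # decoder_layers: 2 -> 4 in config.json
bytes_per_layer = delta_bytes / added_layers

BYTES_PER_PARAM = 4  # assumption: float32 weights (not stated in the diff)
params_per_layer = bytes_per_layer / BYTES_PER_PARAM

print(f"size delta: {delta_bytes} bytes")
print(f"per added decoder layer: {bytes_per_layer / 1e6:.1f} MB, "
      f"about {params_per_layer / 1e6:.1f}M parameters if float32")
```

The per-layer figure is only an estimate: the file size also includes the safetensors header, so a small part of the delta is metadata rather than weights.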