Erkhembayar Gantulga committed

Commit 1899cc9
1 Parent(s): 6700b86

Updated README

Added training data information

Files changed (1):
  1. README.md +39 -2
README.md CHANGED

@@ -3,6 +3,9 @@ language:
 - mn
 base_model: openai/whisper-medium
 library_name: transformers
+datasets:
+- mozilla-foundation/common_voice_17_0
+- google/fleurs
 tags:
 - audio
 - automatic-speech-recognition
@@ -37,7 +40,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # Whisper Medium Mn - Erkhembayar Gantulga
 
-This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the Common Voice 17.0 dataset.
+This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the Common Voice 17.0 and Google Fleurs datasets.
 It achieves the following results on the evaluation set:
 - Loss: 0.1083
 - Wer: 12.9580
@@ -52,7 +55,41 @@ More information needed
 
 ## Training and evaluation data
 
-More information needed
+Datasets used for training:
+- [Common Voice 17.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0)
+- [Google Fleurs](https://huggingface.co/datasets/google/fleurs)
+
+For training, the Common Voice 17.0 and Google Fleurs datasets were combined:
+
+```python
+from datasets import load_dataset, DatasetDict, concatenate_datasets
+from datasets import Audio
+
+common_voice = DatasetDict()
+
+common_voice["train"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="train+validation+validated", use_auth_token=True)
+common_voice["test"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="test", use_auth_token=True)
+
+common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))
+
+common_voice = common_voice.remove_columns(
+    ["accent", "age", "client_id", "down_votes", "gender", "locale", "path", "segment", "up_votes", "variant"]
+)
+
+google_fleurs = DatasetDict()
+
+google_fleurs["train"] = load_dataset("google/fleurs", "mn_mn", split="train+validation", use_auth_token=True)
+google_fleurs["test"] = load_dataset("google/fleurs", "mn_mn", split="test", use_auth_token=True)
+
+google_fleurs = google_fleurs.remove_columns(
+    ["id", "num_samples", "path", "raw_transcription", "gender", "lang_id", "language", "lang_group_id"]
+)
+google_fleurs = google_fleurs.rename_column("transcription", "sentence")
+
+dataset = DatasetDict()
+dataset["train"] = concatenate_datasets([common_voice["train"], google_fleurs["train"]])
+dataset["test"] = concatenate_datasets([common_voice["test"], google_fleurs["test"]])
+```
 
 ## Training procedure
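A note on why the added snippet drops and renames columns before calling `concatenate_datasets`: concatenation requires both datasets to share an identical schema, so each side is reduced to the same two columns, `audio` and `sentence`. A minimal plain-Python sketch of that alignment, using hypothetical toy rows and dict-of-lists tables instead of the `datasets` library:

```python
def drop_columns(table, names):
    """Return a copy of a dict-of-lists table without the given columns."""
    return {k: v for k, v in table.items() if k not in names}

def rename_column(table, old, new):
    """Return a copy with column `old` renamed to `new`."""
    return {(new if k == old else k): v for k, v in table.items()}

def concatenate(a, b):
    """Row-wise concatenation; the two schemas must match exactly."""
    assert set(a) == set(b), "column mismatch"
    return {k: a[k] + b[k] for k in a}

# Hypothetical rows standing in for Common Voice / Fleurs examples.
common_voice = {
    "audio": ["cv_0.mp3", "cv_1.mp3"],
    "sentence": ["сайн байна уу", "баярлалаа"],
    "up_votes": [3, 5],               # Common Voice-only metadata -> dropped
}
google_fleurs = {
    "audio": ["fleurs_0.wav"],
    "transcription": ["монгол хэл"],  # Fleurs names this column differently
    "lang_id": [42],                  # Fleurs-only metadata -> dropped
}

common_voice = drop_columns(common_voice, {"up_votes"})
google_fleurs = rename_column(
    drop_columns(google_fleurs, {"lang_id"}), "transcription", "sentence"
)

train = concatenate(common_voice, google_fleurs)
print(sorted(train))        # → ['audio', 'sentence']
print(len(train["audio"]))  # → 3
```

The real `concatenate_datasets` enforces the same constraint on `Features`, which is why the diff's `remove_columns` and `rename_column` calls precede it.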
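The `Wer` figure reported in the diff is word error rate. As a reminder of what that number measures, here is a standard word-level Levenshtein implementation (edit distance divided by reference length); this is a generic sketch, not necessarily the exact metric code used to evaluate this model:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[j] = edit distance between ref[:i] and hyp[:j], rolled row by row.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev_diag, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev_diag, d[j] = d[j], min(
                d[j] + 1,             # deletion
                d[j - 1] + 1,         # insertion
                prev_diag + (r != h), # substitution (free if words match)
            )
    return d[-1] / len(ref)

# One substituted word out of three reference words.
print(round(wer("сайн байна уу", "сайн байн уу"), 3))  # → 0.333
```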