Update README.md
README.md
CHANGED
@@ -17,11 +17,25 @@ model-index:
|
|
17 |
metrics:
|
18 |
- name: Test WER
|
19 |
type: wer
|
20 |
-
value: 0.
|
21 |
---
|
22 |
# Wav2Vec2-Large-XLSR-53-Moroccan-Darija
|
23 |
|
24 |
-
**wav2vec2-large-xlsr-53** fine-tuned on
|
25 |
|
26 |
## Usage
|
27 |
|
@@ -58,3 +72,4 @@ print(transcription)
|
|
58 |
Output: قالت ليا هاد السيد هادا ما كاينش بحالو
|
59 |
|
60 |
email: souregh@gmail.com
|
|
|
|
17 |
metrics:
|
18 |
- name: Test WER
|
19 |
type: wer
|
20 |
+
value: 0.254919
|
21 |
---
|
22 |
# Wav2Vec2-Large-XLSR-53-Moroccan-Darija
|
23 |
|
24 |
+
**wav2vec2-large-xlsr-53** fine-tuned on 27 hours of labeled Darija audio recorded by 27 speakers.
|
25 |
+
|
26 |
+
# Old model vs new model
|
27 |
+
|
28 |
+
Old Model:
|
29 |
+
- The training data contained numerous incorrect transcriptions.
|
30 |
+
- Transcriptions were produced by multiple transcribers.
|
31 |
+
- The audio database was not organized (by gender, age, region, etc.).
|
32 |
+
- The reported WER was incorrect.
|
33 |
+
|
34 |
+
New Model:
|
35 |
+
- Transcriptions are now performed by a single individual.
|
36 |
+
- Each hour of audio is spoken by a single speaker.
|
37 |
+
- Fine-tuning is ongoing 24/7 to improve accuracy, and we add more training data every day.
|
38 |
+
- The reported WER is now correct.
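
For readers unfamiliar with the metric behind the `Test WER` value in the metadata: word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition, assuming plain whitespace tokenization and no text normalization. The function name `word_error_rate` is illustrative, and this is not necessarily the procedure used to obtain the reported value.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly,
    with no normalization of punctuation or diacritics.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy example with placeholder words: one substitution and one deletion
# against a four-word reference.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Under this definition a value such as 0.254919 corresponds to roughly one word error per four reference words, aggregated over the test set.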
|
39 |
|
40 |
## Usage
|
41 |
|
|
|
72 |
Output: قالت ليا هاد السيد هادا ما كاينش بحالو
|
73 |
|
74 |
email: souregh@gmail.com
|
75 |
+
BOUMEHDI Ahmed
|