Lingalingeswaran committed on
Commit aed56c0
1 Parent(s): 504ae80

Update README.md

Files changed (1)
  1. README.md +46 -1
README.md CHANGED
@@ -17,4 +17,49 @@ tags:
---

# Model Name
A brief description of the model and its purpose.

## Model Overview
This model is fine-tuned from `openai/whisper-small` on the [Mozilla Common Voice 17.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0) dataset for language identification and transcription in **Tamil** and **Sinhala**. It transcribes spoken audio into text and identifies whether the language is Tamil or Sinhala.

### Key Features:
- **Languages**: Tamil, Sinhala
- **Base Model**: Whisper-small from OpenAI
- **Dataset**: Mozilla Common Voice 17.0

## Intended Use
The model is designed for automatic speech recognition (ASR) in Tamil and Sinhala, making it suitable for transcription and language identification in real-time applications.
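
For quick experimentation, the standard `transformers` pipeline API should also work with this checkpoint. A minimal sketch (`your_model_name` is a placeholder for the actual Hub repository id, and decoding an audio file by path requires `ffmpeg` to be installed):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as an ASR pipeline
asr = pipeline("automatic-speech-recognition", model="your_model_name")

# Transcribe an audio file directly from its path
result = asr("path_to_audio_file")
print(result["text"])
```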

## Training Details
This model was fine-tuned on a subset of the Mozilla Common Voice 17.0 dataset; the subset contains `X` samples of Tamil and `Y` samples of Sinhala.
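
As a sketch of how such a subset can be assembled, the corresponding splits can be loaded with the `datasets` library. The `"ta"` and `"si"` config names are assumed here to be the Common Voice language codes for Tamil and Sinhala, and the gated dataset requires accepting its terms on the Hub and logging in via `huggingface-cli login`:

```python
from datasets import load_dataset

# Assumed config names: "ta" = Tamil, "si" = Sinhala.
# streaming=True avoids downloading the full dataset up front.
tamil = load_dataset(
    "mozilla-foundation/common_voice_17_0", "ta",
    split="train", streaming=True, trust_remote_code=True,
)
sinhala = load_dataset(
    "mozilla-foundation/common_voice_17_0", "si",
    split="train", streaming=True, trust_remote_code=True,
)

# Each example carries an "audio" dict and a "sentence" transcription
sample = next(iter(tamil))
print(sample["sentence"])
```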

### Fine-tuning Process:
- Fine-tuning was performed on `Whisper-small`, a smaller version of OpenAI's Whisper model, chosen for reduced latency; fine-tuning improves its accuracy on these low-resource languages.
- The model was trained for `Z` epochs in a `Google Colab Pro` environment.

## Performance
On a validation set of `X` hours of audio, the model achieved a **Word Error Rate (WER)** of `32%` on Tamil and `28%` on Sinhala. We expect further improvements with continued training.
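
For reference, WER can be computed with the `evaluate` library. This is a generic sketch, not the exact evaluation script used here; `predictions` and `references` are hypothetical lists of model outputs and ground-truth transcripts:

```python
import evaluate

# Load the standard word-error-rate metric
wer_metric = evaluate.load("wer")

# Hypothetical example: model outputs vs. reference transcripts
predictions = ["transcribed sentence one", "transcribed sentence two"]
references = ["reference sentence one", "reference sentence two"]

# WER = (substitutions + insertions + deletions) / reference words
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.2%}")
```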

## Usage
You can use this model with the following code (`your_model_name` is a placeholder for the Hub repository id):

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import librosa
import torch

model = WhisperForConditionalGeneration.from_pretrained("your_model_name")
processor = WhisperProcessor.from_pretrained("your_model_name")

# Example audio input: load the file as a 16 kHz waveform,
# the sampling rate Whisper expects
audio, sampling_rate = librosa.load("path_to_audio_file", sr=16000)

# Convert the waveform into log-mel spectrogram input features
inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")

# Whisper generates from audio features, not token ids
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

# Decode the generated token ids into text
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)
```
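
Since the model card also advertises language identification, the detected language can be read off Whisper's special tokens. A minimal sketch continuing from the snippet above, assuming the checkpoint keeps Whisper's standard token layout (`<|ta|>` for Tamil, `<|si|>` for Sinhala):

```python
# Decode again without skipping special tokens, so the sequence shows
# Whisper's prefix, e.g. "<|startoftranscript|><|ta|><|transcribe|> ..."
raw = processor.batch_decode(predicted_ids, skip_special_tokens=False)[0]

# Assumption: the language token directly follows <|startoftranscript|>
if "<|ta|>" in raw:
    print("Detected language: Tamil")
elif "<|si|>" in raw:
    print("Detected language: Sinhala")
else:
    print("Language token not found")
```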