|
--- |
|
datasets: |
|
- JairamKanna/Tamil-vulnerable-speech |
|
language: |
|
- ta |
|
metrics: |
|
- wer |
|
library_name: transformers |
|
pipeline_tag: automatic-speech-recognition |
|
--- |
|
# Model Card for Whisper-large-v2 Fine-Tuned on Tamil Vulnerable Speech
|
|
|
|
|
|
This model is a fine-tuned version of Whisper-large-v2 for automatic speech recognition of speech from vulnerable individuals in Tamil, trained on the JairamKanna/Tamil-vulnerable-speech dataset.
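Once the checkpoint is on the Hub, it can be loaded with the standard `transformers` ASR pipeline. A minimal sketch follows; the repo id is a placeholder, since the actual Hub path is not stated in this card:

```python
from transformers import pipeline

# Hypothetical repo id -- replace with the actual Hub path of this model.
MODEL_ID = "your-username/pretrainedwhisper-medium-native-v2"


def build_asr_pipeline(model_id: str = MODEL_ID):
    """Load the fine-tuned Whisper checkpoint as an ASR pipeline."""
    return pipeline("automatic-speech-recognition", model=model_id)


# Example usage (requires an audio file):
# text = build_asr_pipeline()("sample.wav")["text"]
```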
|
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./pretrainedwhisper-medium-native-v2",  # change to a repo name of your choice
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,  # increase by 2x for every 2x decrease in batch size
    learning_rate=1e-5,
    warmup_steps=200,
    max_steps=2000,
    gradient_checkpointing=True,
    fp16=True,
    evaluation_strategy="steps",
    per_device_eval_batch_size=8,
    predict_with_generate=True,
    generation_max_length=225,
    save_steps=500,
    eval_steps=500,
    logging_steps=25,
    report_to=["tensorboard"],
    load_best_model_at_end=True,
    metric_for_best_model="wer",
    greater_is_better=False,
    push_to_hub=True,
    optim="adamw_bnb_8bit",
)
```
|
|
|
|
|
#### Metrics |
|
|
|
<!-- These are the evaluation metrics being used, ideally with a description of why. --> |
|
Word Error Rate (WER) is the evaluation metric used here: the number of word substitutions, insertions, and deletions needed to turn the hypothesis into the reference transcript, divided by the number of words in the reference. Lower is better, which is why `greater_is_better=False` is set in the training arguments.
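In practice, WER is usually computed with a library such as `evaluate` (`evaluate.load("wer")`), but the metric itself is just a word-level edit distance. A minimal reference implementation, for illustration only:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting i words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting j words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, a three-word reference with one substituted word gives a WER of 1/3.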
|