SpeechT5 TTS urdu

This model is a fine-tuned version of microsoft/speecht5_tts on the common_voice_urdu1 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4796

Model description

The model was trained on Roman Urdu text: a transliteration function was used to map standard Urdu script to Roman Urdu.
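As an illustration of the idea only, here is a minimal sketch of a character-level transliteration step. The mapping table `URDU_TO_ROMAN` and the function `transliterate` are hypothetical and deliberately tiny; they are not the function used to train this model. A realistic Roman Urdu transliterator also has to handle short vowels, diacritics, and letters whose sound depends on context.

```python
# Hypothetical, incomplete mapping from Urdu letters to Latin letters.
# Code points are written as escapes to avoid look-alike Arabic variants.
URDU_TO_ROMAN = {
    "\u0627": "a",  # alef
    "\u0628": "b",  # beh
    "\u067E": "p",  # peh
    "\u062A": "t",  # teh
    "\u062F": "d",  # dal
    "\u0631": "r",  # reh
    "\u0633": "s",  # seen
    "\u06A9": "k",  # keheh
    "\u0644": "l",  # lam
    "\u0645": "m",  # meem
    "\u0646": "n",  # noon
    "\u0648": "o",  # waw
    "\u06C1": "h",  # heh goal
    "\u06CC": "i",  # farsi yeh
}


def transliterate(text: str) -> str:
    """Replace each mapped Urdu letter; leave every other character unchanged."""
    return "".join(URDU_TO_ROMAN.get(ch, ch) for ch in text)


# seen + lam + alef + meem
print(transliterate("\u0633\u0644\u0627\u0645"))  # → slam
```

Because the model was trained on Roman Urdu, any text given to it at inference time would presumably need to pass through the same kind of mapping first.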

Use

More information needed
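Until usage details are provided, the following is a hedged sketch of how a fine-tuned SpeechT5 checkpoint is typically run with the Transformers library. It makes several assumptions that the card does not confirm: that the repository `pocketmonkey/speecht5_tts_urdu` also contains the processor files, that the `microsoft/speecht5_hifigan` vocoder is suitable, and that the caller supplies a speaker embedding of shape `(1, 512)` (the card does not say which speaker embeddings were used in training). The helper name `synthesize` is hypothetical.

```python
def synthesize(roman_urdu_text, speaker_embedding,
               repo_id="pocketmonkey/speecht5_tts_urdu"):
    """Return a waveform tensor for Roman Urdu input text (sketch only)."""
    # Imported here so the function can be defined without the libraries loaded.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Assumption: the fine-tuned repo ships its own processor/tokenizer files.
    processor = SpeechT5Processor.from_pretrained(repo_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(repo_id)
    # Assumption: the stock SpeechT5 HiFi-GAN vocoder is used.
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=roman_urdu_text, return_tensors="pt")
    with torch.no_grad():
        waveform = model.generate_speech(
            inputs["input_ids"],
            speaker_embeddings=speaker_embedding,  # tensor of shape (1, 512)
            vocoder=vocoder,
        )
    return waveform  # 1-D float tensor; SpeechT5 audio is sampled at 16 kHz
```

The input should be Roman Urdu rather than Urdu script, since that is what the model was trained on.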

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 10500
  • mixed_precision_training: Native AMP
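To make two of these settings concrete, the sketch below recomputes the total batch size and the learning rate at a given step. It assumes a single training device and the usual linear warmup followed by linear decay to zero (the shape produced by the Transformers linear scheduler); the function names are illustrative and not taken from the training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-5
WARMUP_STEPS = 300
TRAINING_STEPS = 10_500
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (single device assumed by default)."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def learning_rate_at(step: int) -> float:
    """Linear warmup to the peak rate, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


print(effective_batch_size())   # → 32, matching total_train_batch_size
print(learning_rate_at(300))    # → 1e-05, the peak, reached at the end of warmup
print(learning_rate_at(10500))  # → 0.0 at the final training step
```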

Training results

Training Loss | Epoch   | Step  | Validation Loss
0.5782        | 4.3103  | 500   | 0.5071
0.5248        | 8.6207  | 1000  | 0.4863
0.5125        | 12.9310 | 1500  | 0.4746
0.5081        | 17.2414 | 2000  | 0.4727
0.4967        | 21.5517 | 2500  | 0.4683
0.4905        | 25.8621 | 3000  | 0.4645
0.4794        | 30.1724 | 3500  | 0.4668
0.4829        | 34.4828 | 4000  | 0.4647
0.477         | 38.7931 | 4500  | 0.4645
0.4637        | 43.1034 | 5000  | 0.4710
0.4743        | 47.4138 | 5500  | 0.4683
0.4595        | 51.7241 | 6000  | 0.4695
0.4735        | 56.0345 | 6500  | 0.4684
0.4613        | 60.3448 | 7000  | 0.4724
0.4678        | 64.6552 | 7500  | 0.4732
0.4538        | 68.9655 | 8000  | 0.4723
0.4536        | 73.2759 | 8500  | 0.4747
0.4587        | 77.5862 | 9000  | 0.4740
0.4536        | 81.8966 | 9500  | 0.4762
0.4606        | 86.2069 | 10000 | 0.4768
0.4528        | 90.5172 | 10500 | 0.4796
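The figures in this table can be checked with a few lines of code. The sketch below copies the validation losses from the table, finds the evaluation step with the lowest loss, and derives the number of optimizer steps per epoch from the first row. It shows that the loss reported at the top of the card (0.4796) is the final-step value, while the lowest validation loss (0.4645) occurs much earlier. Reading the later rise in validation loss as a sign of overfitting is an interpretation, not something the card states.

```python
# Validation losses copied from the table, one per evaluation (every 500 steps).
EVAL_EVERY = 500
VALIDATION_LOSS = [
    0.5071, 0.4863, 0.4746, 0.4727, 0.4683, 0.4645, 0.4668,
    0.4647, 0.4645, 0.4710, 0.4683, 0.4695, 0.4684, 0.4724,
    0.4732, 0.4723, 0.4747, 0.4740, 0.4762, 0.4768, 0.4796,
]

steps = [EVAL_EVERY * (i + 1) for i in range(len(VALIDATION_LOSS))]

# min() on (loss, step) pairs returns the earliest step among tied losses.
best_loss, best_step = min(zip(VALIDATION_LOSS, steps))
final_loss = VALIDATION_LOSS[-1]

# First row of the table: step 500 corresponds to epoch 4.3103.
steps_per_epoch = round(500 / 4.3103)

print(best_step, best_loss)  # → 3000 0.4645
print(final_loss)            # → 0.4796
print(steps_per_epoch)       # → 116
```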

Framework versions

  • Transformers 4.43.0.dev0
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 144M params (Safetensors, tensor type F32)