
ASR4-for-40-epochs

This model is a fine-tuned version of facebook/mms-1b-all on the HTV news dataset. At the final logged checkpoint (step 4300, epoch 39.45) it achieves the following results on the evaluation set:

  • Loss: 0.4791
  • WER (word error rate): 0.2684
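
The card does not say how the WER figure was computed or which text normalization was applied before scoring. As a point of reference only, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative and do not come from the HTV news data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences: one substitution (sat -> sit) and one deletion (the).
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{example_wer:.4f}")
```

Under this definition, a WER of 0.2684 corresponds to roughly 27 word errors per 100 reference words.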

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 40
  • mixed_precision_training: Native AMP
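
To make the list above easier to reuse, here is a hedged sketch in plain Python. The dictionary uses keyword names from the Transformers `TrainingArguments` class; mapping `train_batch_size` to a per-device value assumes single-device training, and mapping "Native AMP" to `fp16=True` is likewise an assumption, since the card states neither. The helper `linear_schedule_lr` illustrates what a linear schedule with 100 warmup steps does to the learning rate. The total step count is not given in the card and is estimated here from the last logged row (step 4300 at epoch 39.45, so about 109 steps per epoch).

```python
# Values copied from the hyperparameter list; keyword names follow
# transformers.TrainingArguments. The per-device mapping and fp16 are assumptions.
training_kwargs = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 32,  # assumes a single device
    "per_device_eval_batch_size": 8,    # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 40,
    "fp16": True,  # assumed meaning of "Native AMP"
}

# Estimate, not stated in the card: about 109 steps per epoch * 40 epochs.
ESTIMATED_TOTAL_STEPS = 4360


def linear_schedule_lr(step: int,
                       base_lr: float = training_kwargs["learning_rate"],
                       warmup_steps: int = training_kwargs["warmup_steps"],
                       total_steps: int = ESTIMATED_TOTAL_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 50, 100, 2230, 4360):
    print(step, linear_schedule_lr(step))
```

With these values the rate climbs from 0 to 0.001 over the first 100 steps and then decays linearly, reaching half the peak at step 2230 under the estimated total.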

Training results

Training Loss | Epoch | Step | Validation Loss | WER
5.1111 | 0.92 | 100 | 0.7687 | 0.4387
1.1201 | 1.83 | 200 | 0.6388 | 0.3767
0.9734 | 2.75 | 300 | 0.6319 | 0.3658
0.9297 | 3.67 | 400 | 0.5740 | 0.3373
0.9142 | 4.59 | 500 | 0.5591 | 0.3268
0.8462 | 5.5 | 600 | 0.5627 | 0.3227
0.8366 | 6.42 | 700 | 0.5491 | 0.3158
0.8272 | 7.34 | 800 | 0.5398 | 0.3243
0.8137 | 8.26 | 900 | 0.5363 | 0.3113
0.7643 | 9.17 | 1000 | 0.5528 | 0.3117
0.7738 | 10.09 | 1100 | 0.5194 | 0.3285
0.7622 | 11.01 | 1200 | 0.5348 | 0.3043
0.707 | 11.93 | 1300 | 0.5179 | 0.2909
0.7242 | 12.84 | 1400 | 0.5153 | 0.3138
0.7093 | 13.76 | 1500 | 0.5116 | 0.2951
0.673 | 14.68 | 1600 | 0.5002 | 0.2941
0.6877 | 15.6 | 1700 | 0.4958 | 0.3050
0.6665 | 16.51 | 1800 | 0.5032 | 0.2865
0.6507 | 17.43 | 1900 | 0.4871 | 0.2809
0.6308 | 18.35 | 2000 | 0.4953 | 0.2947
0.6507 | 19.27 | 2100 | 0.4998 | 0.2837
0.6027 | 20.18 | 2200 | 0.4963 | 0.2868
0.623 | 21.1 | 2300 | 0.4955 | 0.2953
0.6047 | 22.02 | 2400 | 0.5034 | 0.2852
0.5825 | 22.94 | 2500 | 0.4781 | 0.2795
0.585 | 23.85 | 2600 | 0.4851 | 0.2843
0.5838 | 24.77 | 2700 | 0.4957 | 0.2742
0.5718 | 25.69 | 2800 | 0.4885 | 0.2810
0.5646 | 26.61 | 2900 | 0.4778 | 0.2724
0.5476 | 27.52 | 3000 | 0.4914 | 0.2751
0.5333 | 28.44 | 3100 | 0.4879 | 0.2788
0.5533 | 29.36 | 3200 | 0.4820 | 0.2726
0.5321 | 30.28 | 3300 | 0.4816 | 0.2686
0.5161 | 31.19 | 3400 | 0.4865 | 0.2812
0.5326 | 32.11 | 3500 | 0.4818 | 0.2704
0.5188 | 33.03 | 3600 | 0.4816 | 0.2669
0.506 | 33.94 | 3700 | 0.4804 | 0.2755
0.5122 | 34.86 | 3800 | 0.4803 | 0.2667
0.506 | 35.78 | 3900 | 0.4785 | 0.2708
0.5064 | 36.7 | 4000 | 0.4755 | 0.2730
0.4997 | 37.61 | 4100 | 0.4804 | 0.2708
0.4904 | 38.53 | 4200 | 0.4772 | 0.2678
0.4774 | 39.45 | 4300 | 0.4791 | 0.2684
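
The reported results match the final row (step 4300), but that row is not the best one in the log. As a small sketch, the snippet below takes the last eight rows of the table, which also contain the lowest validation loss and lowest WER of the whole run, and picks the best checkpoint under each criterion. It also computes the relative WER reduction between the first and last evaluations. The variable names are illustrative, and the card does not say which checkpoint was actually published.

```python
# (step, validation_loss, wer) copied from the last eight rows of the table.
rows = [
    (3600, 0.4816, 0.2669),
    (3700, 0.4804, 0.2755),
    (3800, 0.4803, 0.2667),
    (3900, 0.4785, 0.2708),
    (4000, 0.4755, 0.2730),
    (4100, 0.4804, 0.2708),
    (4200, 0.4772, 0.2678),
    (4300, 0.4791, 0.2684),
]

best_by_wer = min(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])

# WER at the first evaluation (step 100) and at the final one (step 4300).
first_wer, final_wer = 0.4387, 0.2684
relative_reduction = (first_wer - final_wer) / first_wer

print("lowest WER:", best_by_wer)
print("lowest validation loss:", best_by_loss)
print(f"relative WER reduction: {relative_reduction:.1%}")
```

The lowest WER (0.2667) occurs at step 3800 and the lowest validation loss (0.4755) at step 4000, so the two criteria disagree by a small margin, and both are slightly better than the final checkpoint.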

Framework versions

  • Transformers 4.37.0.dev0
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size: 965M params (Safetensors), tensor type F32

Model repository: ducha07/way2vec2-VNmese