led-risalah_data_v17_2

This model is a fine-tuned version of silmi224/finetune-led-35000; the fine-tuning dataset is not documented. After the final epoch (epoch 20, step 400), it achieves the following results on the evaluation set:

  • Loss: 1.6183
  • Rouge1: 24.9438
  • Rouge2: 12.823
  • Rougel: 19.4874
  • Rougelsum: 23.9852
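
The card does not document how to run the model, so the following is only a minimal sketch of one way to load it for summarization. It assumes the checkpoint is an LED (Longformer Encoder-Decoder) sequence-to-sequence model, as the name and ROUGE metrics suggest, and that the repository ships a compatible tokenizer. The helper names, the 4096-token input limit, the 256-token summary limit, and the beam count are illustrative choices, not values taken from training.

```python
MODEL_ID = "silmi224/led-risalah_data_v17_2"  # repository id of this model


def first_token_global_mask(seq_len: int) -> list[int]:
    """Return a mask that puts global attention on the first token only."""
    return [1] + [0] * (seq_len - 1) if seq_len > 0 else []


def summarize(text: str, max_input_tokens: int = 4096,
              max_summary_tokens: int = 256) -> str:
    """Summarize `text`; the limits are hypothetical defaults, not card values."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without them.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       max_length=max_input_tokens)
    seq_len = inputs["input_ids"].shape[1]
    # LED combines local attention with a few globally attending tokens;
    # marking the first token as global is a common choice for summarization.
    global_attention_mask = torch.tensor([first_token_global_mask(seq_len)])

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        max_new_tokens=max_summary_tokens,
        num_beams=4,  # assumption: beam search, not confirmed by the card
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document_text)` downloads the weights on first use. Because the training data is undocumented, output quality on other domains is unknown.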

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 400
  • num_epochs: 20
  • mixed_precision_training: Native AMP
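
As a rough illustration of how these settings interact, the sketch below restates them under the matching `transformers.TrainingArguments` parameter names and computes the effective batch size and the learning rate at a given optimizer step. It assumes a single training device, that "Native AMP" corresponds to `fp16=True`, and that the run ended at step 400 as the training results report. The schedule function is a plain-Python approximation of a linear warmup followed by linear decay, not the library's own code.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments parameter names.
HPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "linear",
    "warmup_steps": 400,
    "num_train_epochs": 20,
    "fp16": True,  # assumption: "Native AMP" was enabled through fp16
}

NUM_DEVICES = 1    # assumption: one device, consistent with a total batch of 4
TOTAL_STEPS = 400  # last optimizer step reported in the training results


def effective_batch_size(hp=HPARAMS, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer update."""
    return (hp["per_device_train_batch_size"]
            * hp["gradient_accumulation_steps"]
            * num_devices)


def linear_lr(step, hp=HPARAMS, total_steps=TOTAL_STEPS):
    """Learning rate after `step` updates: linear warmup, then linear decay."""
    warmup = hp["warmup_steps"]
    if step < warmup:
        factor = step / max(1, warmup)
    else:
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return hp["learning_rate"] * factor
```

Under these assumptions `effective_batch_size()` is 4, matching total_train_batch_size. Because the 400 warmup steps equal the 400 total steps, the learning rate is still ramping up for the whole run: `linear_lr(200)` is half of the nominal 1e-05, and the nominal rate is only approached at the very end.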

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 3.0463        | 1.0   | 20   | 2.6201          | 10.5455 | 3.3808  | 8.0147  | 9.7658    |
| 2.7524        | 2.0   | 40   | 2.3724          | 13.2211 | 4.2659  | 9.4319  | 11.4823   |
| 2.4437        | 3.0   | 60   | 2.1687          | 15.9732 | 4.6109  | 10.6226 | 14.1832   |
| 2.1607        | 4.0   | 80   | 2.0550          | 17.731  | 6.3571  | 10.6519 | 16.6744   |
| 2.0465        | 5.0   | 100  | 1.9641          | 19.3209 | 6.788   | 12.3334 | 17.2773   |
| 1.8932        | 6.0   | 120  | 1.8951          | 20.2099 | 9.1781  | 14.4373 | 18.5711   |
| 1.8485        | 7.0   | 140  | 1.8391          | 17.9081 | 7.2188  | 12.0437 | 16.1709   |
| 1.7211        | 8.0   | 160  | 1.7814          | 20.2991 | 8.2239  | 13.6757 | 18.9692   |
| 1.6461        | 9.0   | 180  | 1.7475          | 25.3547 | 10.5964 | 16.5484 | 23.7821   |
| 1.6109        | 10.0  | 200  | 1.7211          | 22.2062 | 9.3952  | 15.2277 | 21.1163   |
| 1.5818        | 11.0  | 220  | 1.7049          | 22.8022 | 9.2525  | 15.8587 | 21.4785   |
| 1.5194        | 12.0  | 240  | 1.6829          | 23.9497 | 11.1116 | 16.8015 | 22.8818   |
| 1.4541        | 13.0  | 260  | 1.6700          | 23.3403 | 11.4888 | 16.9861 | 22.4228   |
| 1.3816        | 14.0  | 280  | 1.6555          | 25.8179 | 13.2041 | 17.7017 | 24.7336   |
| 1.3908        | 15.0  | 300  | 1.6451          | 25.697  | 13.4504 | 18.41   | 24.7942   |
| 1.364         | 16.0  | 320  | 1.6224          | 25.7576 | 11.9706 | 17.695  | 24.2206   |
| 1.2521        | 17.0  | 340  | 1.6094          | 24.1556 | 12.942  | 18.5932 | 23.2197   |
| 1.2384        | 18.0  | 360  | 1.6041          | 25.1035 | 12.7288 | 18.2081 | 24.4216   |
| 1.2734        | 19.0  | 380  | 1.6075          | 25.482  | 13.4025 | 19.7018 | 25.1256   |
| 1.1228        | 20.0  | 400  | 1.6183          | 24.9438 | 12.823  | 19.4874 | 23.9852   |
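
The headline results come from the final epoch, which is not the best epoch on every metric. The short sketch below copies the per-epoch evaluation values from the training results and looks up the best epoch for each metric. The `ROWS`, `COLUMNS`, and `best_epoch` names are illustrative only, and the card does not say which checkpoint was actually saved.

```python
# Column order for each row; values copied from the training results.
COLUMNS = ("epoch", "validation_loss", "rouge1", "rouge2", "rougeL", "rougeLsum")
ROWS = [
    (1, 2.6201, 10.5455, 3.3808, 8.0147, 9.7658),
    (2, 2.3724, 13.2211, 4.2659, 9.4319, 11.4823),
    (3, 2.1687, 15.9732, 4.6109, 10.6226, 14.1832),
    (4, 2.0550, 17.731, 6.3571, 10.6519, 16.6744),
    (5, 1.9641, 19.3209, 6.788, 12.3334, 17.2773),
    (6, 1.8951, 20.2099, 9.1781, 14.4373, 18.5711),
    (7, 1.8391, 17.9081, 7.2188, 12.0437, 16.1709),
    (8, 1.7814, 20.2991, 8.2239, 13.6757, 18.9692),
    (9, 1.7475, 25.3547, 10.5964, 16.5484, 23.7821),
    (10, 1.7211, 22.2062, 9.3952, 15.2277, 21.1163),
    (11, 1.7049, 22.8022, 9.2525, 15.8587, 21.4785),
    (12, 1.6829, 23.9497, 11.1116, 16.8015, 22.8818),
    (13, 1.6700, 23.3403, 11.4888, 16.9861, 22.4228),
    (14, 1.6555, 25.8179, 13.2041, 17.7017, 24.7336),
    (15, 1.6451, 25.697, 13.4504, 18.41, 24.7942),
    (16, 1.6224, 25.7576, 11.9706, 17.695, 24.2206),
    (17, 1.6094, 24.1556, 12.942, 18.5932, 23.2197),
    (18, 1.6041, 25.1035, 12.7288, 18.2081, 24.4216),
    (19, 1.6075, 25.482, 13.4025, 19.7018, 25.1256),
    (20, 1.6183, 24.9438, 12.823, 19.4874, 23.9852),
]


def best_epoch(metric, lower_is_better=False):
    """Return the epoch with the best value for `metric`."""
    idx = COLUMNS.index(metric)
    pick = min if lower_is_better else max
    return pick(ROWS, key=lambda row: row[idx])[0]
```

Validation loss is lowest at epoch 18 (1.6041). Rouge1 peaks at epoch 14, Rouge2 at epoch 15, and Rougel and Rougelsum at epoch 19. The rising validation loss after epoch 18, while training loss keeps falling, may indicate the onset of overfitting.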

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1
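
To reproduce the environment, it can help to compare the installed packages against the versions listed above. The following is a small sketch using only the standard library. It assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and `version_mismatches` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above; "torch" is the PyPI distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.1.2",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected=EXPECTED, lookup=version):
    """Map each package that differs to a (wanted, installed) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Compare the public version only, so a local build tag such as
        # "+cu121" on a PyTorch wheel does not count as a mismatch.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty result from `version_mismatches()` means the environment matches the listed versions. Other versions may work as well, but the card only reports these.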

Safetensors

  • Model size: 162M params
  • Tensor type: F32