exp2-led-risalah_data_v6

This model was trained from scratch on an unknown dataset. The results below are for the final checkpoint (epoch 30, step 600) on the evaluation set:

  • Loss: 1.7971
  • Rouge1: 35.2105
  • Rouge2: 14.2825
  • Rougel: 18.7356
  • Rougelsum: 33.8518

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 30
  • mixed_precision_training: Native AMP
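
The batch-size and scheduler settings above interact in a way that a short calculation makes concrete. The sketch below is plain Python, not the training script used for this model. It assumes a single device (consistent with total_train_batch_size being 4), takes the total of 600 optimizer steps from the last row of the training results, and assumes the usual linear-with-warmup shape: the learning rate ramps from 0 to the peak over the warmup steps, then decays linearly to 0. The function names are illustrative.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 200

# Assumption: total optimizer steps, read from the last row of the
# training results (epoch 30, step 600).
TOTAL_STEPS = 600


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step for a linear schedule with warmup."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 100, 200, 400, 600):
        print(step, linear_lr(step))
```

Under these assumptions, warmup covers the first 200 of 600 steps (the first 10 epochs at 20 steps per epoch), so the peak learning rate of 2e-05 is reached only a third of the way through training.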

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 3.6579        | 1.0   | 20   | 2.9715          | 15.0902 | 2.4602  | 8.8016  | 14.54     |
| 3.8322        | 2.0   | 40   | 2.6400          | 19.8395 | 3.3421  | 10.0041 | 18.8629   |
| 3.3928        | 3.0   | 60   | 2.4780          | 24.1438 | 4.9177  | 11.8553 | 23.0439   |
| 3.1159        | 4.0   | 80   | 2.3336          | 26.1339 | 5.4015  | 12.1066 | 24.5407   |
| 2.8469        | 5.0   | 100  | 2.2554          | 25.3388 | 5.6665  | 12.0706 | 24.1799   |
| 2.6486        | 6.0   | 120  | 2.1842          | 33.8164 | 9.2363  | 15.7673 | 31.606    |
| 2.5429        | 7.0   | 140  | 2.1322          | 32.5361 | 8.5141  | 15.3201 | 30.935    |
| 2.3159        | 8.0   | 160  | 2.0631          | 32.3657 | 9.171   | 15.1179 | 30.9634   |
| 2.1821        | 10.0  | 200  | 1.9358          | 33.1626 | 10.8072 | 16.5887 | 31.0652   |
| 2.2141        | 11.0  | 220  | 1.9274          | 36.3525 | 13.5885 | 18.4941 | 34.9263   |
| 2.1213        | 12.0  | 240  | 1.9033          | 34.4359 | 11.4335 | 17.8322 | 32.5781   |
| 1.9791        | 13.0  | 260  | 1.8914          | 37.0733 | 14.2739 | 18.9338 | 35.5985   |
| 1.9504        | 14.0  | 280  | 1.8642          | 34.7529 | 13.0325 | 18.1055 | 33.257    |
| 1.9848        | 15.0  | 300  | 1.8641          | 35.9266 | 13.4528 | 18.459  | 34.0294   |
| 1.845         | 16.0  | 320  | 1.8507          | 37.7424 | 15.2488 | 18.993  | 35.4955   |
| 1.8049        | 17.0  | 340  | 1.8390          | 36.5023 | 13.6069 | 18.4956 | 34.883    |
| 1.8158        | 18.0  | 360  | 1.8393          | 34.4722 | 13.6438 | 18.1636 | 32.4511   |
| 1.8541        | 19.0  | 380  | 1.8395          | 37.0215 | 14.3221 | 19.6743 | 35.3083   |
| 1.7967        | 20.0  | 400  | 1.8403          | 36.3048 | 13.3475 | 19.9887 | 34.6884   |
| 1.7285        | 21.0  | 420  | 1.8394          | 36.4051 | 14.3198 | 19.4997 | 34.9803   |
| 1.7303        | 22.0  | 440  | 1.8287          | 36.1003 | 14.166  | 17.8619 | 34.3505   |
| 1.6976        | 23.0  | 460  | 1.8040          | 34.3036 | 12.8173 | 18.6643 | 32.6019   |
| 1.6916        | 24.0  | 480  | 1.7963          | 34.7753 | 14.0332 | 18.923  | 33.3743   |
| 1.6872        | 25.0  | 500  | 1.8073          | 37.0718 | 14.6821 | 20.1188 | 35.7824   |
| 1.6979        | 26.0  | 520  | 1.8340          | 37.1726 | 15.1384 | 20.2153 | 36.3188   |
| 1.6867        | 27.0  | 540  | 1.8000          | 37.2831 | 14.2806 | 19.1448 | 36.1598   |
| 1.6959        | 28.0  | 560  | 1.7886          | 34.8414 | 13.5902 | 18.5803 | 33.5383   |
| 1.7546        | 29.0  | 580  | 1.8068          | 37.6551 | 16.1055 | 20.2492 | 36.1177   |
| 1.632         | 30.0  | 600  | 1.7971          | 35.2105 | 14.2825 | 18.7356 | 33.8518   |
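
The headline metrics come from the last epoch, which is not the best row for every column. As a reading aid, the sketch below parses the logged rows (copied verbatim from the table; the log has 29 rows, with no entry for epoch 9) and finds the best epoch per metric. It does not describe how checkpoints were actually selected for this model, and the names `parse_log` and `best_row` are illustrative.

```python
# Rows copied from the training results table, in column order:
# train_loss epoch step val_loss rouge1 rouge2 rougel rougelsum
LOG = """\
3.6579 1.0 20 2.9715 15.0902 2.4602 8.8016 14.54
3.8322 2.0 40 2.6400 19.8395 3.3421 10.0041 18.8629
3.3928 3.0 60 2.4780 24.1438 4.9177 11.8553 23.0439
3.1159 4.0 80 2.3336 26.1339 5.4015 12.1066 24.5407
2.8469 5.0 100 2.2554 25.3388 5.6665 12.0706 24.1799
2.6486 6.0 120 2.1842 33.8164 9.2363 15.7673 31.606
2.5429 7.0 140 2.1322 32.5361 8.5141 15.3201 30.935
2.3159 8.0 160 2.0631 32.3657 9.171 15.1179 30.9634
2.1821 10.0 200 1.9358 33.1626 10.8072 16.5887 31.0652
2.2141 11.0 220 1.9274 36.3525 13.5885 18.4941 34.9263
2.1213 12.0 240 1.9033 34.4359 11.4335 17.8322 32.5781
1.9791 13.0 260 1.8914 37.0733 14.2739 18.9338 35.5985
1.9504 14.0 280 1.8642 34.7529 13.0325 18.1055 33.257
1.9848 15.0 300 1.8641 35.9266 13.4528 18.459 34.0294
1.845 16.0 320 1.8507 37.7424 15.2488 18.993 35.4955
1.8049 17.0 340 1.8390 36.5023 13.6069 18.4956 34.883
1.8158 18.0 360 1.8393 34.4722 13.6438 18.1636 32.4511
1.8541 19.0 380 1.8395 37.0215 14.3221 19.6743 35.3083
1.7967 20.0 400 1.8403 36.3048 13.3475 19.9887 34.6884
1.7285 21.0 420 1.8394 36.4051 14.3198 19.4997 34.9803
1.7303 22.0 440 1.8287 36.1003 14.166 17.8619 34.3505
1.6976 23.0 460 1.8040 34.3036 12.8173 18.6643 32.6019
1.6916 24.0 480 1.7963 34.7753 14.0332 18.923 33.3743
1.6872 25.0 500 1.8073 37.0718 14.6821 20.1188 35.7824
1.6979 26.0 520 1.8340 37.1726 15.1384 20.2153 36.3188
1.6867 27.0 540 1.8000 37.2831 14.2806 19.1448 36.1598
1.6959 28.0 560 1.7886 34.8414 13.5902 18.5803 33.5383
1.7546 29.0 580 1.8068 37.6551 16.1055 20.2492 36.1177
1.632 30.0 600 1.7971 35.2105 14.2825 18.7356 33.8518
"""

COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougel", "rougelsum"]


def parse_log(text):
    """Turn whitespace-separated log lines into a list of dicts of floats."""
    rows = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def best_row(rows, column, higher_is_better=True):
    """Return the row with the best value in the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


rows = parse_log(LOG)

if __name__ == "__main__":
    print("lowest validation loss:", best_row(rows, "val_loss", higher_is_better=False))
    for metric in ("rouge1", "rouge2", "rougel", "rougelsum"):
        print(metric, best_row(rows, metric))
```

On these rows, validation loss is lowest at epoch 28 (1.7886), Rouge1 peaks at epoch 16 (37.7424), Rouge2 and Rougel peak at epoch 29, and Rougelsum peaks at epoch 26. The ROUGE scores fluctuate by a few points between neighboring epochs, so a single epoch's figures are best read with that spread in mind.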

Framework versions

  • Transformers 4.42.3
  • Pytorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

  • Parameters: 162M
  • Tensor type: F32
  • Weights format: Safetensors