sinhala_albert

This model is a fine-tuned version of albert-base-v2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.5337
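
The card does not state the training objective or task. As a usage illustration only, the snippet below is a minimal sketch that assumes the checkpoint keeps ALBERT's masked-language-modeling head; if the model was fine-tuned for a different task, swap in the matching AutoModelFor* class.

```python
# A minimal sketch, assuming an MLM head; the example sentence is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "theekshana/sinhala_albert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = f"some Sinhala text containing one {tokenizer.mask_token}"  # placeholder input
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Report the top prediction at each masked position.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)
top_ids = logits[mask_positions].argmax(dim=-1)
print(tokenizer.decode(top_ids))
```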

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 100
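
For reference, these settings map onto Hugging Face TrainingArguments roughly as sketched below. This is a reconstruction under the assumption that the Trainer API was used; output_dir is hypothetical, and the batch sizes are treated as per-device values.

```python
# A hedged reconstruction of the reported settings, not the original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sinhala_albert",       # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=128,   # assuming a single device
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
    adam_beta1=0.9,                    # Adam betas/epsilon as reported above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```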

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.0056 | 1.0 | 83 | 1.0130 |
| 0.9992 | 2.0 | 166 | 1.0021 |
| 0.9774 | 3.0 | 249 | 0.9789 |
| 0.9323 | 4.0 | 332 | 0.9695 |
| 0.863 | 5.0 | 415 | 0.9616 |
| 0.7944 | 6.0 | 498 | 0.9871 |
| 0.6328 | 7.0 | 581 | 1.0075 |
| 0.4705 | 8.0 | 664 | 1.4911 |
| 0.2834 | 9.0 | 747 | 1.5777 |
| 0.2278 | 10.0 | 830 | 1.6406 |
| 0.1751 | 11.0 | 913 | 1.7568 |
| 0.1657 | 12.0 | 996 | 1.7089 |
| 0.0974 | 13.0 | 1079 | 1.8463 |
| 0.1562 | 14.0 | 1162 | 1.9219 |
| 0.118 | 15.0 | 1245 | 1.9384 |
| 0.1044 | 16.0 | 1328 | 1.9971 |
| 0.0952 | 17.0 | 1411 | 2.1732 |
| 0.0877 | 18.0 | 1494 | 2.0566 |
| 0.0598 | 19.0 | 1577 | 2.4616 |
| 0.0762 | 20.0 | 1660 | 2.2672 |
| 0.1003 | 21.0 | 1743 | 2.3471 |
| 0.0627 | 22.0 | 1826 | 2.2526 |
| 0.0584 | 23.0 | 1909 | 2.7092 |
| 0.0679 | 24.0 | 1992 | 2.1629 |
| 0.0538 | 25.0 | 2075 | 2.5745 |
| 0.0723 | 26.0 | 2158 | 2.5667 |
| 0.0564 | 27.0 | 2241 | 2.4331 |
| 0.0662 | 28.0 | 2324 | 2.7811 |
| 0.0226 | 29.0 | 2407 | 2.8163 |
| 0.0224 | 30.0 | 2490 | 2.7452 |
| 0.0344 | 31.0 | 2573 | 2.6642 |
| 0.0519 | 32.0 | 2656 | 2.3490 |
| 0.0478 | 33.0 | 2739 | 2.7382 |
| 0.0436 | 34.0 | 2822 | 2.7556 |
| 0.0474 | 35.0 | 2905 | 2.5449 |
| 0.0355 | 36.0 | 2988 | 2.8280 |
| 0.0343 | 37.0 | 3071 | 2.9405 |
| 0.0283 | 38.0 | 3154 | 2.9870 |
| 0.0446 | 39.0 | 3237 | 3.0252 |
| 0.0288 | 40.0 | 3320 | 3.0866 |
| 0.0134 | 41.0 | 3403 | 3.1549 |
| 0.0328 | 42.0 | 3486 | 3.0168 |
| 0.0159 | 43.0 | 3569 | 2.8753 |
| 0.0155 | 44.0 | 3652 | 3.3455 |
| 0.0087 | 45.0 | 3735 | 3.4373 |
| 0.0296 | 46.0 | 3818 | 3.1949 |
| 0.0085 | 47.0 | 3901 | 3.1817 |
| 0.0048 | 48.0 | 3984 | 3.2233 |
| 0.0122 | 49.0 | 4067 | 3.5465 |
| 0.0024 | 50.0 | 4150 | 3.5276 |
| 0.0014 | 51.0 | 4233 | 3.5111 |
| 0.0121 | 52.0 | 4316 | 3.4483 |
| 0.0159 | 53.0 | 4399 | 3.8072 |
| 0.0027 | 54.0 | 4482 | 3.7448 |
| 0.0059 | 55.0 | 4565 | 3.9230 |
| 0.0083 | 56.0 | 4648 | 3.9245 |
| 0.0128 | 57.0 | 4731 | 3.8699 |
| 0.0116 | 58.0 | 4814 | 3.9957 |
| 0.0013 | 59.0 | 4897 | 3.8153 |
| 0.0013 | 60.0 | 4980 | 3.9093 |
| 0.0035 | 61.0 | 5063 | 4.0339 |
| 0.0028 | 62.0 | 5146 | 3.9929 |
| 0.0036 | 63.0 | 5229 | 4.1217 |
| 0.004 | 64.0 | 5312 | 4.0936 |
| 0.0014 | 65.0 | 5395 | 4.1109 |
| 0.0047 | 66.0 | 5478 | 4.1978 |
| 0.0005 | 67.0 | 5561 | 4.2320 |
| 0.0009 | 68.0 | 5644 | 4.2441 |
| 0.0027 | 69.0 | 5727 | 4.2670 |
| 0.0008 | 70.0 | 5810 | 4.2923 |
| 0.0013 | 71.0 | 5893 | 4.3101 |
| 0.0006 | 72.0 | 5976 | 4.3561 |
| 0.0024 | 73.0 | 6059 | 4.3419 |
| 0.0014 | 74.0 | 6142 | 4.3432 |
| 0.0011 | 75.0 | 6225 | 4.3302 |
| 0.0 | 76.0 | 6308 | 4.3439 |
| 0.0016 | 77.0 | 6391 | 4.3667 |
| 0.0026 | 78.0 | 6474 | 4.3803 |
| 0.0031 | 79.0 | 6557 | 4.3800 |
| 0.002 | 80.0 | 6640 | 4.3941 |
| 0.0008 | 81.0 | 6723 | 4.4071 |
| 0.0019 | 82.0 | 6806 | 4.4259 |
| 0.0013 | 83.0 | 6889 | 4.4436 |
| 0.0015 | 84.0 | 6972 | 4.4603 |
| 0.0009 | 85.0 | 7055 | 4.4706 |
| 0.0019 | 86.0 | 7138 | 4.4701 |
| 0.001 | 87.0 | 7221 | 4.4777 |
| 0.0007 | 88.0 | 7304 | 4.4905 |
| 0.0021 | 89.0 | 7387 | 4.4910 |
| 0.0012 | 90.0 | 7470 | 4.4959 |
| 0.0 | 91.0 | 7553 | 4.4990 |
| 0.0024 | 92.0 | 7636 | 4.5091 |
| 0.0031 | 93.0 | 7719 | 4.5115 |
| 0.0011 | 94.0 | 7802 | 4.5215 |
| 0.0 | 95.0 | 7885 | 4.5152 |
| 0.002 | 96.0 | 7968 | 4.5200 |
| 0.0013 | 97.0 | 8051 | 4.5293 |
| 0.0013 | 98.0 | 8134 | 4.5285 |
| 0.0023 | 99.0 | 8217 | 4.5339 |
| 0.002 | 100.0 | 8300 | 4.5337 |
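
Validation loss bottoms out at epoch 5 (0.9616) and climbs steadily afterwards while training loss approaches zero, so the reported loss of 4.5337 comes from the final, heavily overfit checkpoint. If retraining, a configuration along the lines below would keep the best checkpoint instead. This is a hedged sketch assuming the Hugging Face Trainer was used; the model and dataset names are placeholders, not artifacts from this card.

```python
# A sketch, not the original training script: keep the checkpoint with the
# lowest validation loss and stop once it stops improving.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="sinhala_albert",   # hypothetical name
    eval_strategy="epoch",         # requires transformers >= 4.41
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                   # placeholder: the fine-tuned ALBERT model
    args=args,
    train_dataset=train_dataset,   # placeholder
    eval_dataset=eval_dataset,     # placeholder
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```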

Framework versions

  • Transformers 4.41.0.dev0
  • Pytorch 2.2.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.19.1