
# mobilebert_add_pre-training-complete

This model is a fine-tuned version of [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) on the wikitext dataset (wikitext-103-raw-v1 configuration). It achieves the following results on the evaluation set:

- Loss: 2.9849
- Accuracy: 0.4607
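As a quick way to try the checkpoint, the snippet below is a minimal sketch that loads it through the `fill-mask` pipeline; it assumes the uploaded repo (`gokuls/mobilebert_add_pre-training-complete`, as named on this page) includes the masked-language-modeling head, which the reported MLM accuracy suggests.

```python
from transformers import pipeline

# Load the checkpoint as a masked-language-modeling pipeline.
fill_mask = pipeline(
    "fill-mask",
    model="gokuls/mobilebert_add_pre-training-complete",
)

# [MASK] is the mask token for uncased BERT-style tokenizers.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], prediction["score"])
```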

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an illustrative `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 300000
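For readers who want to approximate this setup, the following is a hypothetical sketch of how the listed values map onto `transformers.TrainingArguments`. The per-device batch size of 64 across 2 GPUs yields the total batch size of 128 reported above; the `output_dir` and the distributed launch command are assumptions, not part of this card.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters onto TrainingArguments.
# Launched on 2 GPUs (e.g. via torchrun), a per-device batch size of 64
# reproduces the reported total train/eval batch size of 128.
training_args = TrainingArguments(
    output_dir="mobilebert_add_pre-training-complete",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=10,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=300_000,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```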

### Training results

| Training Loss | Epoch  | Step   | Validation Loss | Accuracy |
|:-------------:|:------:|:------:|:---------------:|:--------:|
| 4.8119        | 1.0    | 1787   | 4.3700          | 0.3199   |
| 4.2649        | 2.0    | 3574   | 4.0930          | 0.3445   |
| 4.0457        | 3.0    | 5361   | 3.9375          | 0.3545   |
| 3.9099        | 4.0    | 7148   | 3.8534          | 0.3644   |
| 3.8193        | 5.0    | 8935   | 3.7993          | 0.3669   |
| 3.7517        | 6.0    | 10722  | 3.7414          | 0.3730   |
| 3.6983        | 7.0    | 12509  | 3.6737          | 0.3818   |
| 3.6565        | 8.0    | 14296  | 3.6657          | 0.3794   |
| 3.619         | 9.0    | 16083  | 3.6129          | 0.3869   |
| 3.5899        | 10.0   | 17870  | 3.5804          | 0.3910   |
| 3.5597        | 11.0   | 19657  | 3.5432          | 0.3964   |
| 3.5329        | 12.0   | 21444  | 3.5397          | 0.3958   |
| 3.5088        | 13.0   | 23231  | 3.4896          | 0.4011   |
| 3.4904        | 14.0   | 25018  | 3.4731          | 0.4000   |
| 3.4703        | 15.0   | 26805  | 3.4971          | 0.3994   |
| 3.4533        | 16.0   | 28592  | 3.4609          | 0.4049   |
| 3.4369        | 17.0   | 30379  | 3.4411          | 0.4067   |
| 3.423         | 18.0   | 32166  | 3.4219          | 0.4066   |
| 3.4084        | 19.0   | 33953  | 3.4477          | 0.4014   |
| 3.3949        | 20.0   | 35740  | 3.4013          | 0.4087   |
| 3.3811        | 21.0   | 37527  | 3.3642          | 0.4130   |
| 3.3688        | 22.0   | 39314  | 3.4173          | 0.4031   |
| 3.3598        | 23.0   | 41101  | 3.4018          | 0.4101   |
| 3.3484        | 24.0   | 42888  | 3.3499          | 0.4143   |
| 3.3363        | 25.0   | 44675  | 3.3675          | 0.4119   |
| 3.3274        | 26.0   | 46462  | 3.3562          | 0.4154   |
| 3.3161        | 27.0   | 48249  | 3.3487          | 0.4159   |
| 3.3073        | 28.0   | 50036  | 3.3293          | 0.4159   |
| 3.2991        | 29.0   | 51823  | 3.3317          | 0.4160   |
| 3.2899        | 30.0   | 53610  | 3.3058          | 0.4183   |
| 3.2814        | 31.0   | 55397  | 3.2795          | 0.4235   |
| 3.2734        | 32.0   | 57184  | 3.3185          | 0.4143   |
| 3.266         | 33.0   | 58971  | 3.2682          | 0.4268   |
| 3.2578        | 34.0   | 60758  | 3.3145          | 0.4181   |
| 3.2506        | 35.0   | 62545  | 3.2726          | 0.4230   |
| 3.2423        | 36.0   | 64332  | 3.2735          | 0.4218   |
| 3.2359        | 37.0   | 66119  | 3.2845          | 0.4175   |
| 3.2293        | 38.0   | 67906  | 3.3067          | 0.4193   |
| 3.2207        | 39.0   | 69693  | 3.2586          | 0.4257   |
| 3.2138        | 40.0   | 71480  | 3.2543          | 0.4250   |
| 3.2077        | 41.0   | 73267  | 3.2395          | 0.4226   |
| 3.202         | 42.0   | 75054  | 3.2224          | 0.4270   |
| 3.1964        | 43.0   | 76841  | 3.2562          | 0.4234   |
| 3.1925        | 44.0   | 78628  | 3.2544          | 0.4251   |
| 3.1865        | 45.0   | 80415  | 3.2043          | 0.4353   |
| 3.1812        | 46.0   | 82202  | 3.2280          | 0.4286   |
| 3.1744        | 47.0   | 83989  | 3.2174          | 0.4276   |
| 3.1699        | 48.0   | 85776  | 3.1972          | 0.4317   |
| 3.1652        | 49.0   | 87563  | 3.2016          | 0.4302   |
| 3.1609        | 50.0   | 89350  | 3.2018          | 0.4338   |
| 3.1548        | 51.0   | 91137  | 3.1950          | 0.4327   |
| 3.1508        | 52.0   | 92924  | 3.2128          | 0.4279   |
| 3.1478        | 53.0   | 94711  | 3.2027          | 0.4303   |
| 3.1423        | 54.0   | 96498  | 3.1959          | 0.4312   |
| 3.1383        | 55.0   | 98285  | 3.1911          | 0.4340   |
| 3.1336        | 56.0   | 100072 | 3.1914          | 0.4320   |
| 3.129         | 57.0   | 101859 | 3.1855          | 0.4312   |
| 3.1233        | 58.0   | 103646 | 3.1570          | 0.4337   |
| 3.1198        | 59.0   | 105433 | 3.2042          | 0.4307   |
| 3.1153        | 60.0   | 107220 | 3.1370          | 0.4390   |
| 3.1122        | 61.0   | 109007 | 3.1612          | 0.4412   |
| 3.1093        | 62.0   | 110794 | 3.1642          | 0.4348   |
| 3.1048        | 63.0   | 112581 | 3.1807          | 0.4326   |
| 3.1013        | 64.0   | 114368 | 3.1449          | 0.4359   |
| 3.0977        | 65.0   | 116155 | 3.1408          | 0.4380   |
| 3.0926        | 66.0   | 117942 | 3.1723          | 0.4365   |
| 3.0901        | 67.0   | 119729 | 3.1473          | 0.4380   |
| 3.0882        | 68.0   | 121516 | 3.1401          | 0.4378   |
| 3.0839        | 69.0   | 123303 | 3.1281          | 0.4374   |
| 3.0794        | 70.0   | 125090 | 3.1356          | 0.4367   |
| 3.0766        | 71.0   | 126877 | 3.1019          | 0.4397   |
| 3.074         | 72.0   | 128664 | 3.1626          | 0.4355   |
| 3.0702        | 73.0   | 130451 | 3.1287          | 0.4387   |
| 3.0676        | 74.0   | 132238 | 3.1366          | 0.4379   |
| 3.0648        | 75.0   | 134025 | 3.1782          | 0.4346   |
| 3.0624        | 76.0   | 135812 | 3.1229          | 0.4427   |
| 3.0575        | 77.0   | 137599 | 3.1139          | 0.4430   |
| 3.0549        | 78.0   | 139386 | 3.0948          | 0.4431   |
| 3.052         | 79.0   | 141173 | 3.1030          | 0.4452   |
| 3.0527        | 80.0   | 142960 | 3.0929          | 0.4448   |
| 3.0466        | 81.0   | 144747 | 3.0888          | 0.4428   |
| 3.0439        | 82.0   | 146534 | 3.1035          | 0.4414   |
| 3.0409        | 83.0   | 148321 | 3.1112          | 0.4411   |
| 3.041         | 84.0   | 150108 | 3.1296          | 0.4399   |
| 3.0379        | 85.0   | 151895 | 3.1224          | 0.4428   |
| 3.0332        | 86.0   | 153682 | 3.1101          | 0.4398   |
| 3.0315        | 87.0   | 155469 | 3.1045          | 0.4423   |
| 3.0302        | 88.0   | 157256 | 3.0913          | 0.4446   |
| 3.0265        | 89.0   | 159043 | 3.0745          | 0.4447   |
| 3.0243        | 90.0   | 160830 | 3.0942          | 0.4443   |
| 3.0222        | 91.0   | 162617 | 3.0821          | 0.4432   |
| 3.021         | 92.0   | 164404 | 3.0616          | 0.4473   |
| 3.0183        | 93.0   | 166191 | 3.1021          | 0.4450   |
| 3.0155        | 94.0   | 167978 | 3.1163          | 0.4422   |
| 3.0132        | 95.0   | 169765 | 3.0645          | 0.4493   |
| 3.0118        | 96.0   | 171552 | 3.0922          | 0.4420   |
| 3.0105        | 97.0   | 173339 | 3.1187          | 0.4423   |
| 3.0063        | 98.0   | 175126 | 3.1061          | 0.4462   |
| 3.0035        | 99.0   | 176913 | 3.1098          | 0.4424   |
| 3.0025        | 100.0  | 178700 | 3.0856          | 0.4454   |
| 3.0001        | 101.0  | 180487 | 3.0584          | 0.4504   |
| 2.9979        | 102.0  | 182274 | 3.0897          | 0.4435   |
| 2.9963        | 103.0  | 184061 | 3.0712          | 0.4437   |
| 2.9944        | 104.0  | 185848 | 3.0853          | 0.4458   |
| 2.9931        | 105.0  | 187635 | 3.0809          | 0.4475   |
| 2.992         | 106.0  | 189422 | 3.0910          | 0.4426   |
| 2.9886        | 107.0  | 191209 | 3.0693          | 0.4490   |
| 2.986         | 108.0  | 192996 | 3.0906          | 0.4445   |
| 2.9834        | 109.0  | 194783 | 3.0320          | 0.4538   |
| 2.9829        | 110.0  | 196570 | 3.0760          | 0.4456   |
| 2.9814        | 111.0  | 198357 | 3.0423          | 0.4504   |
| 2.9795        | 112.0  | 200144 | 3.0411          | 0.4529   |
| 2.979         | 113.0  | 201931 | 3.0784          | 0.4463   |
| 2.9781        | 114.0  | 203718 | 3.0526          | 0.4537   |
| 2.9751        | 115.0  | 205505 | 3.0479          | 0.4512   |
| 2.9749        | 116.0  | 207292 | 3.0545          | 0.4493   |
| 2.9735        | 117.0  | 209079 | 3.0529          | 0.4485   |
| 2.9705        | 118.0  | 210866 | 3.0080          | 0.4581   |
| 2.9698        | 119.0  | 212653 | 3.0271          | 0.4537   |
| 2.9674        | 120.0  | 214440 | 3.0477          | 0.4482   |
| 2.9666        | 121.0  | 216227 | 3.0328          | 0.4558   |
| 2.9664        | 122.0  | 218014 | 3.0689          | 0.4463   |
| 2.9639        | 123.0  | 219801 | 3.0749          | 0.4459   |
| 2.9633        | 124.0  | 221588 | 3.0505          | 0.4489   |
| 2.9618        | 125.0  | 223375 | 3.0256          | 0.4535   |
| 2.9589        | 126.0  | 225162 | 3.0522          | 0.4496   |
| 2.9584        | 127.0  | 226949 | 3.0451          | 0.4530   |
| 2.9589        | 128.0  | 228736 | 3.0654          | 0.4502   |
| 2.9581        | 129.0  | 230523 | 2.9989          | 0.4580   |
| 2.9554        | 130.0  | 232310 | 3.0347          | 0.4508   |
| 2.9565        | 131.0  | 234097 | 3.0586          | 0.4498   |
| 2.9548        | 132.0  | 235884 | 3.0170          | 0.4536   |
| 2.9515        | 133.0  | 237671 | 3.0470          | 0.4492   |
| 2.9499        | 134.0  | 239458 | 3.0339          | 0.4515   |
| 2.9514        | 135.0  | 241245 | 3.0474          | 0.4473   |
| 2.9486        | 136.0  | 243032 | 3.0427          | 0.4493   |
| 2.9483        | 137.0  | 244819 | 3.0336          | 0.4534   |
| 2.9491        | 138.0  | 246606 | 3.0274          | 0.4516   |
| 2.9465        | 139.0  | 248393 | 3.0354          | 0.4539   |
| 2.9447        | 140.0  | 250180 | 3.0139          | 0.4526   |
| 2.9449        | 141.0  | 251967 | 3.0163          | 0.4548   |
| 2.9439        | 142.0  | 253754 | 3.0308          | 0.4534   |
| 2.9435        | 143.0  | 255541 | 3.0242          | 0.4579   |
| 2.943         | 144.0  | 257328 | 3.0437          | 0.4513   |
| 2.943         | 145.0  | 259115 | 3.0227          | 0.4544   |
| 2.9403        | 146.0  | 260902 | 3.0464          | 0.4478   |
| 2.9407        | 147.0  | 262689 | 3.0718          | 0.4465   |
| 2.9397        | 148.0  | 264476 | 3.0519          | 0.4487   |
| 2.9392        | 149.0  | 266263 | 3.0163          | 0.4558   |
| 2.9377        | 150.0  | 268050 | 3.0159          | 0.4518   |
| 2.9386        | 151.0  | 269837 | 3.0010          | 0.4545   |
| 2.9391        | 152.0  | 271624 | 3.0346          | 0.4530   |
| 2.9364        | 153.0  | 273411 | 3.0039          | 0.4541   |
| 2.9359        | 154.0  | 275198 | 3.0417          | 0.4519   |
| 2.9359        | 155.0  | 276985 | 3.0161          | 0.4544   |
| 2.936         | 156.0  | 278772 | 3.0169          | 0.4534   |
| 2.9329        | 157.0  | 280559 | 3.0594          | 0.4478   |
| 2.9336        | 158.0  | 282346 | 3.0265          | 0.4555   |
| 2.9341        | 159.0  | 284133 | 3.0276          | 0.4542   |
| 2.933         | 160.0  | 285920 | 3.0324          | 0.4524   |
| 2.9325        | 161.0  | 287707 | 3.0249          | 0.4489   |
| 2.932         | 162.0  | 289494 | 3.0444          | 0.4519   |
| 2.9334        | 163.0  | 291281 | 3.0420          | 0.4494   |
| 2.9318        | 164.0  | 293068 | 2.9972          | 0.4541   |
| 2.9316        | 165.0  | 294855 | 2.9973          | 0.4526   |
| 2.9318        | 166.0  | 296642 | 3.0389          | 0.4529   |
| 2.9301        | 167.0  | 298429 | 3.0131          | 0.4557   |
| 2.9291        | 167.88 | 300000 | 3.0067          | 0.4548   |
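As a quick worked interpretation of these numbers, assuming the reported evaluation loss is the mean cross-entropy over masked tokens (the standard MLM objective), it corresponds to a masked-token perplexity of exp(loss):

```python
import math

# Perplexity is exp(cross-entropy loss); for the final reported eval loss:
eval_loss = 2.9849
print(f"masked-token perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 19.78
```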

### Framework versions

- Transformers 4.26.0
- PyTorch 1.14.0a0+410ce96
- Datasets 2.8.0
- Tokenizers 0.13.2