---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- wikitext
metrics:
- accuracy
model-index:
- name: mobilebert_add_pre-training-complete
  results:
  - task:
      name: Masked Language Modeling
      type: fill-mask
    dataset:
      name: wikitext
      type: wikitext
      config: wikitext-103-raw-v1
      split: validation
      args: wikitext-103-raw-v1
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.4548123706727812
---
# mobilebert_add_pre-training-complete

This model is a fine-tuned version of [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) on the wikitext dataset. It achieves the following results on the evaluation set:
- Loss: 3.0067
- Accuracy: 0.4548
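
A minimal usage sketch is shown below. The model id passed to the pipeline is an assumption based on this card's name and may need to be replaced with the actual repository id or a local checkpoint path.

```python
# Minimal usage sketch (not part of the original card). The model id below is
# an assumption based on this card's name; adjust it to the actual repo id or
# a local checkpoint path.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="mobilebert_add_pre-training-complete")

# MobileBERT uses the BERT-style [MASK] token.
print(fill_mask("The capital of France is [MASK]."))
```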
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them onto Hugging Face `TrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 300000
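
As a hedged sketch only, the hyperparameters above correspond roughly to the following `TrainingArguments`; the original training script, dataset preparation, and `Trainer` wiring are not part of this card, and the `output_dir` below is an assumption.

```python
# Hedged sketch, not the original training script: the listed hyperparameters
# mapped onto TrainingArguments. The Adam betas/epsilon above are the Trainer
# defaults, so they need no explicit arguments here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mobilebert_add_pre-training-complete",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=64,   # 2 GPUs -> total train batch size 128
    per_device_eval_batch_size=64,    # 2 GPUs -> total eval batch size 128
    seed=10,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=300_000,
)
```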
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
4.8119 | 1.0 | 1787 | 4.3700 | 0.3199 |
4.2649 | 2.0 | 3574 | 4.0930 | 0.3445 |
4.0457 | 3.0 | 5361 | 3.9375 | 0.3545 |
3.9099 | 4.0 | 7148 | 3.8534 | 0.3644 |
3.8193 | 5.0 | 8935 | 3.7993 | 0.3669 |
3.7517 | 6.0 | 10722 | 3.7414 | 0.3730 |
3.6983 | 7.0 | 12509 | 3.6737 | 0.3818 |
3.6565 | 8.0 | 14296 | 3.6657 | 0.3794 |
3.619 | 9.0 | 16083 | 3.6129 | 0.3869 |
3.5899 | 10.0 | 17870 | 3.5804 | 0.3910 |
3.5597 | 11.0 | 19657 | 3.5432 | 0.3964 |
3.5329 | 12.0 | 21444 | 3.5397 | 0.3958 |
3.5088 | 13.0 | 23231 | 3.4896 | 0.4011 |
3.4904 | 14.0 | 25018 | 3.4731 | 0.4000 |
3.4703 | 15.0 | 26805 | 3.4971 | 0.3994 |
3.4533 | 16.0 | 28592 | 3.4609 | 0.4049 |
3.4369 | 17.0 | 30379 | 3.4411 | 0.4067 |
3.423 | 18.0 | 32166 | 3.4219 | 0.4066 |
3.4084 | 19.0 | 33953 | 3.4477 | 0.4014 |
3.3949 | 20.0 | 35740 | 3.4013 | 0.4087 |
3.3811 | 21.0 | 37527 | 3.3642 | 0.4130 |
3.3688 | 22.0 | 39314 | 3.4173 | 0.4031 |
3.3598 | 23.0 | 41101 | 3.4018 | 0.4101 |
3.3484 | 24.0 | 42888 | 3.3499 | 0.4143 |
3.3363 | 25.0 | 44675 | 3.3675 | 0.4119 |
3.3274 | 26.0 | 46462 | 3.3562 | 0.4154 |
3.3161 | 27.0 | 48249 | 3.3487 | 0.4159 |
3.3073 | 28.0 | 50036 | 3.3293 | 0.4159 |
3.2991 | 29.0 | 51823 | 3.3317 | 0.4160 |
3.2899 | 30.0 | 53610 | 3.3058 | 0.4183 |
3.2814 | 31.0 | 55397 | 3.2795 | 0.4235 |
3.2734 | 32.0 | 57184 | 3.3185 | 0.4143 |
3.266 | 33.0 | 58971 | 3.2682 | 0.4268 |
3.2578 | 34.0 | 60758 | 3.3145 | 0.4181 |
3.2506 | 35.0 | 62545 | 3.2726 | 0.4230 |
3.2423 | 36.0 | 64332 | 3.2735 | 0.4218 |
3.2359 | 37.0 | 66119 | 3.2845 | 0.4175 |
3.2293 | 38.0 | 67906 | 3.3067 | 0.4193 |
3.2207 | 39.0 | 69693 | 3.2586 | 0.4257 |
3.2138 | 40.0 | 71480 | 3.2543 | 0.4250 |
3.2077 | 41.0 | 73267 | 3.2395 | 0.4226 |
3.202 | 42.0 | 75054 | 3.2224 | 0.4270 |
3.1964 | 43.0 | 76841 | 3.2562 | 0.4234 |
3.1925 | 44.0 | 78628 | 3.2544 | 0.4251 |
3.1865 | 45.0 | 80415 | 3.2043 | 0.4353 |
3.1812 | 46.0 | 82202 | 3.2280 | 0.4286 |
3.1744 | 47.0 | 83989 | 3.2174 | 0.4276 |
3.1699 | 48.0 | 85776 | 3.1972 | 0.4317 |
3.1652 | 49.0 | 87563 | 3.2016 | 0.4302 |
3.1609 | 50.0 | 89350 | 3.2018 | 0.4338 |
3.1548 | 51.0 | 91137 | 3.1950 | 0.4327 |
3.1508 | 52.0 | 92924 | 3.2128 | 0.4279 |
3.1478 | 53.0 | 94711 | 3.2027 | 0.4303 |
3.1423 | 54.0 | 96498 | 3.1959 | 0.4312 |
3.1383 | 55.0 | 98285 | 3.1911 | 0.4340 |
3.1336 | 56.0 | 100072 | 3.1914 | 0.4320 |
3.129 | 57.0 | 101859 | 3.1855 | 0.4312 |
3.1233 | 58.0 | 103646 | 3.1570 | 0.4337 |
3.1198 | 59.0 | 105433 | 3.2042 | 0.4307 |
3.1153 | 60.0 | 107220 | 3.1370 | 0.4390 |
3.1122 | 61.0 | 109007 | 3.1612 | 0.4412 |
3.1093 | 62.0 | 110794 | 3.1642 | 0.4348 |
3.1048 | 63.0 | 112581 | 3.1807 | 0.4326 |
3.1013 | 64.0 | 114368 | 3.1449 | 0.4359 |
3.0977 | 65.0 | 116155 | 3.1408 | 0.4380 |
3.0926 | 66.0 | 117942 | 3.1723 | 0.4365 |
3.0901 | 67.0 | 119729 | 3.1473 | 0.4380 |
3.0882 | 68.0 | 121516 | 3.1401 | 0.4378 |
3.0839 | 69.0 | 123303 | 3.1281 | 0.4374 |
3.0794 | 70.0 | 125090 | 3.1356 | 0.4367 |
3.0766 | 71.0 | 126877 | 3.1019 | 0.4397 |
3.074 | 72.0 | 128664 | 3.1626 | 0.4355 |
3.0702 | 73.0 | 130451 | 3.1287 | 0.4387 |
3.0676 | 74.0 | 132238 | 3.1366 | 0.4379 |
3.0648 | 75.0 | 134025 | 3.1782 | 0.4346 |
3.0624 | 76.0 | 135812 | 3.1229 | 0.4427 |
3.0575 | 77.0 | 137599 | 3.1139 | 0.4430 |
3.0549 | 78.0 | 139386 | 3.0948 | 0.4431 |
3.052 | 79.0 | 141173 | 3.1030 | 0.4452 |
3.0527 | 80.0 | 142960 | 3.0929 | 0.4448 |
3.0466 | 81.0 | 144747 | 3.0888 | 0.4428 |
3.0439 | 82.0 | 146534 | 3.1035 | 0.4414 |
3.0409 | 83.0 | 148321 | 3.1112 | 0.4411 |
3.041 | 84.0 | 150108 | 3.1296 | 0.4399 |
3.0379 | 85.0 | 151895 | 3.1224 | 0.4428 |
3.0332 | 86.0 | 153682 | 3.1101 | 0.4398 |
3.0315 | 87.0 | 155469 | 3.1045 | 0.4423 |
3.0302 | 88.0 | 157256 | 3.0913 | 0.4446 |
3.0265 | 89.0 | 159043 | 3.0745 | 0.4447 |
3.0243 | 90.0 | 160830 | 3.0942 | 0.4443 |
3.0222 | 91.0 | 162617 | 3.0821 | 0.4432 |
3.021 | 92.0 | 164404 | 3.0616 | 0.4473 |
3.0183 | 93.0 | 166191 | 3.1021 | 0.4450 |
3.0155 | 94.0 | 167978 | 3.1163 | 0.4422 |
3.0132 | 95.0 | 169765 | 3.0645 | 0.4493 |
3.0118 | 96.0 | 171552 | 3.0922 | 0.4420 |
3.0105 | 97.0 | 173339 | 3.1187 | 0.4423 |
3.0063 | 98.0 | 175126 | 3.1061 | 0.4462 |
3.0035 | 99.0 | 176913 | 3.1098 | 0.4424 |
3.0025 | 100.0 | 178700 | 3.0856 | 0.4454 |
3.0001 | 101.0 | 180487 | 3.0584 | 0.4504 |
2.9979 | 102.0 | 182274 | 3.0897 | 0.4435 |
2.9963 | 103.0 | 184061 | 3.0712 | 0.4437 |
2.9944 | 104.0 | 185848 | 3.0853 | 0.4458 |
2.9931 | 105.0 | 187635 | 3.0809 | 0.4475 |
2.992 | 106.0 | 189422 | 3.0910 | 0.4426 |
2.9886 | 107.0 | 191209 | 3.0693 | 0.4490 |
2.986 | 108.0 | 192996 | 3.0906 | 0.4445 |
2.9834 | 109.0 | 194783 | 3.0320 | 0.4538 |
2.9829 | 110.0 | 196570 | 3.0760 | 0.4456 |
2.9814 | 111.0 | 198357 | 3.0423 | 0.4504 |
2.9795 | 112.0 | 200144 | 3.0411 | 0.4529 |
2.979 | 113.0 | 201931 | 3.0784 | 0.4463 |
2.9781 | 114.0 | 203718 | 3.0526 | 0.4537 |
2.9751 | 115.0 | 205505 | 3.0479 | 0.4512 |
2.9749 | 116.0 | 207292 | 3.0545 | 0.4493 |
2.9735 | 117.0 | 209079 | 3.0529 | 0.4485 |
2.9705 | 118.0 | 210866 | 3.0080 | 0.4581 |
2.9698 | 119.0 | 212653 | 3.0271 | 0.4537 |
2.9674 | 120.0 | 214440 | 3.0477 | 0.4482 |
2.9666 | 121.0 | 216227 | 3.0328 | 0.4558 |
2.9664 | 122.0 | 218014 | 3.0689 | 0.4463 |
2.9639 | 123.0 | 219801 | 3.0749 | 0.4459 |
2.9633 | 124.0 | 221588 | 3.0505 | 0.4489 |
2.9618 | 125.0 | 223375 | 3.0256 | 0.4535 |
2.9589 | 126.0 | 225162 | 3.0522 | 0.4496 |
2.9584 | 127.0 | 226949 | 3.0451 | 0.4530 |
2.9589 | 128.0 | 228736 | 3.0654 | 0.4502 |
2.9581 | 129.0 | 230523 | 2.9989 | 0.4580 |
2.9554 | 130.0 | 232310 | 3.0347 | 0.4508 |
2.9565 | 131.0 | 234097 | 3.0586 | 0.4498 |
2.9548 | 132.0 | 235884 | 3.0170 | 0.4536 |
2.9515 | 133.0 | 237671 | 3.0470 | 0.4492 |
2.9499 | 134.0 | 239458 | 3.0339 | 0.4515 |
2.9514 | 135.0 | 241245 | 3.0474 | 0.4473 |
2.9486 | 136.0 | 243032 | 3.0427 | 0.4493 |
2.9483 | 137.0 | 244819 | 3.0336 | 0.4534 |
2.9491 | 138.0 | 246606 | 3.0274 | 0.4516 |
2.9465 | 139.0 | 248393 | 3.0354 | 0.4539 |
2.9447 | 140.0 | 250180 | 3.0139 | 0.4526 |
2.9449 | 141.0 | 251967 | 3.0163 | 0.4548 |
2.9439 | 142.0 | 253754 | 3.0308 | 0.4534 |
2.9435 | 143.0 | 255541 | 3.0242 | 0.4579 |
2.943 | 144.0 | 257328 | 3.0437 | 0.4513 |
2.943 | 145.0 | 259115 | 3.0227 | 0.4544 |
2.9403 | 146.0 | 260902 | 3.0464 | 0.4478 |
2.9407 | 147.0 | 262689 | 3.0718 | 0.4465 |
2.9397 | 148.0 | 264476 | 3.0519 | 0.4487 |
2.9392 | 149.0 | 266263 | 3.0163 | 0.4558 |
2.9377 | 150.0 | 268050 | 3.0159 | 0.4518 |
2.9386 | 151.0 | 269837 | 3.0010 | 0.4545 |
2.9391 | 152.0 | 271624 | 3.0346 | 0.4530 |
2.9364 | 153.0 | 273411 | 3.0039 | 0.4541 |
2.9359 | 154.0 | 275198 | 3.0417 | 0.4519 |
2.9359 | 155.0 | 276985 | 3.0161 | 0.4544 |
2.936 | 156.0 | 278772 | 3.0169 | 0.4534 |
2.9329 | 157.0 | 280559 | 3.0594 | 0.4478 |
2.9336 | 158.0 | 282346 | 3.0265 | 0.4555 |
2.9341 | 159.0 | 284133 | 3.0276 | 0.4542 |
2.933 | 160.0 | 285920 | 3.0324 | 0.4524 |
2.9325 | 161.0 | 287707 | 3.0249 | 0.4489 |
2.932 | 162.0 | 289494 | 3.0444 | 0.4519 |
2.9334 | 163.0 | 291281 | 3.0420 | 0.4494 |
2.9318 | 164.0 | 293068 | 2.9972 | 0.4541 |
2.9316 | 165.0 | 294855 | 2.9973 | 0.4526 |
2.9318 | 166.0 | 296642 | 3.0389 | 0.4529 |
2.9301 | 167.0 | 298429 | 3.0131 | 0.4557 |
2.9291 | 167.88 | 300000 | 3.0067 | 0.4548 |
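
The Accuracy column above is most naturally read as top-1 accuracy on the masked tokens. Below is a hedged sketch of computing such a metric, not necessarily the exact evaluation code used for this card; the checkpoint path and the toy input text are assumptions, and the reported numbers come from the full wikitext validation split.

```python
# Hedged sketch of masked-language-modeling accuracy: the fraction of masked
# positions where the model's top-1 prediction equals the original token.
# The checkpoint path and toy text are assumptions; with real data you would
# average this over the whole validation split.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("google/mobilebert-uncased")
model = AutoModelForMaskedLM.from_pretrained("mobilebert_add_pre-training-complete")  # assumed path
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

texts = ["The quick brown fox jumps over the lazy dog."]  # stand-in for validation data
batch = collator([tokenizer(t, truncation=True, max_length=128) for t in texts])

with torch.no_grad():
    logits = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"]).logits

mask = batch["labels"] != -100                      # only masked positions are scored
preds = logits.argmax(dim=-1)
print("masked-token accuracy:", (preds[mask] == batch["labels"][mask]).float().mean().item())
```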
### Framework versions
- Transformers 4.26.0
- Pytorch 1.14.0a0+410ce96
- Datasets 2.8.0
- Tokenizers 0.13.2