
distilbert-base-uncased-distilled-squad_qa_model

This model is a fine-tuned version of distilbert-base-uncased-distilled-squad on the subjqa dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9380

Model description

More information needed

Intended uses & limitations

More information needed
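
Although intended uses are not documented, this is an extractive question-answering model: it takes a question plus a context passage and returns a span from the context. A minimal usage sketch, assuming the model is published on the Hub under the repo id Chetna19/distilbert-base-uncased-distilled-squad_qa_model (the id this card is published under); the question and review text are made up for illustration:

```python
from transformers import pipeline

# The repo id is the one this card is published under; inputs are hypothetical.
qa = pipeline(
    "question-answering",
    model="Chetna19/distilbert-base-uncased-distilled-squad_qa_model",
)

result = qa(
    question="How is the battery life?",
    context=(
        "I have used this laptop for a month. The battery life is great, "
        "easily lasting a full workday on a single charge."
    ),
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```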

Training and evaluation data

More information needed
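
Details are not documented, but the summary above states the model was trained on subjqa. A minimal loading sketch, assuming the Hub version of subjqa, which is organized into per-domain configs (the "books" config here is just an illustrative choice) with SQuAD-style question/context/answers fields:

```python
from datasets import load_dataset

# subjqa requires a domain config; "books" is an assumed example choice.
ds = load_dataset("subjqa", "books")
print(ds)              # DatasetDict with train/validation/test splits
print(ds["train"][0])  # one SQuAD-style example: question, context, answers, ...
```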

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 1e-07
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
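
A minimal Trainer sketch wiring these hyperparameters together; it is an assumed reconstruction, not the author's script. `train_ds` and `eval_ds` are placeholders for tokenized SQuAD-style splits, since the preprocessing is not documented here:

```python
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "distilbert-base-uncased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForQuestionAnswering.from_pretrained(base)

# Hyperparameters from the list above; the Adam betas/epsilon listed there
# are the Trainer defaults, so they need no explicit setting.
args = TrainingArguments(
    output_dir="distilbert-base-uncased-distilled-squad_qa_model",
    learning_rate=1e-7,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # matches the per-epoch validation losses below
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # placeholder: tokenized subjqa train split
    eval_dataset=eval_ds,    # placeholder: tokenized subjqa validation split
    tokenizer=tokenizer,
)
trainer.train()
```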

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 4.1556 | 1.0 | 32 | 4.1242 |
| 4.0411 | 2.0 | 64 | 4.0582 |
| 3.9828 | 3.0 | 96 | 3.9948 |
| 3.9068 | 4.0 | 128 | 3.9378 |
| 3.8152 | 5.0 | 160 | 3.8835 |
| 3.7906 | 6.0 | 192 | 3.8329 |
| 3.7543 | 7.0 | 224 | 3.7842 |
| 3.7173 | 8.0 | 256 | 3.7377 |
| 3.6717 | 9.0 | 288 | 3.6958 |
| 3.6219 | 10.0 | 320 | 3.6559 |
| 3.587 | 11.0 | 352 | 3.6185 |
| 3.6111 | 12.0 | 384 | 3.5808 |
| 3.5374 | 13.0 | 416 | 3.5483 |
| 3.4506 | 14.0 | 448 | 3.5175 |
| 3.4286 | 15.0 | 480 | 3.4873 |
| 3.4021 | 16.0 | 512 | 3.4596 |
| 3.432 | 17.0 | 544 | 3.4328 |
| 3.3235 | 18.0 | 576 | 3.4079 |
| 3.3627 | 19.0 | 608 | 3.3841 |
| 3.323 | 20.0 | 640 | 3.3615 |
| 3.3127 | 21.0 | 672 | 3.3389 |
| 3.2635 | 22.0 | 704 | 3.3199 |
| 3.2542 | 23.0 | 736 | 3.3013 |
| 3.2302 | 24.0 | 768 | 3.2846 |
| 3.1699 | 25.0 | 800 | 3.2676 |
| 3.2333 | 26.0 | 832 | 3.2516 |
| 3.2204 | 27.0 | 864 | 3.2364 |
| 3.1809 | 28.0 | 896 | 3.2218 |
| 3.1739 | 29.0 | 928 | 3.2082 |
| 3.1966 | 30.0 | 960 | 3.1950 |
| 3.1513 | 31.0 | 992 | 3.1826 |
| 3.135 | 32.0 | 1024 | 3.1713 |
| 3.1253 | 33.0 | 1056 | 3.1599 |
| 3.0768 | 34.0 | 1088 | 3.1498 |
| 3.1031 | 35.0 | 1120 | 3.1394 |
| 3.064 | 36.0 | 1152 | 3.1293 |
| 3.0391 | 37.0 | 1184 | 3.1200 |
| 3.0701 | 38.0 | 1216 | 3.1117 |
| 3.0787 | 39.0 | 1248 | 3.1032 |
| 3.0423 | 40.0 | 1280 | 3.0956 |
| 3.0214 | 41.0 | 1312 | 3.0875 |
| 3.0289 | 42.0 | 1344 | 3.0804 |
| 2.9667 | 43.0 | 1376 | 3.0736 |
| 3.0341 | 44.0 | 1408 | 3.0671 |
| 3.0098 | 45.0 | 1440 | 3.0606 |
| 3.0202 | 46.0 | 1472 | 3.0544 |
| 2.9598 | 47.0 | 1504 | 3.0490 |
| 2.9734 | 48.0 | 1536 | 3.0430 |
| 2.9381 | 49.0 | 1568 | 3.0375 |
| 2.9444 | 50.0 | 1600 | 3.0328 |
| 2.9357 | 51.0 | 1632 | 3.0280 |
| 2.9453 | 52.0 | 1664 | 3.0237 |
| 2.9906 | 53.0 | 1696 | 3.0191 |
| 2.934 | 54.0 | 1728 | 3.0148 |
| 2.9076 | 55.0 | 1760 | 3.0110 |
| 2.9874 | 56.0 | 1792 | 3.0070 |
| 2.9682 | 57.0 | 1824 | 3.0032 |
| 2.9287 | 58.0 | 1856 | 2.9994 |
| 2.9575 | 59.0 | 1888 | 2.9956 |
| 2.8618 | 60.0 | 1920 | 2.9926 |
| 2.9614 | 61.0 | 1952 | 2.9893 |
| 2.9463 | 62.0 | 1984 | 2.9861 |
| 2.8927 | 63.0 | 2016 | 2.9834 |
| 2.9048 | 64.0 | 2048 | 2.9805 |
| 2.9161 | 65.0 | 2080 | 2.9777 |
| 2.9117 | 66.0 | 2112 | 2.9753 |
| 2.932 | 67.0 | 2144 | 2.9729 |
| 2.9148 | 68.0 | 2176 | 2.9706 |
| 2.8919 | 69.0 | 2208 | 2.9683 |
| 2.9278 | 70.0 | 2240 | 2.9662 |
| 2.869 | 71.0 | 2272 | 2.9643 |
| 2.8844 | 72.0 | 2304 | 2.9622 |
| 2.8636 | 73.0 | 2336 | 2.9603 |
| 2.8734 | 74.0 | 2368 | 2.9585 |
| 2.8934 | 75.0 | 2400 | 2.9569 |
| 2.86 | 76.0 | 2432 | 2.9551 |
| 2.8366 | 77.0 | 2464 | 2.9539 |
| 2.8887 | 78.0 | 2496 | 2.9522 |
| 2.8632 | 79.0 | 2528 | 2.9511 |
| 2.8691 | 80.0 | 2560 | 2.9496 |
| 2.8597 | 81.0 | 2592 | 2.9484 |
| 2.8775 | 82.0 | 2624 | 2.9473 |
| 2.8491 | 83.0 | 2656 | 2.9461 |
| 2.8639 | 84.0 | 2688 | 2.9450 |
| 2.8659 | 85.0 | 2720 | 2.9443 |
| 2.8557 | 86.0 | 2752 | 2.9433 |
| 2.8188 | 87.0 | 2784 | 2.9423 |
| 2.8896 | 88.0 | 2816 | 2.9416 |
| 2.8102 | 89.0 | 2848 | 2.9409 |
| 2.8452 | 90.0 | 2880 | 2.9403 |
| 2.8437 | 91.0 | 2912 | 2.9399 |
| 2.8193 | 92.0 | 2944 | 2.9397 |
| 2.8645 | 93.0 | 2976 | 2.9391 |
| 2.8745 | 94.0 | 3008 | 2.9388 |
| 2.8568 | 95.0 | 3040 | 2.9385 |
| 2.8832 | 96.0 | 3072 | 2.9382 |
| 2.8801 | 97.0 | 3104 | 2.9382 |
| 2.8488 | 98.0 | 3136 | 2.9383 |
| 2.8233 | 99.0 | 3168 | 2.9380 |
| 2.8505 | 100.0 | 3200 | 2.9380 |
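
The step column implies 32 optimization steps per epoch; at a train batch size of 16, and assuming no gradient accumulation (none is listed above), that corresponds to roughly 512 training examples. The validation loss plateaus at about 2.938 over the final epochs, matching the headline evaluation loss at the top of this card.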

Framework versions

  • Transformers 4.28.0
  • PyTorch 1.13.0a0+d321be6
  • Datasets 2.12.0
  • Tokenizers 0.13.3