roberta-large-squad_epoch_3

Model description

This is a version of RoBERTa-large fine-tuned for extractive question answering on the SQuAD dataset.

Training procedure

The model was trained with the following hyperparameters:

  • Learning Rate: 5e-05
  • Batch Size: 8
  • Epochs: 3
  • Weight Decay: 0.01
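
As a rough sketch, the hyperparameters above could be written as keyword arguments for the Hugging Face `TrainingArguments` class. The card does not say how the model was actually trained, so the mapping of "Batch Size" to `per_device_train_batch_size` and the `output_dir` value in the comment are assumptions.

```python
# Hyperparameters from this card, keyed by the corresponding
# transformers.TrainingArguments parameter names.
# Assumption: "Batch Size: 8" is the per-device batch size.
training_config = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "num_train_epochs": 3,
    "weight_decay": 0.01,
}

# Possible use, if transformers is installed (output_dir is a hypothetical path):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="qa-finetune", **training_config)

for name, value in training_config.items():
    print(f"{name} = {value}")
```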

Intended uses & limitations

This model is intended to be used for question answering tasks, particularly on SQuAD-like datasets. It performs best on factual questions where the answer can be found as a span of text within the given context.
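
To illustrate what "a span of text within the given context" means in practice, here is a minimal sketch of how an extractive QA head's outputs are commonly decoded: the model scores each token as a possible answer start and end, and the highest-scoring valid pair is returned. The function name, the toy logits, and the `max_answer_len` default are illustrative assumptions, not values taken from this model.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices maximizing start + end score.

    Only spans with start <= end and at most max_answer_len tokens are considered.
    """
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("logit lists must be non-empty and of equal length")
    best_score, best_start, best_end = None, 0, 0
    for s, s_score in enumerate(start_logits):
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_score + end_logits[e]
            if best_score is None or score > best_score:
                best_score, best_start, best_end = score, s, e
    return best_start, best_end


# Toy example with made-up logits (not produced by the real model).
tokens = ["The", "Eiffel", "Tower", "is", "in", "Paris", "."]
start_logits = [0.1, 0.2, 0.0, -1.0, 0.3, 4.0, -2.0]
end_logits = [-1.0, 0.0, 0.5, -1.0, 0.0, 3.5, 0.2]

start, end = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

Because the answer is always assembled from context tokens, the model cannot produce text that does not appear in the passage.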

Training Details

Training Data

The model was trained on the SQuAD dataset, which consists of questions posed by crowdworkers on a set of Wikipedia articles.
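
For orientation, the sketch below shows the flattened record layout used by the Hugging Face `datasets` version of SQuAD, where each answer is stored as text plus a character offset into the context. The record itself is a hypothetical illustration, not an entry from the dataset, and the card does not state which SQuAD version was used.

```python
def answers_are_spans(record):
    """Check that every answer text appears in the context at its stated offset."""
    context = record["context"]
    texts = record["answers"]["text"]
    starts = record["answers"]["answer_start"]
    return all(
        context[start:start + len(text)] == text
        for text, start in zip(texts, starts)
    )


# Hypothetical record in the SQuAD field layout.
context = "The Eiffel Tower is a wrought-iron lattice tower in Paris, France."
answer_text = "Paris"
record = {
    "id": "example-0001",  # made-up identifier
    "title": "Eiffel_Tower",
    "context": context,
    "question": "In which city is the Eiffel Tower located?",
    "answers": {
        "text": [answer_text],
        # Character offset computed from the context rather than hard-coded.
        "answer_start": [context.index(answer_text)],
    },
}

print(answers_are_spans(record))
```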

Uses

This model can be used for:

  • Extracting answers from text passages given questions
  • Question answering tasks
  • Reading comprehension tasks
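
A minimal usage sketch for the tasks listed above, assuming the repository loads with the `transformers` question-answering pipeline (the hosting page could not detect the model's library, so this is untested here). The helper names `load_qa_pipeline` and `extract_answer` are hypothetical.

```python
MODEL_ID = "clementlemon02/roberta-large-squad_epoch_3"


def load_qa_pipeline(model_id=MODEL_ID):
    """Build a question-answering pipeline (downloads the model on first use)."""
    # Imported lazily so extract_answer can be used without transformers installed.
    from transformers import pipeline
    return pipeline("question-answering", model=model_id)


def extract_answer(qa, question, context):
    """Run a QA callable and return (answer_text, score).

    `qa` is expected to behave like a transformers question-answering
    pipeline, returning a dict with at least "answer" and "score".
    """
    result = qa(question=question, context=context)
    return result["answer"], result["score"]


# Possible use (requires transformers, PyTorch, and network access):
#   qa = load_qa_pipeline()
#   text, score = extract_answer(
#       qa,
#       question="Where is the Eiffel Tower?",
#       context="The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
#   )
```

The returned score can be used to filter low-confidence answers, which may help on out-of-domain text.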

Limitations

  • The model can only extract answers that are directly present in the given context
  • Performance may vary on out-of-domain texts
  • The model may struggle with complex reasoning questions

Additional Information

  • Model type: RoBERTa-large
  • Language: English
  • License: MIT
  • Framework: PyTorch
  • Model size: 354M params (Safetensors)
  • Tensor type: F32

Dataset used to train clementlemon02/roberta-large-squad_epoch_3