roberta-large-squad_epoch_3

Model description

This is a fine-tuned version of DistilBERT for extractive question answering. The model was trained on the SQuAD dataset.

Training procedure

The model was trained with the following hyperparameters:

Learning Rate: 5e-05
Batch Size: 8
Epochs: 3
Weight Decay: 0.01
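
As a rough sketch, the hyperparameters above could be written as keyword arguments for the `TrainingArguments` class in Hugging Face `transformers`. The card does not say which training script was used, so mapping "Batch Size" to `per_device_train_batch_size` is an assumption, and the `output_dir` value is a hypothetical placeholder.

```python
# Hyperparameters from this card, keyed by the TrainingArguments
# parameter names they most plausibly correspond to (an assumption).
HYPERPARAMETERS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumes "Batch Size" is per device
    "num_train_epochs": 3,
    "weight_decay": 0.01,
}


def build_training_arguments(output_dir="qa-finetune-output"):
    """Build a TrainingArguments object; output_dir is a hypothetical path."""
    # Imported lazily so the dictionary above is usable without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Any setting not listed on this card (warmup, scheduler, maximum sequence length) would fall back to library defaults in this sketch, which may differ from the original run.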

Intended uses & limitations

This model is intended to be used for question answering tasks, particularly on SQuAD-like datasets. It performs best on factual questions where the answer can be found as a span of text within the given context.

Training Details

Training Data

The model was trained on the SQuAD dataset, which consists of questions posed by crowdworkers on a set of Wikipedia articles.
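
For orientation, each SQuAD example pairs a question with a context paragraph and one or more answer spans, given as text plus a character offset into the context. The sketch below uses a made-up record (not taken from the dataset) laid out like the `squad` dataset on the Hugging Face Hub, and checks that every offset points at its answer text. The helper name `answers_are_aligned` is hypothetical.

```python
# Illustrative record only: the field layout follows the Hub "squad" dataset,
# but the content is invented for this example.
example = {
    "id": "example-0001",
    "title": "Example_Article",
    "context": "The Eiffel Tower was completed in 1889 in Paris.",
    "question": "When was the Eiffel Tower completed?",
    "answers": {"text": ["1889"], "answer_start": [34]},
}


def answers_are_aligned(record):
    """Check that each answer_start offset locates its answer in the context."""
    context = record["context"]
    answers = record["answers"]
    return all(
        context[start:start + len(text)] == text
        for text, start in zip(answers["text"], answers["answer_start"])
    )
```

Because answers are stored as spans of the context, a model trained on this data learns to predict start and end positions rather than to generate free text.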

Uses

This model can be used for:

Extracting answers from text passages given questions
Question answering tasks
Reading comprehension tasks
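
A minimal usage sketch for the tasks listed above, assuming the model is published on the Hugging Face Hub under the repository id shown on this page and loads with the standard `transformers` question-answering pipeline. The function name `answer_question` is hypothetical.

```python
# Repository id as shown on the model page (treated here as an assumption).
MODEL_ID = "clementlemon02/roberta-large-squad_epoch_3"


def answer_question(question, context, model_id=MODEL_ID):
    """Return the pipeline's prediction for one question/context pair."""
    # Lazy import: requires the transformers package and downloads the
    # model weights on first use.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id)
    return qa(question=question, context=context)
```

For a single question, the pipeline returns a dictionary with `answer`, `score`, `start`, and `end` keys. When answering many questions, creating the pipeline once and reusing it avoids reloading the model on every call.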

Limitations

The model can only extract answers that are directly present in the given context
Performance may vary on out-of-domain texts
The model may struggle with complex reasoning questions
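
Given these limitations, callers may want to discard low-confidence predictions instead of trusting every extracted span. The sketch below assumes a prediction dictionary with `answer` and `score` keys, as returned by the `transformers` question-answering pipeline. The helper name `accept_prediction` and the 0.5 threshold are illustrative choices, not values from this card.

```python
def accept_prediction(prediction, context, min_score=0.5):
    """Return the answer text if it looks usable, otherwise None.

    `prediction` is assumed to be a dict with "answer" and "score" keys.
    `min_score` is an arbitrary illustrative threshold.
    """
    answer = prediction["answer"]
    # An extractive answer should appear verbatim in the context.
    if not answer or answer not in context:
        return None
    # Low scores often indicate that the context does not contain the answer.
    if prediction["score"] < min_score:
        return None
    return answer
```

A suitable threshold depends on the data, so it is best tuned on a held-out set from the target domain.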

Additional Information

Model type: DistilBERT
Language: English
License: MIT
Framework: PyTorch

Repository: clementlemon02/roberta-large-squad_epoch_3