t5_base_question_generation
This model is a fine-tuned version of t5-base on the SQuAD question answering dataset, adapted for question generation.
Model description
More information needed
Intended uses
The model takes a context passage as the input sequence and generates a full question sentence as the output sequence. The maximum sequence length is 512 tokens. Inputs should be organised in the following format: <generate_questions> paragraph: context text here
The input sequence can then be encoded and passed as the input_ids argument in the model's generate() method.
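A minimal inference sketch is shown below. The checkpoint identifier is a placeholder for this repository's ID, and the example context and generation settings are illustrative assumptions, not part of the model card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder: substitute the actual Hub repository ID for this model.
model_name = "t5_base_question_generation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

context = "The Eiffel Tower was completed in 1889 and is located in Paris."
input_text = f"<generate_questions> paragraph: {context}"

# Encode the input (max sequence length 512 tokens) and generate a question.
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=64,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```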
Limitations
The model was trained on a limited amount of data, so generated questions may be of poor quality. In addition, the generated questions tend to follow the style of the training data.
Training and evaluation data
The model takes a passage as input and generates questions that are answerable from that passage. The training set comprises 80k passage-question pairs sampled randomly from the SQuAD training split. For evaluation, 10k passage-question pairs were sampled from the SQuAD development split.
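The exact data preparation script is not provided; the following is a sketch of how such splits could be drawn with the datasets library (the seed, field names, and task prefix handling here are assumptions):

```python
from datasets import load_dataset

squad = load_dataset("squad")

# Randomly sample 80k training pairs and 10k evaluation pairs (assumed seed).
train_pairs = squad["train"].shuffle(seed=42).select(range(80_000))
eval_pairs = squad["validation"].shuffle(seed=42).select(range(10_000))

def to_pair(example):
    # Input: passage with the task prefix; target: the reference question.
    return {
        "input_text": f"<generate_questions> paragraph: {example['context']}",
        "target_text": example["question"],
    }

train_pairs = train_pairs.map(to_pair)
eval_pairs = eval_pairs.map(to_pair)
```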
Training procedure
The model was trained for 5 epochs over the training set with a learning rate of 5e-05 and early stopping. The batch size was limited to 10 due to GPU memory constraints.
Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.21
- num_epochs: 5
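As a rough sketch, these hyperparameters could be expressed as Seq2SeqTrainingArguments; the original training script is not provided, and the output directory, evaluation/save strategies, and early-stopping wiring below are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_base_question_generation",  # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.21,
    num_train_epochs=5,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults.
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,  # typically paired with EarlyStoppingCallback
)
```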
Framework versions
- Transformers 4.23.1
- Pytorch 1.13.0
- Datasets 2.6.1
- Tokenizers 0.13.1