How can I increase the length of the response?
I've tried setting min_length and max_length, but they don't appear to have any effect:
import torch
from transformers import pipeline

# Use the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the extractive QA model and its tokenizer
qa_pipeline = pipeline(task="question-answering",
                       model="deepset/roberta-base-squad2",
                       tokenizer="deepset/roberta-base-squad2",
                       device=device)
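# NOTE: placeholder stand-ins so the snippet is runnable end to end; my real
# document is a much longer maintenance procedure (these values are made up)
system_prompt = "Answer the question using only the procedure below."
document = ("1. Shut off power at the disconnect. "
            "2. Make sure the blower rotation is correct. "
            "3. ...")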
query = "What actions do we need to take?"

QA_Input = {
    'question': query,
    'context': system_prompt + "\n\n" + document
}
response = qa_pipeline(QA_Input,
                       min_length=512,
                       max_length=4096)
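Is max_answer_len the parameter I should be using instead? From the QuestionAnsweringPipeline docs, the length-related arguments this pipeline actually accepts appear to be max_answer_len (the cap on the extracted span, default 15 tokens) and top_k (how many spans to return), rather than min_length/max_length. A sketch of that variant (the values are arbitrary guesses, not tested settings):

# Sketch: max_answer_len and top_k are documented QuestionAnsweringPipeline
# arguments; the values below are arbitrary, not known-good settings
response = qa_pipeline(QA_Input,
                       max_answer_len=100,  # cap on the extracted span (default is 15)
                       top_k=5)             # return the 5 highest-scoring spans
print(response)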
The longest answer I've gotten back so far:
{'score': 0.03653841093182564, 'start': 2187, 'end': 2227, 'answer': 'make sure the blower rotation is correct'}
For reference, the context contains a series of actions to follow, so I'd expect the answer to cover more than a single short phrase.