Repetitive generation without additional EOS token
Hi! With the supplied generation_config, the model generates indefinitely in a chat setting and repeats itself, because <|end_of_text|> is rarely generated. It should work better if <|eot_id|>, which is generated at the end of every chat response, is added as an additional EOS token.
Here's the config that Meta supplies to cover both cases:
{
"bos_token_id": 128000,
"eos_token_id": [128001, 128009],
"do_sample": true,
"temperature": 0.6,
"max_length": 4096,
"top_p": 0.9,
"transformers_version": "4.40.0.dev0"
}
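For anyone patching a local copy, here is a minimal sketch of merging the extra stop ID into a config like the one above. The helper name `with_extra_eos` is made up for illustration, and it works on a plain dict rather than any library object; the ID-to-token mapping in the comments (128001 for <|end_of_text|>, 128009 for <|eot_id|>) is taken from this thread.

```python
import json


def with_extra_eos(config, extra_ids):
    """Return a copy of a generation config dict whose eos_token_id also lists extra_ids.

    Hypothetical helper for illustration; not part of any library.
    """
    current = config.get("eos_token_id")
    # eos_token_id may be missing, a single int, or a list of ints; normalize to a list.
    if current is None:
        current = []
    elif isinstance(current, int):
        current = [current]
    merged = list(current)
    for token_id in extra_ids:
        if token_id not in merged:
            merged.append(token_id)
    updated = dict(config)  # shallow copy so the input dict is left untouched
    updated["eos_token_id"] = merged
    return updated


# A config that only stops on <|end_of_text|> (128001), as described in this thread.
original = {"bos_token_id": 128000, "eos_token_id": 128001}
patched = with_extra_eos(original, [128009])  # 128009 is <|eot_id|>
print(json.dumps(patched))
```

The same list can, as far as I know, also be passed per call as the `eos_token_id` argument of `generate()` in transformers, which accepts either a single ID or a list.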
Isn't <|eot_id|> set as eos_token already? Look here.
My understanding (guesswork; I haven't looked at the code or documentation) is that the generation config specifies the EOS token separately, so that generation knows when to stop. In the generation_config for this model it's specified as 128001, which is never really generated. The tokenizer has the real EOS token so it knows what to append to a tokenized sequence, but generation also needs the more "intermediate" stop token, which marks the end of a particular response (but not necessarily the end of the whole conversation).
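To illustrate that guess, here is a toy decoding loop, not the library's actual implementation. `toy_generate` and `fake_model` are made-up names, and the fake model simply emits a short reply ending in 128009 over and over. It shows why generation runs to the length limit when only 128001 is treated as a stop token.

```python
END_OF_TEXT = 128001  # <|end_of_text|>
EOT_ID = 128009       # <|eot_id|>


def toy_generate(token_stream, eos_token_ids, max_new_tokens):
    """Collect tokens until one of eos_token_ids appears or the budget runs out."""
    output = []
    for token_id in token_stream:
        if len(output) >= max_new_tokens:
            break
        output.append(token_id)
        if token_id in eos_token_ids:
            break
    return output


def fake_model():
    # Hypothetical stand-in for a chat model: a 3-token reply closed by
    # <|eot_id|>, repeated forever, and never emitting <|end_of_text|>.
    while True:
        yield from (11, 22, 33, EOT_ID)


only_end_of_text = toy_generate(fake_model(), {END_OF_TEXT}, max_new_tokens=12)
both = toy_generate(fake_model(), {END_OF_TEXT, EOT_ID}, max_new_tokens=12)

print(len(only_end_of_text))  # runs until the token budget is exhausted
print(both)                   # stops at the first <|eot_id|>
```

With only 128001 in the stop set, the loop keeps repeating the reply until it hits the limit; with both IDs it stops after the first response.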