Fix: add <eos> token at the end of chat template
#12
by adhi29 - opened
When we try to generate training data for instruction fine-tuning of the CodeGemma instruct model using the "apply_chat_template" function, the Jinja template doesn't add an <eos> token at the end of the conversation. As a result, the model never learns (or actively unlearns) when to emit the end-of-sentence token.
The ideal behavior of a chat template is to produce training-ready text when "add_generation_prompt" = False and "continue_final_message" = False, but that is not what happens here. The new tokenizer_config.json file fixes that problem.
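For reference, a minimal sketch of how the issue can be reproduced with "apply_chat_template"; the model id below is assumed for illustration and should be replaced with the repo this PR targets:

```python
from transformers import AutoTokenizer

# Assumed model id for illustration; substitute the repository this PR applies to.
tokenizer = AutoTokenizer.from_pretrained("google/codegemma-7b-it")

messages = [
    {"role": "user", "content": "Write a function that reverses a string."},
    {"role": "assistant", "content": "def reverse(s):\n    return s[::-1]"},
]

# Render the conversation as a training example: no generation prompt,
# final assistant message treated as complete.
rendered = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=False,
)

# With the old template the rendered text stops after the last turn, so the
# model never sees <eos> during fine-tuning. With the fixed template the
# check below should pass.
print(rendered)
print("ends with eos:", rendered.rstrip().endswith(tokenizer.eos_token))
```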
adhi29 changed pull request title from "Upload tokenizer_config.json" to "Fix: add <eos> token at the end of chat template"