Some inference engines only read the EOS token from generation_config.json.

The eos_token_id field should therefore also include the ChatML token (id 32000), so that generation stops at the end of a turn instead of looping.
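For illustration, a minimal sketch of the change as a script that patches generation_config.json in place. The ChatML id 32000 is taken from this PR; the filename layout and the possibility of a scalar eos_token_id follow the usual transformers conventions and are otherwise assumptions, not confirmed by this repo:

```python
import json

# Sketch: extend eos_token_id so engines that only read
# generation_config.json also stop on the ChatML end-of-turn token.
# Id 32000 comes from this PR; everything else is assumed boilerplate.
with open("generation_config.json") as f:
    cfg = json.load(f)

eos = cfg.get("eos_token_id", [])
if not isinstance(eos, list):
    eos = [eos]  # the field may be a single int in some configs
if 32000 not in eos:
    eos.append(32000)  # ChatML token per this PR
cfg["eos_token_id"] = eos

with open("generation_config.json", "w") as f:
    json.dump(cfg, f, indent=2)
```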
