Trained the eos_token into the lm_head. This should allow QLoRA finetunes with 24 GB or even 16 GB of VRAM.
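
One plausible reading of the VRAM claim: with the eos_token already learned in the lm_head, a QLoRA run presumably no longer has to keep the lm_head fully trainable (for example through PEFT's `modules_to_save`), so only the small adapters need gradients and optimizer state. The sketch below is a back-of-the-envelope estimate of what a fully trainable head would cost. The vocabulary size, hidden size, and the 16 bytes per parameter (fp32 weights, fp32 gradients, two Adam moments) are illustrative assumptions, not values from this model.

```python
def trainable_head_overhead_gib(vocab_size, hidden_size,
                                n_matrices=1, bytes_per_param=16):
    """Rough memory (GiB) to fully train n_matrices of shape
    (vocab_size, hidden_size).

    bytes_per_param=16 is an assumption: 4 (fp32 weights)
    + 4 (fp32 gradients) + 8 (two fp32 Adam moments).
    Activations and framework overhead are ignored.
    """
    params = vocab_size * hidden_size * n_matrices
    return params * bytes_per_param / 2**30


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    vocab_size, hidden_size = 128256, 4096

    head_only = trainable_head_overhead_gib(vocab_size, hidden_size)
    # Untied input embeddings trained alongside the head double the cost.
    head_and_embed = trainable_head_overhead_gib(vocab_size, hidden_size,
                                                 n_matrices=2)

    print(f"lm_head only:        {head_only:.2f} GiB")
    print(f"lm_head + embedding: {head_and_embed:.2f} GiB")
```

Under these assumptions a trainable lm_head alone costs roughly 7.8 GiB on top of the quantized base model and adapters, which would be a large share of a 16 GB or 24 GB card.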