Trained the eos_token into the lm_head of the base model. This should allow QLoRA fine-tunes on 24 GB, or even 16 GB, of VRAM.
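Since this adapter is distributed as a LoRA, applying it means folding the low-rank update back into the base weight. As a minimal sketch of that merge step (standard LoRA math, `W' = W + (alpha / r) * B @ A`; the shapes below are illustrative, not Qwen2.5-14B's real dimensions):

```python
import numpy as np

# LoRA stores two low-rank factors per adapted weight. For an lm_head of
# shape (vocab_size, hidden_size), merging the adapter into the base
# weight computes W' = W + (alpha / r) * B @ A.
# Shapes here are toy values, NOT Qwen2.5-14B's actual dimensions.
rng = np.random.default_rng(0)
vocab_size, hidden_size, r, alpha = 32, 16, 4, 8

W = rng.standard_normal((vocab_size, hidden_size))  # frozen base lm_head
A = rng.standard_normal((r, hidden_size))           # LoRA "down" factor
B = rng.standard_normal((vocab_size, r))            # LoRA "up" factor

def merge_lora(W, A, B, alpha, r):
    """Fold the low-rank LoRA update into the base weight matrix."""
    return W + (alpha / r) * (B @ A)

W_merged = merge_lora(W, A, B, alpha, r)
assert W_merged.shape == W.shape  # merge preserves the lm_head shape
```

In practice you would do this with a library such as PEFT (e.g. its merge-and-unload workflow) rather than by hand; the sketch only shows the arithmetic the merge performs.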
Model tree for gghfez/Qwen2.5-14B-Base-lm_head-LoRA
Base model: Qwen/Qwen2.5-14B