Converting LLaMa 2 bin files to safetensors changes the output
#19
by
milad-a
- opened
I fine-tuned LLaMa 2 to test its query classification quality. After saving my final model, I converted the PyTorch bin files to safetensors using this file and served the result with TGI. However, I am getting completely different outputs compared to simply loading the model with AutoModelForCausalLM.from_pretrained() and calling model.generate(). All other generation parameters (top_p, top_k, etc.) are identical, and temperature is set to a small positive value of 0.01. After a lot of testing, I am confident that the safetensors conversion is the only variable between the two setups.
Is this a known issue or a bug in the conversion?
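One way to narrow this down is to diff the converted weights against the originals directly, before involving TGI or generation at all: if the tensors match, the divergence lies in serving, not in conversion. The sketch below shows the comparison logic on plain Python dicts of floats for illustration; in practice you would load the real checkpoints (file names here are assumptions) with `torch.load("pytorch_model.bin", map_location="cpu")` and `safetensors.torch.load_file("model.safetensors")` and compare each tensor the same way.

```python
def max_abs_diff(bin_state, st_state):
    """Compare two checkpoints given as dicts of name -> list of floats.

    Returns the largest element-wise absolute difference across shared
    parameters, plus the set of parameter names present in only one file
    (a common source of silent divergence, e.g. dropped or tied weights).
    """
    missing = set(bin_state) ^ set(st_state)   # keys in exactly one checkpoint
    worst = 0.0
    for name in set(bin_state) & set(st_state):
        # With real torch tensors this loop would be:
        #   worst = max(worst, (bin_state[name] - st_state[name]).abs().max().item())
        for x, y in zip(bin_state[name], st_state[name]):
            worst = max(worst, abs(x - y))
    return worst, missing


# Illustrative checkpoints: identical weights, then a perturbed/missing case.
a = {"w": [0.1, 0.2], "b": [0.0]}
b = {"w": [0.1, 0.2], "b": [0.0]}
print(max_abs_diff(a, b))   # zero diff, no missing keys

c = {"w": [0.1, 0.25]}      # perturbed value, "b" missing entirely
print(max_abs_diff(a, c))
```

If the max difference is exactly zero and no keys are missing, the bin-to-safetensors conversion preserved the weights and the discrepancy must come from how TGI loads or runs the model (dtype, sharding, sampling) rather than from the files themselves.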