Missing pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack
OSError: astronomer-io/Llama-3-8B-Instruct-GPTQ-8-Bit does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
Any idea what this is?
How are you loading the model? Can you provide reproducible steps or code you used so I can help debug this?
The model weights are published in the safetensors format. I could also produce the older pytorch_model.bin, but that format is no longer recommended by the industry.
Here is the explanation from the official Hugging Face documentation on why .safetensors is better than pickled .bin files:
What is safetensors ?
safetensors is a different format from the classic .bin, which PyTorch serializes with pickle. It contains the exact same data: just the model weights (or tensors).
Pickle is notoriously unsafe: a malicious file can execute arbitrary code when it is loaded. The Hub itself tries to prevent issues from it, but it's not a silver bullet.
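To make that risk concrete, here is a tiny self-contained sketch (mine, not from the Hugging Face docs) of how a pickled "weights" file can run code at load time via `__reduce__`:

```python
import pickle

class NotReallyWeights:
    """A pickle payload that runs code when loaded -- this is why .bin files are risky."""
    def __reduce__(self):
        # Instead of returning tensor data, this tells pickle to call eval() at load
        # time. A real attack could just as easily invoke os.system() here.
        return (eval, ("2 + 2",))

payload = pickle.dumps(NotReallyWeights())
result = pickle.loads(payload)  # "loading the weights" silently executed eval("2 + 2")
print(result)  # → 4
```

Nothing in `pickle.loads` distinguishes this payload from legitimate weights, which is exactly the problem safetensors avoids.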
safetensors' first and foremost goal is to make loading machine learning models safe, in the sense that loading a model can never take over your computer.
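For illustration, the safetensors layout is simple enough to sketch with the standard library alone. This is a toy reader/writer (my own sketch, assuming the documented layout: an 8-byte little-endian header length, a JSON header, then the raw tensor bytes), not the real `safetensors` library:

```python
import json
import struct

def write_safetensors_like(path, name, data: bytes):
    # Header maps tensor names to dtype/shape/byte offsets within the data section.
    header = {name: {"dtype": "U8", "shape": [len(data)], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian length
        f.write(header_bytes)                          # JSON header
        f.write(data)                                  # raw tensor bytes

def read_safetensors_like(path):
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))  # plain JSON: parsing it can never run code
        blob = f.read()
    return {name: blob[start:end]
            for name, (start, end) in
            ((k, v["data_offsets"]) for k, v in header.items())}
```

The key point: reading is pure JSON parsing plus byte slicing, so there is no code-execution pathway like pickle's.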
If you are using text-generation-webui, please check the model card readme file on how to load and use it correctly.
Please note: in my testing, TGI and vLLM have the best throughput and token-generation speed. I would recommend vLLM if your hardware is able to run it.
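For reference, launching vLLM's OpenAI-compatible server for a GPTQ quant looks roughly like this (flag names per vLLM's documentation; exact flags and defaults may vary with your installed version):

```shell
# Serve the 8-bit GPTQ quant via vLLM's OpenAI-compatible API server.
# --quantization gptq selects vLLM's GPTQ kernels; the port is an example.
python -m vllm.entrypoints.openai.api_server \
    --model astronomer-io/Llama-3-8B-Instruct-GPTQ-8-Bit \
    --quantization gptq \
    --port 8000
```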
Hey @RainmakerP, is your issue resolved? I renamed the .safetensors file to model.safetensors. This should fix the issue where some frameworks cannot find the model file for loading. Please let me know if this fixes your issue.
Side note: I highly recommend serving this model with vLLM, as it has been the most stable framework for Llama 3 GPTQ quants in my testing.