Error when running inference
Here is my code:
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("/mnt/e/llm/baichuan", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("/mnt/e/llm/baichuan", device_map="auto", trust_remote_code=True)
model.tie_weights()
inputs = tokenizer('Hamlet->Shakespeare\nOne Hundred Years of Solitude->', return_tensors='pt')
#inputs = inputs.to('cuda:0')
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
The error is:
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.
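If I read the error correctly, device_map="auto" is spilling some weights to disk and wants an offload_folder to put them in. Maybe passing one to from_pretrained is what it expects (the folder name here is just a placeholder I made up, not sure this is the right fix):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("/mnt/e/llm/baichuan", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "/mnt/e/llm/baichuan",
    device_map="auto",
    trust_remote_code=True,
    offload_folder="offload",      # placeholder directory for weights that don't fit in GPU/CPU memory
    offload_state_dict=True,       # also offload the CPU state dict to disk during loading if needed
)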
I get the same problem on a Mac M1, maybe because it doesn't have CUDA (NVIDIA GPU)...
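On the M1 there is no CUDA at all, so maybe dropping device_map="auto" and forcing everything onto the CPU (or Apple's MPS backend) avoids the disk offload entirely. Just a guess, and I'm assuming the checkpoint can run in float16:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# use the Apple-silicon GPU if available, otherwise fall back to CPU
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("/mnt/e/llm/baichuan", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "/mnt/e/llm/baichuan",
    torch_dtype=torch.float16,   # assumption: fp16 keeps the model small enough to fit in memory
    trust_remote_code=True,
).to(device)

inputs = tokenizer('Hamlet->Shakespeare\nOne Hundred Years of Solitude->', return_tensors='pt').to(device)
pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))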