Error loading model
#1 by cyrusxc5
I got this error when trying to load the model:
RuntimeError: Error(s) in loading state_dict for GemmaForCausalLM:
	size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([256002, 2048]) from checkpoint, the shape in current model is torch.Size([256000, 2048]).
	size mismatch for lm_head.weight: copying a param with shape torch.Size([256002, 2048]) from checkpoint, the shape in current model is torch.Size([256000, 2048]).
I'm using this code:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from accelerate import Accelerator
from huggingface_hub import login

# Pin the model to this process's device
device_index = Accelerator().process_index
device_map = {"": device_index}
torch.cuda.empty_cache()

# Authenticate with the Hub (token redacted)
login(token="xxxxxxxxx", add_to_git_credential=False)

model_id = "windmaple/gemma-chinese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map=device_map)
Do you know how to fix this, or do you have any suggestions on how to load the model properly? Thanks a lot!
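For context: the two extra rows in the checkpoint (256002 vs. 256000) usually mean the weights were saved after new tokens were added to the tokenizer, while the config still reports the base Gemma vocabulary size. A minimal sketch of one possible workaround, assuming a stale vocab_size in the downloaded config is the cause (an assumption, not confirmed for this repo):

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "windmaple/gemma-chinese"

# Override the vocabulary size so the model skeleton matches the
# checkpoint's embedding shape; 256002 is taken from the error above.
config = AutoConfig.from_pretrained(model_id)
config.vocab_size = 256002

model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
tokenizer = AutoTokenizer.from_pretrained(model_id)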
I have no issue loading the model.
https://colab.research.google.com/drive/1sm7aQ8v70Sx823SRwJRSpDzD6e7EV1Vn#scrollTo=lXdQJIUCMt3v
Your cache might be corrupt. If you are running on your own machine, try cleaning up the HF cache.
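If deleting the whole cache feels too heavy-handed, forcing a fresh download of just this repo can rule out a corrupt cached copy. A minimal sketch, assuming the default cache location (~/.cache/huggingface/hub):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "windmaple/gemma-chinese"

# Re-fetch the repo files even if a cached copy exists, bypassing any
# stale or corrupt entries in the local cache.
tokenizer = AutoTokenizer.from_pretrained(model_id, force_download=True)
model = AutoModelForCausalLM.from_pretrained(model_id, force_download=True)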