Device error when using the 76B model

#9
by puar-playground - opened

I tried to use the 76B model for inference on a single image, following the instructions for the customized device_map function:

device_map = split_model('InternVL2-Llama3-76B')
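For reference, the split_model helper from the model card looks roughly like this (a paraphrased sketch from memory, with a small defensive guard added; check the model card for the authoritative version). It shards the LLM's decoder layers across all visible GPUs while pinning the vision tower and a few small modules to GPU 0:

```python
import math
import torch

def split_model(model_name):
    # Sketch of the model card's helper: build an explicit device_map that
    # spreads decoder layers over the available GPUs.
    device_map = {}
    world_size = torch.cuda.device_count()
    num_layers = {
        'InternVL2-1B': 24, 'InternVL2-2B': 24, 'InternVL2-4B': 32,
        'InternVL2-8B': 32, 'InternVL2-26B': 48, 'InternVL2-40B': 60,
        'InternVL2-Llama3-76B': 80}[model_name]
    # GPU 0 also hosts the ViT, so count it as only half a GPU for LLM layers.
    num_layers_per_gpu = math.ceil(num_layers / (world_size - 0.5))
    num_layers_per_gpu = [num_layers_per_gpu] * world_size
    num_layers_per_gpu[0] = math.ceil(num_layers_per_gpu[0] * 0.5)
    layer_cnt = 0
    for i, num_layer in enumerate(num_layers_per_gpu):
        for _ in range(num_layer):
            if layer_cnt >= num_layers:  # defensive guard, not in the original
                break
            device_map[f'language_model.model.layers.{layer_cnt}'] = i
            layer_cnt += 1
    # Keep the vision model and the embedding/output modules on GPU 0.
    device_map['vision_model'] = 0
    device_map['mlp1'] = 0
    device_map['language_model.model.tok_embeddings'] = 0
    device_map['language_model.model.embed_tokens'] = 0
    device_map['language_model.output'] = 0
    device_map['language_model.model.norm'] = 0
    device_map['language_model.lm_head'] = 0
    device_map[f'language_model.model.layers.{num_layers - 1}'] = 0
    return device_map
```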

Then I got the error below. The same code works fine for the 8B model, so I'm not sure why the 76B fails.

model_answer, history = self.model.chat(self.tokenizer, pixel_values, question, generation_config,
  File "/home/user/.cache/huggingface/modules/transformers_modules/OpenGVLab/InternVL2-Llama3-76B/cf7914905f78e9e3560ddbd6f5dfc39becac494f/modeling_internvl_chat.py", line 282, in chat
    generation_output = self.generate(
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/.cache/huggingface/modules/transformers_modules/OpenGVLab/InternVL2-Llama3-76B/cf7914905f78e9e3560ddbd6f5dfc39becac494f/modeling_internvl_chat.py", line 332, in generate
    outputs = self.language_model.generate(
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "/opt/conda/envs/local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2982, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1189, in forward
    outputs = self.model(
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 977, in forward
    position_embeddings = self.rotary_emb(hidden_states, position_ids)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 209, in forward
    freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
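The failing matmul is inside LlamaRotaryEmbedding, so it seems the rotary embedding's inv_freq buffer and position_ids end up on different devices once the decoder layers are sharded. A quick sanity check (the attribute paths here are assumptions based on this checkpoint's module names; adjust if your layout differs):

```python
# Assumed debugging snippet, not from the model card: inspect where
# accelerate placed each module when the model was loaded with a device_map.
print(model.hf_device_map)

# In recent transformers versions the rotary embedding lives on the LlamaModel
# itself (as the traceback above shows); compare its buffer's device with the
# device of the first decoder layer's parameters.
rotary = model.language_model.model.rotary_emb
print(rotary.inv_freq.device)
print(next(model.language_model.model.layers[0].parameters()).device)
```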

OpenGVLab org

Hello, thank you for your feedback. Could you tell me which version of transformers you are using?

I was using 4.44.0 and got the same error. Downgrading to 4.37.2 solves the issue :)
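For anyone else hitting this, the downgrade is just:

```
pip install transformers==4.37.2
```

That matches the version the InternVL2 model card examples were written against, if I remember correctly.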

OpenGVLab org

ok, thanks for your feedback~

I can confirm that I can reproduce the error, and the downgrade solves the problem.

OpenGVLab org

Hello, thank you for your feedback. There are indeed some compatibility issues when using version 4.44.

czczup changed discussion status to closed
