THUDM/chatglm-6b · Problem: query_key_layer_scaling_coeff = float(layer_id + 1)

Hello, thank you for your work!!
I ran into a problem when running ChatGLM with LoRA:

File "/home//Hongwei/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/8b7d33596d18c5e83e2da052d05ca4db02e60620/modeling_chatglm.py", line 267, in attention_fn
query_key_layer_scaling_coeff = float(layer_id + 1)
RuntimeError: CUDA error: no kernel image is available for execution on the device

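I think this happens because a tensor variable (layer_id) is being added to a non-tensor (1), so the addition runs on the GPU. As a workaround I converted layer_id with .cpu().numpy() before the addition, and the error went away. But I do not know why layer_id is a tensor in the first place. Is there something wrong in the code?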
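
For anyone hitting the same line, here is a minimal sketch of the workaround in a more general form. It assumes that layer_id sometimes arrives as a zero-dimensional tensor rather than a plain int; the helper name scaling_coeff is hypothetical and is not part of modeling_chatglm.py. The idea is to pull the value onto the host with .item() (which torch.Tensor provides) so that the "+ 1" is ordinary Python arithmetic and no GPU kernel is launched for it. I have not verified this against every setup that shows this CUDA error.

```python
def scaling_coeff(layer_id):
    """Compute float(layer_id + 1) with the addition done on the host.

    Hypothetical helper for illustration only; it is not part of
    modeling_chatglm.py.
    """
    # Assumption: layer_id is either a plain Python number or a
    # zero-dimensional tensor-like object exposing .item().
    # Plain ints and floats have no .item(), so they pass through unchanged.
    if hasattr(layer_id, "item"):
        layer_id = layer_id.item()
    # From here on this is plain Python arithmetic.
    return float(layer_id + 1)


# Usage with a plain int layer index
print(scaling_coeff(27))  # 28.0
```

This only sidesteps the failing addition. It does not explain why layer_id is a tensor, and the "no kernel image is available" message may also point to a PyTorch build that does not match the GPU, in which case other GPU operations could fail later.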
