CUDA extension not installed
Thanks for your work.
When I run:
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

tokenizer = AutoTokenizer.from_pretrained(local_dir, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(local_dir, device="cuda:0", use_triton=False, use_safetensors=True, torch_dtype=torch.float32, trust_remote_code=True)
I get these warnings:
CUDA extension not installed.
RWGPTQForCausalLM hasn't fused attention module yet, will skip inject fused attention.
RWGPTQForCausalLM hasn't fused mlp module yet, will skip inject fused mlp.
I'm wondering whether "CUDA extension not installed." affects model performance. I also can't tell whether inference actually uses my GPU: about 6 GB of VRAM appears to be in use, but I don't see a PID for the process in the GPU's process list during inference. Maybe it's running on the CPU? Inference sometimes takes a very long time.
My env:
Collecting environment information...
PyTorch version: 2.0.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.26.3
Libc version: glibc-2.31
Python version: 3.10.11 (main, Apr 20 2023, 19:02:41) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-67-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 11.2.152
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1080 Ti
Nvidia driver version: 515.86.01
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.1.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.1.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.1.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.1.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.1.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.1.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.1.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Model name: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
Versions of relevant libraries:
[pip3] numpy==1.24.3
[pip3] torch==2.0.1
[conda] numpy 1.24.3 pypi_0 pypi
[conda] torch 2.0.1 pypi_0 pypi
Did you build Autogptq with CUDA locally?
> Did you build Autogptq with CUDA locally?
Yes, I did that on my local machine:
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
pip install .
pip install einops
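Side note: before running pip install ., it is worth confirming that the installed torch is a CUDA build. A minimal check with plain PyTorch, no AutoGPTQ-specific assumptions:

import torch

print(torch.__version__)          # a CPU-only wheel has no +cuXXX suffix, e.g. "2.0.1" vs "2.0.1+cu117"
print(torch.version.cuda)         # None on a CPU-only build
print(torch.cuda.is_available())  # should be True before building the CUDA extension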
> Did you build Autogptq with CUDA locally?
> Yes, I did that on my local machine:
> git clone https://github.com/PanQiWei/AutoGPTQ
> cd AutoGPTQ
> pip install .
> pip install einops
I get this error:
(localGPT) [dg@localhost AutoGPTQ]$ pip install .
Processing /home/dg-linc/AutoGPTQ
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [6 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "/home/dg-linc/AutoGPTQ/setup.py", line 58, in
CUDA_VERSION = "".join(os.environ.get("CUDA_VERSION", default_cuda_version).split("."))
AttributeError: 'NoneType' object has no attribute 'split'
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Does anyone have an idea?
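Judging from the traceback, a likely cause: on line 58 of setup.py, default_cuda_version appears to come from torch.version.cuda, which is None when the installed torch is a CPU-only wheel, so .split(".") raises the AttributeError. A sketch of that failing logic, under that assumption:

import os
import torch

# Assumption: setup.py derives default_cuda_version from torch.version.cuda,
# which is None on a CPU-only torch build.
default_cuda_version = torch.version.cuda
cuda_version = os.environ.get("CUDA_VERSION", default_cuda_version)
if cuda_version is None:
    raise SystemExit("torch was built without CUDA; install a CUDA-enabled wheel first")
# The line from the traceback then succeeds:
print("".join(cuda_version.split(".")))  # e.g. "11.7" -> "117"

If that is the case, installing a CUDA-enabled torch wheel before pip install . should fix it. Exporting CUDA_VERSION manually (the env var checked in the quoted line) only gets past this one line; actually compiling the extension still needs a CUDA-enabled torch and a matching nvcc.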