ImportError: Using `load_in_8bit=True` requires Accelerate
#78
opened by aimananees
Hi, I'm trying to load a model with quantization, but it constantly errors out. I have pip-installed all the required libraries, including accelerate and bitsandbytes. I have tried multiple times with no luck. Is anyone else facing the same issue?
Here's my code snippet:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

# 4-bit NF4 quantization config; loading with this raises the ImportError
# from the title whenever accelerate/bitsandbytes are not detected.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model_id = "tiiuae/falcon-7b"
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
```
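One way to confirm that the running environment actually sees both libraries is a quick check like the following. This is a minimal sketch; the `is_accelerate_available` / `is_bitsandbytes_available` helpers assume a transformers version recent enough to expose them from `transformers.utils`:

```python
import accelerate
import bitsandbytes
import transformers
from transformers.utils import is_accelerate_available, is_bitsandbytes_available

# Versions importable from the *running* kernel, which can differ from
# what `pip list` reports in another environment.
print("transformers :", transformers.__version__)
print("accelerate   :", accelerate.__version__)
print("bitsandbytes :", bitsandbytes.__version__)

# transformers performs equivalent availability checks before quantized
# loading; if either is False here, from_pretrained raises the ImportError.
print("accelerate available  :", is_accelerate_available())
print("bitsandbytes available:", is_bitsandbytes_available())
```

If the imports above fail or either check is False even though pip reported a successful install, the packages were most likely installed into a different environment than the one the kernel is running.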
Yeah, I am also facing the same issue. I have tried all the options available on HF, Stack Overflow, etc.
Use a previous version of transformers:

```
!pip install transformers==4.30
```
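If pinning transformers alone doesn't help, installing the quantization dependencies in the same cell and then restarting the runtime is a common combination. The accelerate/bitsandbytes pins below are assumptions for illustration, not versions confirmed in this thread; adjust them to your environment:

```
!pip install transformers==4.30 "accelerate>=0.20.3" "bitsandbytes>=0.39.0"
```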
If you are in a notebook, restarting the session worked for me. See below.
https://github.com/huggingface/transformers/issues/23323#issuecomment-1568464656
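For completeness, you can force the restart from a cell so freshly installed packages are re-imported. This is a minimal sketch of the usual Colab/Jupyter trick; killing the kernel process assumes your notebook runtime restarts it automatically, and it discards all in-memory state:

```python
import os

# If accelerate/bitsandbytes were installed after transformers was first
# imported, the old import state keeps the ImportError alive. Killing the
# kernel process makes Colab/Jupyter spawn a fresh one; re-run the imports
# and from_pretrained afterwards.
os.kill(os.getpid(), 9)
```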