ImportError: Using `bitsandbytes` 4-bit quantization requires the latest version of bitsandbytes: `pip install -U bitsandbytes`

#2
by loong - opened

I have bitsandbytes 0.43.3 installed.

pip install git+https://github.com/huggingface/diffusers@c1d5b9669144f08e10d2abbc39cd8d198c6a64b9
This works, but with 15 GB of VRAM I still hit: OutOfMemoryError: CUDA out of memory. Tried to allocate 80.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 53.06 MiB is free. Process 9501 has 14.69 GiB memory in use.
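
For reference, here is a minimal sketch of what loading the transformer in 4-bit NF4 through diffusers' bitsandbytes integration can look like (the `BitsAndBytesConfig` usage and the FLUX.1-dev repo id are assumptions for illustration, not taken from this thread):

import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Assumption: diffusers installed from the commit above, with a recent bitsandbytes.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # hypothetical repo id for illustration
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keep inactive components on the CPU to reduce peak VRAM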

I request that you not post issues in multiple places. You have posted the same issue on GitHub, here, and also on my Twitter post. Let's try not to do that :)

This works, but with 15 GB of VRAM I still hit: OutOfMemoryError: CUDA out of memory. Tried to allocate 80.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 53.06 MiB is free. Process 9501 has 14.69 GiB memory in use.

For this, you should be able to load this checkpoint on a free-tier Google Colab:
https://colab.research.google.com/gist/sayakpaul/dbda0da4ec30fbab40cee3fc4ac35329/scratchpad.ipynb

But full inference requires additional tricks, which I will post later.


Thanks, looking forward to it.

Try explicit use of Accelerate offloading, for example as sketched below.
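
A minimal sketch of explicit Accelerate offloading (assuming a `pipe` already loaded as in the snippets below; `cpu_offload` is Accelerate's hook-based offloading API):

import torch
from accelerate import cpu_offload

# Assumption: `pipe` is a FluxPipeline already loaded in bfloat16.
# Each component stays on the CPU and is moved to the GPU only while
# its own forward pass runs, trading speed for a much lower VRAM peak.
for model in (pipe.text_encoder, pipe.text_encoder_2, pipe.transformer, pipe.vae):
    cpu_offload(model, execution_device=torch.device("cuda"))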


It runs on a Colab T4. Cell 1:

from huggingface_hub import snapshot_download

snapshot_download(repo_id="black-forest-labs/FLUX.1-schnell",
                  local_dir="/content/flux1schnell")

Cell 2:

import matplotlib.pyplot as plt
import torch
from diffusers import FluxPipeline

# Adjust the path here if you downloaded the checkpoint elsewhere.
pipe = FluxPipeline.from_pretrained("/content/flux1schnell", torch_dtype=torch.bfloat16)
# pipe.enable_model_cpu_offload()  # saves some VRAM by offloading the model to CPU; remove this if you have enough GPU memory
pipe.enable_sequential_cpu_offload()  # offloads modules to CPU at the submodule level (rather than the model level)

prompt = "Ancient greek soldier with a sword and a shield. Behind there are horses. In the background there is a mountain with snow."

image = pipe(
    prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=1,
    max_sequence_length=256,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]

plt.imshow(image)
plt.show()
image.save("generated_image.png")
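
If decoding still runs out of memory, the VAE's built-in memory savers may help; a hedged addition (these are standard AutoencoderKL methods in diffusers, not something verified on a T4 in this thread):

# Assumption: call these before running the pipeline.
pipe.vae.enable_slicing()  # decode batched images one at a time
pipe.vae.enable_tiling()   # decode each image in tiles to cap peak VRAM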

