using stable-diffusion-2 for img2img with prompt
Hey guys!
I've been trying to load the stable-diffusion-2 model in the img2img pipeline. I've tried:
model_path = "stabilityai/stable-diffusion-2"
pipe_img = StableDiffusionImg2ImgPipeline.from_pretrained(
model_path,
revision="fp16",
torch_dtype=torch.float16,
use_auth_token=True
)
but this gives the error:
ValueError: Pipeline <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_img2img.StableDiffusionImg2ImgPipeline'> expected {'unet', 'scheduler', 'feature_extractor', 'vae', 'safety_checker', 'text_encoder', 'tokenizer'}, but only {'unet', 'scheduler', 'vae', 'text_encoder', 'tokenizer'} were passed.
which makes me believe there is a separate model for img2img. Is one available?
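One workaround that might sidestep the error, assuming from_pretrained accepts pipeline components as keyword overrides the way the diffusers docs describe (I haven't verified this against that exact release), is to pass the two components the error lists as missing explicitly:

# Unverified sketch: supply the missing components as None.
# safety_checker=None is a documented override; whether it satisfies the
# img2img pipeline's check here is an assumption on my part.
pipe_img = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_path,
    revision="fp16",
    torch_dtype=torch.float16,
    safety_checker=None,
    feature_extractor=None,
    use_auth_token=True,
)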
I too am having this issue.
Where did you even figure out how to get that far?
By doing something similar to the Reddit post above, I am able to run it without errors, but something is not set up right: the generated images are so randomly altered from the original as to be unrecognizable.
pipestart = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    revision="fp16",
    torch_dtype=torch.float16,
)
pipe = StableDiffusionImg2ImgPipeline(**pipestart.components).to("cuda")
@juggernaught
I think you need to use StableDiffusionPipeline first, where you call from_pretrained.
https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/overview#how-to-convert-all-use-cases-with-multiple-or-single-pipeline
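The pattern on that page is to load the checkpoint once and rebuild the other pipelines from its components. Roughly (a sketch following the overview, not your exact setup):

from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

# Download/load the weights once with the text-to-image pipeline...
text2img = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2")
# ...then build img2img from the same components, with no second download.
img2img = StableDiffusionImg2ImgPipeline(**text2img.components)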
It is further up in the code. I think this is all of it. Upload test1.jpg into the root directory.
!pip install -Uq diffusers transformers fastcore

import logging
from pathlib import Path

import matplotlib.pyplot as plt
import torch
from diffusers import StableDiffusionPipeline
from fastcore.all import concat
from huggingface_hub import notebook_login
from PIL import Image

logging.disable(logging.WARNING)
torch.manual_seed(1)
if not (Path.home()/'.huggingface'/'token').exists(): notebook_login()

from diffusers import StableDiffusionImg2ImgPipeline
from fastdownload import FastDownload

def image_grid(imgs, rows, cols):
    # Paste the images into a single rows x cols grid for display.
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    for i, img in enumerate(imgs): grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

pipestart = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")  # making this have .to("cuda") also made no difference.
pipe = StableDiffusionImg2ImgPipeline(**pipestart.components).to("cuda")

init_image = Image.open("test1.jpg").convert("RGB")
torch.manual_seed(1000)
prompt = "a lovely sunset"
images = pipe(prompt=prompt, num_images_per_prompt=3, init_image=init_image).images
image_grid(images, rows=1, cols=3)
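One possible cause of the unrecognizable outputs (a guess on my part, not something in your traceback): stabilityai/stable-diffusion-2 is the 768x768 checkpoint, and img2img's strength parameter defaults to 0.8, which already discards most of the source image. A sketch worth trying:

# Resize the input to the resolution this checkpoint was trained at
# (768x768 for stabilityai/stable-diffusion-2).
init_image = Image.open("test1.jpg").convert("RGB").resize((768, 768))

# Lower strength keeps more of the original image; the 0.8 default lets
# the prompt dominate.
images = pipe(
    prompt="a lovely sunset",
    init_image=init_image,  # newer diffusers releases name this `image`
    strength=0.4,
    num_images_per_prompt=3,
).images
image_grid(images, rows=1, cols=3)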
Ok. I mean, don't use StableDiffusionImg2ImgPipeline.from_pretrained, use StableDiffusionPipeline.from_pretrained:
pipestart = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")
pipe = StableDiffusionImg2ImgPipeline(**pipestart.components).to("cuda")
Or try the Stable Diffusion Mega community pipeline: https://github.com/huggingface/diffusers/tree/main/examples/community#stable-diffusion-mega
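For reference, the mega pipeline exposes text2img, img2img, and inpaint on one object. A rough sketch following that README (untested with the stable-diffusion-2 checkpoint; the img2img argument name varies between diffusers releases):

from diffusers import DiffusionPipeline
import torch

# Load the community "stable_diffusion_mega" pipeline around the SD2 weights.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    custom_pipeline="stable_diffusion_mega",
    torch_dtype=torch.float16,
).to("cuda")

# img2img is a method on the mega pipeline (the argument may be
# `init_image` in older releases).
images = pipe.img2img(prompt="a lovely sunset", image=init_image, strength=0.75).images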
@curibe it made no difference.