	---
	library_name: diffusers
	base_model: stabilityai/stable-diffusion-xl-base-1.0
	tags:
	- text-to-image
	license: openrail++
	inference: false
	---

	# One More Step

	The One More Step (OMS) module was proposed in [One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls](https://github.com/mhh0318/OneMoreStep)
	by Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Tat-Jen Cham et al.


	By adding one small step on top of the sampling process, we can address the issues caused by the schedule flaws of current diffusion models without changing the original model parameters. This also allows for some control over low-frequency information, such as color.
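To make "schedule flaw" concrete, here is a minimal sketch of one commonly cited issue: the training noise schedule does not reach pure noise at its final timestep. It assumes the scaled-linear beta schedule with `beta_start=0.00085`, `beta_end=0.012`, and 1000 steps, as found in the Stable Diffusion 1.5/2.1 scheduler configs. The function name `terminal_signal_coefficient` is made up for this illustration and is not part of the OMS code.

```python
import math


def terminal_signal_coefficient(beta_start=0.00085, beta_end=0.012, num_steps=1000):
    """Return sqrt(alpha_bar_T), the weight of the clean image at the last timestep.

    Assumes a scaled-linear schedule: interpolate linearly in sqrt(beta),
    then square to get each beta.
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    alpha_bar = 1.0
    for i in range(num_steps):
        sqrt_beta = lo + (hi - lo) * i / (num_steps - 1)
        # alpha_bar is the running product of (1 - beta_t).
        alpha_bar *= 1.0 - sqrt_beta ** 2
    return math.sqrt(alpha_bar)


coeff = terminal_signal_coefficient()
print(f"sqrt(alpha_bar_T) = {coeff:.4f}")
```

Under these assumptions the coefficient comes out small but nonzero (roughly 0.07), so a trace of the training image survives at the last timestep, whereas sampling starts from pure Gaussian noise.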

	Our model is versatile and can be integrated into almost all widely-used Stable Diffusion frameworks. It's compatible with community favorites such as LoRA, ControlNet, Adapter, and foundational models.


	## Usage

	OMS is now supported in 🤗 `diffusers` through a customized pipeline ([GitHub](https://github.com/mhh0318/OneMoreStep)). To run the model (especially with the `LCM` variant), first install the latest version of the `diffusers` library, along with `accelerate` and `transformers`.

	```bash
	pip install --upgrade pip
	pip install --upgrade diffusers transformers accelerate
	```

	Then clone the repository:
	```bash
	git clone https://github.com/mhh0318/OneMoreStep.git
	cd OneMoreStep
	```


	### SD15 and SD21

	Because the VAE latent space differs between SD1.5/SD2.1 and SDXL, the OMS module for SD1.5/SD2.1 cannot be shared with SDXL. SD1.5 and SD2.1 can, however, share the same OMS module, as can models based on them, such as LCM. For more details, please refer to our paper.

	We have uploaded one OMS module for the SD15/21 series at [h1t/oms_b_openclip_15_21](https://huggingface.co/h1t/oms_b_openclip_15_21), which uses the base architecture and an OpenCLIP text encoder.
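The sharing rule above can be sketched as a small lookup. This is only an illustration: `OMS_MODULES` and `oms_module_for` are hypothetical names, and the table contains just the one module named in this card. Any other family, SDXL included, raises an error rather than guessing a repository id.

```python
# Hypothetical lookup table: only the module named in this card is listed.
OMS_MODULES = {
    "sd15": "h1t/oms_b_openclip_15_21",
    "sd21": "h1t/oms_b_openclip_15_21",
}


def oms_module_for(base_family: str) -> str:
    """Map a base-model family such as 'SD1.5' or 'sd21' to an OMS module id."""
    # Normalise spellings like "SD 1.5", "sd-2.1", "SD15" to "sd15" / "sd21".
    key = base_family.lower().replace(".", "").replace("-", "").replace(" ", "")
    try:
        return OMS_MODULES[key]
    except KeyError:
        raise ValueError(
            f"No OMS module listed here for {base_family!r}; "
            "the SD1.5/SD2.1 module cannot be reused across latent spaces."
        ) from None


print(oms_module_for("SD1.5"))
print(oms_module_for("SD2.1"))
```

Both calls resolve to the same module id, which mirrors the statement that SD1.5 and SD2.1 share one OMS module.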

	Here is a simple demo:

	```python
	import torch
	from diffusers import StableDiffusionPipeline

	# OMSPipeline comes from the cloned OneMoreStep repo. The module path here is
	# an assumption; check the repo if the import fails.
	from diffusers_patch import OMSPipeline

	# Load the base Stable Diffusion pipeline that OMS will wrap.
	sd_pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16, variant="fp16", safety_checker=None).to('cuda')

	# Attach the OMS module to the base pipeline.
	pipe = OMSPipeline.from_pretrained('h1t/oms_b_openclip_15_21', sd_pipeline=sd_pipe, torch_dtype=torch.float16, variant="fp16", trust_remote_code=True)
	pipe.to('cuda')

	# Fix the seed so the result is reproducible.
	generator = torch.Generator(device=pipe.device).manual_seed(100)

	prompt = "a starry night"

	image = pipe(prompt, guidance_scale=7.5, num_inference_steps=20, oms_guidance_scale=2., generator=generator)

	image['images'][0]
	```
	![oms_15](sd15_oms.png)

	And without OMS, using the same `pipe` and `prompt`:

	```python
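	# Re-seed so both runs start from the same generator state, then disable OMS.
	generator = torch.Generator(device=pipe.device).manual_seed(100)
	image = pipe(prompt, guidance_scale=7.5, num_inference_steps=20, oms_guidance_scale=2., generator=generator, oms_flag=False)

	image['images'][0]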
	```
	![oms_15](sd15_wo_oms.png)

	We found that adding OMS greatly improves the quality of the generated images.
	For more models and more features, such as diverse prompts, please refer to the [OMS Repo](https://github.com/mhh0318/OneMoreStep).