arxiv:2311.10093

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Published on Nov 16, 2023 · Submitted by akhaliq on Nov 17, 2023
#1 Paper of the day

Abstract

Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, these models struggle with generation of consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach. Project page is available at https://omriavrahami.com/the-chosen-one
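The iterative procedure in the abstract (find a coherent set of images, extract a tighter identity from it, repeat) can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: each "image" is a noisy 2-D feature vector, the coherent set is found with a plain k-means, cohesion is measured as mean squared distance to the cluster centroid, and "extracting an identity" is simply taking the mean of the chosen cluster. All function names are hypothetical.

```python
import random

def generate(identity, spread, n, rng):
    """Stand-in for sampling n images: noisy feature vectors around `identity`."""
    return [[x + rng.gauss(0, spread) for x in identity] for _ in range(n)]

def mean(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def scatter(cluster):
    """Mean squared distance of a cluster's members to its own centroid."""
    center = mean(cluster)
    return sum(dist2(p, center) for p in cluster) / len(cluster)

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's algorithm (assumed here; the abstract names no clustering method)."""
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        # Keep the old center when a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

def most_cohesive(clusters, min_size):
    """Among clusters with at least min_size members, pick the tightest one."""
    candidates = [c for c in clusters if len(c) >= min_size]
    return min(candidates, key=scatter)

def refine_identity(identity, spread, rounds=5, n=64, k=4, min_size=5, seed=0):
    """Repeat: sample, find the most coherent subset, re-center on it."""
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        images = generate(identity, spread, n, rng)
        chosen = most_cohesive(kmeans(images, k, rng), min_size)
        identity = mean(chosen)
        # Per-dimension standard deviation of the chosen set becomes the new spread.
        spread = (scatter(chosen) / len(identity)) ** 0.5
        history.append(spread)
    return identity, history

if __name__ == "__main__":
    final_identity, spreads = refine_identity([0.0, 0.0], spread=1.0)
    print("spread per round:", [round(s, 4) for s in spreads])
```

In this toy the spread of the sampled set tends to shrink each round because the next round samples around the tightest cluster of the previous one. In the actual method the "identity" would be carried by the generative model rather than by a mean vector; the sketch only mirrors the loop structure the abstract describes.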

Community

Code?

I've had people contact me through Freelancer to do this. Whoever figures this out is going to be rich. For instance, one client said: "I want a dress on a model, the exact same dress, in different poses." A magazine generator is what she wanted. I told her she would be better served by a photographer. But what if you had a magic wand tool to exclude certain pixels from being manipulated in subsequent generations, a way to keep elements dialed in? Never mind rich, this is the crux of inventing a whole new angle of generative imaging, a big milestone.
What if those pixels could not only be made static, but held fixed in a dynamic way, so that they stay the same for design purposes but can still be manipulated internally for scene continuity?
Life Story is hilarious btw

github?

Code?

github?

I would assume someone intends to patent this work.

How to use this model?
