The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Abstract
Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, these models struggle with generation of consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach. Project page is available at https://omriavrahami.com/the-chosen-one
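To make the "identifies a coherent set of images sharing a similar identity" step concrete, here is a minimal sketch of one way such a selection could work. It is not the authors' implementation. The abstract does not say how images are embedded, grouped, or scored, so everything here is an assumption made for illustration: the function names `farthest_first_init`, `kmeans`, and `most_cohesive_cluster` are hypothetical, images are assumed to be already encoded as feature vectors, k-means stands in for whatever grouping the method uses, and "coherent" is read as "smallest mean distance to the cluster centroid". The generation and identity-extraction steps of the loop are left out.

```python
import numpy as np


def farthest_first_init(X, k):
    """Pick k initial centers deterministically: start at X[0], then
    repeatedly take the point farthest from all centers chosen so far."""
    chosen = [0]
    dist = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].astype(float)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    # Final assignment against the last set of centers.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)


def most_cohesive_cluster(X, k=3, min_size=2):
    """Return (indices, spread) of the tightest cluster, where spread is the
    mean distance of members to their centroid (an assumed cohesion score).
    Returns (None, inf) if no cluster has at least min_size members."""
    labels = kmeans(X, k)
    best_idx, best_spread = None, np.inf
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if len(idx) < min_size:
            continue
        centroid = X[idx].mean(axis=0)
        spread = np.linalg.norm(X[idx] - centroid, axis=1).mean()
        if spread < best_spread:
            best_idx, best_spread = idx, spread
    return best_idx, best_spread
```

In an iterative procedure like the one the abstract describes, the images at the returned indices would be the set from which a more consistent identity is extracted before the next round of generation.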
Community
Code?
I've had people contact me through Freelancer to do this. Whoever figures this out is going to be rich. For instance, I want a dress on a model, the exact same dress, in different poses. A magazine generator is what she wanted. I told her she would be better served by a photographer. But what if you had a magic wand tool to exclude certain pixels from being manipulated in subsequent generations? A way to keep elements dialed in. Never mind rich, this is the crux of inventing a whole new angle of generative imaging, a big milestone.
What if those pixels could not only be made static, but handled dynamically, so that they stay the same for design purposes but can still be manipulated internally for scene continuity?
Life Story is hilarious btw
github?
I would assume someone intends to patent this work.
How to use this model?