Setting sigmas to 0.95?

#4
by Yntec - opened

jonesaid on Reddit found a way to unleash the powers of FLUX.1-Dev by reducing sigmas to 0.95:

https://www.reddit.com/r/comfyui/comments/1g9wfbq/simple_way_to_increase_detail_in_flux_and_remove/

Apparently, the schedulers were removing too much noise at every step of generating the image, and that was causing all the problems we see on Flux, so reducing sigmas to 0.95 makes the outputs look like some sort of fantasized Flux 2! Is there a way to make something like that work over here?


Yntec changed discussion title from Setting signmas to 0.95? to Setting sigmas to 0.95?

I've scoured the docs and I can't figure out a way to make this happen outside of ComfyUI lol. The serverless inference docs are still lacking in some areas and I haven't come across a working example on the hub... I imagine it's possible with a diffusers pipeline. It looks great though; have you got it running on any space?

P.S. Sorry for the late response, I suck at responding to threads!

Nymbo changed discussion status to closed
Nymbo changed discussion status to open

Thanks for looking into this.

have you got it running on any space?

No, not even ZeroGPU spaces have done it! Perhaps it's impossible!

Anyway, things have changed since then with the release of Stable Diffusion 3.5 Medium, because it achieves this level of detail and composition, so we just need something like SDXL's refiner running on the inference API... The idea would be to let SD3.5 Medium run the prompt for some steps, and then switch to a refiner running Flux.1 Dev to finish the picture, with some sort of adapter similar to the one used by Loras, and then it should be able to look even better than this! Of course, currently this is even more impossible, but a man can dream! Imagine: with this tech we could run a model for some steps and then allow another one to finish the pic, so one model could fix any issues and refine the details of another! This sounds incredible in my head.
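For reference, the closest mechanism that already exists is the SDXL base + refiner handoff in diffusers, where one model denoises the first part of the schedule and a second one finishes it. A cross-family handoff like SD3.5 Medium into Flux isn't supported anywhere as far as I know, but a minimal sketch of the existing SDXL version shows the pattern:

```python
# A sketch of the existing two-stage handoff in diffusers (SDXL base +
# refiner). Mixing SD3.5 and Flux this way is NOT supported; this only
# illustrates the mechanism being imagined above.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a cinematic photo of a lighthouse in a storm"
# The base model denoises only the first 80% of the schedule and hands
# its latents off...
latents = base(prompt, denoising_end=0.8, output_type="latent").images
# ...and the refiner picks up at the same point to finish the image.
image = refiner(prompt, denoising_start=0.8, image=latents).images[0]
image.save("two_stage.png")
```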

Let me, huh... summon @john6666 into the thread; last time I did so we got seeds, who knows what we can get now. The models wouldn't be using more steps or resources: as long as they load on the serverless API, the first model would run for less time and the second for fewer steps, so if the pipeline could support this we'd get infinite possibilities for free! I don't know who made the Lora adapter, but this looks like the next... step, heh.

I've been summoned. 😎
Well, if the goal is to use it with Diffusers, I think Diffusers itself needs to support it. Otherwise, you'd need a lot of custom source code just to call a single model...
On the other hand, the pipeline source code is public, so in theory it should be possible to manually rewrite and replace it. In theory. Nyanko7 and others do it sometimes.
Since sigma changes are beneficial and inexpensive, I think it would be quicker to raise an issue on github, but I'm a github beginner too.

Edit:
Already there?
https://github.com/huggingface/diffusers/issues/9924
https://github.com/huggingface/diffusers/issues/9971
If we wait, it will become possible automatically with a version upgrade.
I think you could then specify the sigmas for these schedulers.

Edit:
Oh, maybe not if I read the manual again?
It's a bit iffy whether or not it can be specified.
Maybe we need a bit more detail about the ComfyUI sampler and scheduler that was actually used in the Reddit post.
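If the pipeline-level option from those issues does land, usage might look roughly like this. A speculative sketch; the `sigmas` argument name is taken from the issue discussion, not from a released API:

```python
# Speculative sketch: assumes a diffusers version where the Flux pipeline
# accepts a custom `sigmas` list, as proposed in issues #9924 / #9971.
import numpy as np
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

steps = 28
# Flux's default schedule in diffusers is a linear ramp from 1.0 down to
# 1/steps; scaling the whole ramp by 0.95 mirrors the Reddit trick.
sigmas = (np.linspace(1.0, 1 / steps, steps) * 0.95).tolist()

image = pipe(
    "a portrait photo with an intricate background",
    num_inference_steps=steps,
    sigmas=sigmas,
).images[0]
```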

Wow! A lot of things are happening behind the scenes! So if I'm understanding correctly, after these changes are implemented we'd need to clone the Flux.1 Dev repo, change its scheduler file to one that can reduce sigmas, set that to 0.95 there, and then the serverless API of our clone would work like that? If so, that feels like a classic; I remember digiplay would make identical copies of repos but with schedulers changed to give different results, but this time we could have a model with better outputs than Black Forest Labs's original repo for Flux!

That's right. If it's implemented in Diffusers, Python writers can change the behavior of the model by simply passing options to the scheduler class instance on initialization, and people uploading models can change the behavior of the model by editing scheduler_config.json in Notepad.
The only problem is that we still need to check whether the ongoing changes described above will produce the same results as those on Reddit. If the scheduler algorithm is completely different, we may need to add a scheduler itself. That would be a big job for the Diffusers team, mainly in terms of debugging and testing the operation.
I think there is probably something similar...
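To make those two routes concrete, here is a sketch that uses `shift`, an option Flux's flow-match scheduler already exposes, purely as a stand-in for whatever sigma option eventually lands; the pattern would be the same:

```python
# Route 1 (Python): pass options when rebuilding the scheduler instance.
# `shift` is an existing option, used here only as a stand-in for a
# hypothetical future sigma-scaling option.
import torch
from diffusers import FluxPipeline, FlowMatchEulerDiscreteScheduler

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipe.scheduler.config, shift=3.0
)

# Route 2 (no Python at all): change the same option in the model repo's
# scheduler/scheduler_config.json, e.g. "shift": 3.0, and anything that
# loads the repo picks the new behavior up automatically.
```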

Perhaps this is easy information to find... but I'm feeling curious, since Flux.1 Dev is the subject of all of this (I haven't used it yet, and I'm not sure I'll test it, by the way), but I keep seeing "sigma". What are sigmas? Thanks in advance for the answer!

@FashionStash Hey! You can test it here: https://huggingface.co/spaces/black-forest-labs/FLUX.1-dev - since you are used to creative and artful models it's probably going to be a disappointment, so INSTEAD of using your own prompts, I recommend searching for an image on Google Images, or using one that you already have, and uploading it here: https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-one - after running that you will get a very long prompt that is paragraphs long. Copy and paste that into Flux, and then you will see why it's the most advanced model, capable of drawing pretty much anything as long as you specify it in the prompt in a way it understands. For instance, in SD1.5, saying someone is "mean" will get you the expression you want; here you'd need to write a paragraph detailing how the eyebrows, eyes and mouth should look to get it, but in exchange you can create advanced facial expressions unavailable in previous models.

Sigmas control how much noise is removed from the picture at each step. The reason Flux's outputs aren't very detailed, and often have a blurry background, is that the sigmas are removing way too much noise, so removing only 95% of what it would normally remove keeps detail in and produces much crisper backgrounds. There's a GUI called ComfyUI that people with the hardware use to run Flux, and people have created addons for it, including ones to modify CFG so it follows your prompts better (unlocking many styles it has that won't show up normally) or to reduce sigmas, but we currently have no way to do that in the diffusers version that huggingface uses for inference, so without the hardware we are out of luck.
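For anyone who does have the hardware and runs diffusers locally, a rough stand-in for the ComfyUI trick is to intercept the scheduler and shrink the sigma ramp before the schedule is built. This is a fragile hack of my own that pokes at internals, and since the scheduler applies its own shifting afterwards, the result may not exactly match the ComfyUI node:

```python
# Fragile sketch: hook the scheduler's set_timesteps and scale the sigma
# list the Flux pipeline passes in by 0.95, so each step removes about 5%
# less noise. Internals-dependent; not an official diffusers feature.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

original_set_timesteps = pipe.scheduler.set_timesteps

def scaled_set_timesteps(*args, **kwargs):
    # The Flux pipeline hands its default sigma ramp to the scheduler as
    # a keyword argument; shrink every value before the schedule is built.
    if kwargs.get("sigmas") is not None:
        kwargs["sigmas"] = [s * 0.95 for s in kwargs["sigmas"]]
    return original_set_timesteps(*args, **kwargs)

pipe.scheduler.set_timesteps = scaled_set_timesteps

image = pipe("a forest at dusk, highly detailed").images[0]
image.save("flux_scaled_sigmas.png")
```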

What I realized yesterday is that we are at the bleeding edge of this, and people are implementing features in real time as we speak, so we may just need to be patient. A bit of a problem, though, is that we can't talk with the people doing it directly because we don't have github accounts, and they prefer to discuss there instead of on huggingface, so, heh, we may need to send a spy over there to inquire about all this; it seems it'd be easier to design a working rocket!

Things are changing too fast...

news from discord about flux:

sayakpaul - Today 18:46
Apart from Control LoRA Flux (which we're actively working on), we have shipped support for the new Flux tools. 

Docs: https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux
NF4 checkpoints: https://huggingface.co/collections/sayakpaul/flux-tools-in-nf4-6742ede9e736bfbb3ab4f4e1

Make sure you install diffusers from source and pass the right revision while loading the checkpoints as the original repos are yet to merge diffusers weights. 

https://discord.com/channels/879548962464493619/1014557141132132392/1310179757228294215

Yntec, I remember that you might have some trauma with 2FA on github, but would you guys like to join github or Discord?
From what I can see, there are a lot of coders and staff, but there is a lack of feedback and mutual support from the perspective of users and artists.
So, library developers don't notice bugs and inconveniences that are important for practical use. It's at a level where even coders are complaining. (In reality, mainly about Gradio.)

I'm willing to join github if that would solve things, surely, though I wouldn't know what to say, or where to say it; I'd not have said anything else if it wasn't for Nymbo's reply from yesterday! My problem is not knowing what to do; that's why, back when I had a fully working github account, I didn't do much with it other than starting projects I never finished, which I now feel shame about. I don't even check the huggingface forums! Maybe you asked me a question last time we talked and I never went back to check that thread. I engage on here because I get a golden circle around my avatar when there's a notification. It's like these are different countries we have to visit, and I have only found toxic communities on discord, where they ban you if you make any comment about having some phobia, so I'd rather have someone else extract the information from there.

I love the idea of having a Discord, considering the fractured nature of communicating on HF. I just threw this server together now, it's called SupaHuggers (name subject to change, I'm not married to it) - https://discord.gg/E59k3gkZyd

I'm also in the official HF discord but the idea with SupaHuggers would be a small server for a few very regular users to collab, vent about gradio 5.0, etc.

I'd love you guys to join to get it started @John6666 @Yntec

Huh, so now I'm glad I wrote a book about why I'm against discord in principle, so I don't have to repeat myself: https://huggingface.co/Yntec/Dreamlike/discussions/3#6614ee5e06ee61ff24dacddb - but in short, I deem it "closed source": keeping things private and unsearchable, as a secret so only "members of the club" can benefit, just like keeping the recipe of a model unpublished or posting an AI generated image without the prompt. I want all the discussions I'm involved with to remain open and accessible to everyone, and it makes me glad there's no private message system over here, so you can check every single thing I've told john6666 and everything he has told me, for instance, and nothing is concealed (I don't use emails either!)

I guess publishing all the discord's messages publicly every month would solve it; then again, when it comes down to it, I turn out to be a difficult person to deal with, heh. But imagine if Black Forest Labs was like this and we could see all their conversations about how they made Flux: everyone could benefit and we could create a new version. Instead, Flux Pro was never published, akin to being buried in a discord server.

I see. You don't like the way Discord is.
I also think that the fact that you can't search Discord content on Google makes it particularly difficult to use from an OSS perspective. On the other hand, it's easy to have private conversations there, so it sits somewhere between anonymous message boards and HF, leaning towards HF...
For example, if you discuss solutions here, on the HF forum, on github, or on anonymous message boards, that becomes a resource people can find on Google, the same way as StackOverflow, but that's not the case with Discord.

I'm not familiar with the github culture either. I think it's been less than a month since I started using it...
However, if you think there's a bug or something inconvenient, you can just write an issue, and if you can fix it yourself, you can just open a PR, I think. I still don't really understand how to use Discussions.
Well, if we consult on HF and figure out where the problem is, wouldn't it be fine if one of us just writes up an issue?
