GGUF Checkpoint

Distributes a UNet/Transformer module in the GGUF Q8 file format, but runs the model in regular fp16.
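
For intuition, here is a minimal sketch of what "stored as Q8, run as fp16" means, assuming the standard ggml Q8_0 block layout (one fp16 scale followed by 32 int8 quants per 32-weight block). The function below is illustrative only and is not part of the released node:

```python
import numpy as np

def dequantize_q8_0(raw: bytes) -> np.ndarray:
    """Expand ggml Q8_0 data to fp16.

    Q8_0 packs weights in blocks of 32: one fp16 scale `d`
    followed by 32 int8 quants; each weight is d * q.
    """
    BLOCK = 34  # 2 bytes (fp16 scale) + 32 bytes (int8 quants)
    blocks = np.frombuffer(raw, dtype=np.uint8).reshape(-1, BLOCK)
    d = blocks[:, :2].copy().view(np.float16).astype(np.float32)  # (n_blocks, 1)
    q = blocks[:, 2:].copy().view(np.int8)                        # (n_blocks, 32)
    return (d * q).astype(np.float16).ravel()
```

A real loader would reshape the flat result to each tensor's shape before assigning it to the module.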

It should work on any checkpoint that is supported by the LoadCheckpoint node.

CLIP, T5, and VAE are excluded, since people rarely retrain them. This release should help ease our bandwidth constraints, if people are willing to embrace the GGUF format.

It only supports Q8_0 and Q4_0. The uploaded custom node (for now) does not lower the VRAM requirements.
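
Q4_0 uses the same block idea at 4 bits per weight. A companion sketch, again assuming the standard ggml layout (one fp16 scale plus 16 bytes of packed nibbles per 32-weight block) and again purely illustrative:

```python
import numpy as np

def dequantize_q4_0(raw: bytes) -> np.ndarray:
    """Expand ggml Q4_0 data to fp16.

    Q4_0 packs 32 weights per block: one fp16 scale `d` plus
    16 bytes of packed 4-bit quants; each weight is d * (q - 8).
    """
    BLOCK = 18  # 2 bytes (fp16 scale) + 16 bytes (packed nibbles)
    blocks = np.frombuffer(raw, dtype=np.uint8).reshape(-1, BLOCK)
    d = blocks[:, :2].copy().view(np.float16).astype(np.float32)  # (n_blocks, 1)
    qs = blocks[:, 2:]                                            # (n_blocks, 16)
    lo = (qs & 0x0F).astype(np.int8) - 8  # weights 0..15 of each block
    hi = (qs >> 4).astype(np.int8) - 8    # weights 16..31 of each block
    q = np.concatenate([lo, hi], axis=1)  # (n_blocks, 32) in ggml order
    return (d * q).astype(np.float16).ravel()
```

Since both formats expand to fp16 before inference, the loaded weights occupy the same VRAM as a regular fp16 checkpoint, which is why the node does not reduce VRAM requirements.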

Minimum ComfyUI version: v0.0.8. There are no extra dependencies; tested on a fresh install.

Gerganov's gguf scripts have been updated; this is just the result. A numpy 2.0 error in GGUFReader has been fixed.
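
To verify that a downloaded file parses cleanly under numpy 2.0, you can list its tensors with the GGUFReader from the gguf package (pip install gguf). This is a quick check sketch; the filename is a placeholder, and attribute names follow recent versions of the package:

```python
from gguf import GGUFReader

# Placeholder path; point it at whichever .gguf file you downloaded.
reader = GGUFReader("models/unet/sd3_q8_0.gguf")
for t in reader.tensors:
    print(t.name, t.tensor_type, t.shape)
```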

The model weights belong to SAI.

Installation

I have uploaded an sd3_models folder for convenience; its contents go to the models folder.

  • Download a gguf file and move it to the models/unet folder (this step can also be scripted; see the sketch after this list).
  • Download ComfyUI-Unet-Gguf and move it to the custom_nodes folder.
  • Drag and drop workflow_gguf.json into ComfyUI to load the workflow.
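
If you prefer scripting the download step, a minimal sketch with huggingface_hub follows; the repo_id and filename are placeholders for this repository's actual values:

```python
from huggingface_hub import hf_hub_download

# Both repo_id and filename are placeholders; substitute the real
# values from this repository's file listing.
path = hf_hub_download(
    repo_id="your-user/sd3-gguf",
    filename="sd3_q8_0.gguf",
    local_dir="models/unet",
)
print(f"saved to {path}")
```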

Disclaimer

Please do not reupload the node; instead, link to the Hugging Face repo.

Use of this code and its documentation requires citation and attribution to the author, via a link to their Hugging Face profile, in all resulting work.
