Furkan Gözükara

MonsterMMORPG

AI & ML interests

Check out my YouTube channel SECourses for Stable Diffusion tutorials. They will help you tremendously with every topic

Posts

Hunyuan3D-1 - SOTA Open Source Text-to-3D and Image-to-3D - 1-Click Install and use both Locally on Windows and on Cloud - RunPod and Massed Compute

Automatic Installers
Works amazingly well on 24 GB GPUs
Files > https://www.patreon.com/posts/115412205

So what is Hunyuan3D-1?
Official repo : https://github.com/tencent/Hunyuan3D-1
On Hugging Face : tencent/Hunyuan3D-1

Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation

Abstract

While 3D generative models have greatly improved artists' workflows, the existing diffusion models for 3D generation suffer from slow generation and poor generalization. To address this issue, we propose a two-stage approach named Hunyuan3D-1.0, including a lite version and a standard version, both of which support text- and image-conditioned generation.

In the first stage, we employ a multi-view diffusion model that efficiently generates multi-view RGB images in approximately 4 seconds. These multi-view images capture rich details of the 3D asset from different viewpoints, relaxing the task from single-view to multi-view reconstruction. In the second stage, we introduce a feed-forward reconstruction model that rapidly and faithfully reconstructs the 3D asset given the generated multi-view images in approximately 7 seconds. The reconstruction network learns to handle the noise and inconsistency introduced by the multi-view diffusion and leverages the available information from the condition image to efficiently recover the 3D structure.

Our framework involves the text-to-image model, i.e., Hunyuan-DiT, making it a unified framework to support both text- and image-conditioned 3D generation. Our standard version has 3x more parameters than our lite and other existing models. Our Hunyuan3D-1.0 achieves an impressive balance between speed and quality, significantly reducing generation time while maintaining the quality and diversity of the produced assets.
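To make that two-stage flow a bit more concrete, here is a minimal Python sketch of how the pieces fit together. All function names in it are hypothetical placeholders for illustration only, not the actual API of the Hunyuan3D-1 repo; see the official repo for the real scripts and entry points.

```python
# Minimal sketch of the Hunyuan3D-1.0 two-stage flow described above.
# All names here are hypothetical placeholders, NOT the repo's real API;
# see https://github.com/tencent/Hunyuan3D-1 for the actual scripts.

def hunyuan_dit_text_to_image(prompt: str):
    """Text conditioning: Hunyuan-DiT turns the prompt into a single condition image."""
    raise NotImplementedError("placeholder for Hunyuan-DiT text-to-image")

def generate_multiview_images(condition_image, num_views: int = 6):
    """Stage 1: multi-view diffusion renders the asset from several viewpoints (~4 s)."""
    raise NotImplementedError("placeholder for the multi-view diffusion model")

def reconstruct_mesh(multiview_images, condition_image):
    """Stage 2: feed-forward reconstruction turns the views into a 3D asset (~7 s)."""
    raise NotImplementedError("placeholder for the sparse-view reconstruction model")

def image_to_3d(condition_image):
    views = generate_multiview_images(condition_image)
    return reconstruct_mesh(views, condition_image)

def text_to_3d(prompt: str):
    # Text-to-3D reuses the same image-to-3D pipeline after Hunyuan-DiT.
    return image_to_3d(hunyuan_dit_text_to_image(prompt))
```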




OmniGen 1-Click Automatic Installers for Windows, RunPod and Massed Compute

OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is designed to be simple, flexible, and easy to use.

Installers are here : https://www.patreon.com/posts/omnigen-1-click-115233922

Look at the attached images to understand what capabilities it has. It is simply amazing, with so many features.

What is OmniGen : https://github.com/VectorSpaceLab/OmniGen
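Besides the Gradio app that the installers set up, the OmniGen repo also documents a Python API. The snippet below is a rough sketch based on the usage examples in that README (OmniGenPipeline, the Shitao/OmniGen-v1 checkpoint, and the <img><|image_1|></img> placeholder for referencing input images); parameter names may have changed since, so treat it as illustrative and check the repo for the current API.

```python
# Rough sketch of OmniGen's Python usage, based on the examples in the
# official repo README; parameter names may differ in newer versions.
from OmniGen import OmniGenPipeline

pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")

# Plain text-to-image
images = pipe(
    prompt="a photo of a red bicycle leaning against a brick wall",
    height=1024,
    width=1024,
    guidance_scale=2.5,
    seed=0,
)
images[0].save("text_to_image.png")

# Multi-modal prompt: the <img><|image_1|></img> token refers to input_images[0]
images = pipe(
    prompt="The woman in <img><|image_1|></img> waves her hand happily in the crowd",
    input_images=["./example_person.png"],  # illustrative path
    height=1024,
    width=1024,
    guidance_scale=2.5,
    img_guidance_scale=1.6,
    seed=0,
)
images[0].save("image_edit.png")
```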

Windows Requirements
Python 3.10.11, CUDA 12.4, Git, FFMPEG, cuDNN 9.x, C++ Tools

A tutorial that shows how to install all of the above : https://youtu.be/DrhUHnYfwC0
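If you want to double-check those requirements from Python, a quick sanity check like the one below can help. It assumes PyTorch is already installed in the environment (the installer handles that), so it is only an optional verification step, not part of the official instructions.

```python
# Optional sanity check of the requirements above; assumes PyTorch is installed.
import sys
import torch

print(sys.version)                      # expect Python 3.10.x
print(torch.version.cuda)               # expect a CUDA 12.x build
print(torch.cuda.is_available())        # expect True on a working GPU setup
print(torch.backends.cudnn.version())   # expect a cuDNN 9.x build number
```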

How To Install & Use
After installing the requirements by following the tutorial above, double-click Windows_Install.bat to install
After that, use Windows_Start.bat to start the app

When offload_model is enabled (checked) on the Gradio interface, it uses 5.4 GB VRAM but runs about 2x slower

When offload_model is not checked, it uses 12.2 GB VRAM

When neither separate_cfg_infer nor offload_model is checked, it uses 18.7 GB VRAM
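If you use the Python API instead of the Gradio app, the same trade-off should apply. The sketch below assumes the checkboxes map to same-named offload_model and separate_cfg_infer keyword arguments of the pipeline call; that is my reading of the repo and may not exactly match the version the installer sets up.

```python
# Low-VRAM call, assuming the Gradio checkboxes correspond to the
# offload_model / separate_cfg_infer keyword arguments of the pipeline.
from OmniGen import OmniGenPipeline

pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")

images = pipe(
    prompt="a cozy cabin in a snowy forest at night",
    height=1024,
    width=1024,
    guidance_scale=2.5,
    separate_cfg_infer=True,  # assumed: splits CFG passes to lower peak VRAM
    offload_model=True,       # assumed: offloads weights to CPU (~5.4 GB, ~2x slower)
    seed=0,
)
images[0].save("low_vram.png")
```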

To install on RunPod and Massed Compute please follow Massed_Compute_Instructions_READ.txt and Runpod_Instructions_READ.txt

Look closely at the examples on the Gradio interface to understand how to use it