HiDream: The Unfiltered Image Generation You Need


HiDream, another banger after Flux, is developed by Vivago AI and is making waves for good reason. It's a powerful, open-source text-to-image diffusion model with 17 billion parameters, offering top-tier image quality and prompt adherence that rival paid subscription models.

It's licensed under MIT, making it free for anyone to use and modify.


HiDream showcase image generation

Currently, three variants are available: Full (best quality), Dev (balanced), and Fast (for lower VRAM). The officially released model weights need at least 60 GB of VRAM, but quantized variants are available that let it run on PCs with 16 GB of VRAM.

For 60 GB VRAM users:

(a) HiDream Full: for high-quality generation without compromise

(b) HiDream Dev: for development and testing

(c) HiDream Fast: for faster generation


Installation


New users need to install ComfyUI first. Existing users should update it from the Manager section, or simply run "git pull" inside the ComfyUI folder to get the latest version, as shown below.
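A minimal update sketch, assuming ComfyUI was originally installed by cloning its Git repository:

cd ComfyUI
git pull

Portable-build users should use the updater scripts bundled with the release instead of git.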

Type A: Native Support

1. Setting up HiDream in ComfyUI requires several components: the text encoders, a VAE file, and the diffusion model itself. These models require more than 16 GB of VRAM. Let's break down the process step by step.

HiDream Dev Version

Download hidream_i1_dev_bf16.safetensors and place it in your "ComfyUI/models/diffusion_models" directory.

This "dev" version is slightly lighter on resources, making it a good option if you are concerned about VRAM usage.


HiDream Full Version

Download hidream_i1_full_fp16.safetensors and save it in your "ComfyUI/models/diffusion_models" directory.

The full version offers the complete HiDream experience but will require more VRAM and processing power.

HiDream Fast Version

Download hidream_i1_fast_fp8.safetensors and put it in your "ComfyUI/models/diffusion_models" directory.
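If you prefer the command line over clicking download links, the Hugging Face CLI can fetch a model file directly. A minimal sketch for the Dev variant; the repo ID and file path here are assumptions, so confirm them on the actual repository page first:

# repo ID and file path are assumptions - verify them on Hugging Face
huggingface-cli download Comfy-Org/HiDream-I1_ComfyUI split_files/diffusion_models/hidream_i1_dev_bf16.safetensors --local-dir ComfyUI/models

# the CLI preserves the repo's sub-path, so move the file into place
mv ComfyUI/models/split_files/diffusion_models/hidream_i1_dev_bf16.safetensors ComfyUI/models/diffusion_models/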

2. Now, you need to download the four text encoder files given below and place them in your "ComfyUI/models/text_encoders" folder:

(a) clip_l_hidream.safetensors

(b) clip_g_hidream.safetensors

(c) t5xxl_fp8_e4m3fn_scaled.safetensors

(d) llama_3.1_8b_instruct_fp8_scaled.safetensors

You can find all these files in the HiDream text encoders repository. If you're already working with other advanced models, you might already have the T5XXL encoder, in which case you don't need to download it again.

3. Next, you'll need to download the VAE file (Variational Autoencoder), which should be placed in your "ComfyUI/models/vae" folder.

This is the Flux VAE, so if you have been working with other recent models, you might already have this file in your setup.

4. Restart ComfyUI and refresh it.
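Once everything is in place, your models folder should look roughly like this (the VAE filename is an assumption; yours may differ if you reuse an existing Flux VAE):

ComfyUI/models/
├── diffusion_models/
│   └── hidream_i1_dev_bf16.safetensors   (or the Full/Fast file)
├── text_encoders/
│   ├── clip_l_hidream.safetensors
│   ├── clip_g_hidream.safetensors
│   ├── t5xxl_fp8_e4m3fn_scaled.safetensors
│   └── llama_3.1_8b_instruct_fp8_scaled.safetensors
└── vae/
    └── ae.safetensors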


Type B: GGUF Variant

1. Install the GGUF custom nodes if you have not done so yet (see the sketch below). Users already running Flux GGUF or SD3.5 GGUF do not need to reinstall it; just update the custom node from the Manager section.
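A minimal install sketch, assuming the widely used city96 ComfyUI-GGUF custom node (installing it from the Manager works just as well):

cd ComfyUI/custom_nodes
git clone https://github.com/city96/ComfyUI-GGUF
pip install -r ComfyUI-GGUF/requirements.txt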

2. Download the HiDream model from the City96 Hugging Face repository.

(a) GGUF Full variant

(b) GGUF Dev variant

(c) GGUF Fast variant

Here, models are provided from Q2 (lowest VRAM, lower-quality generation) to Q8 (higher VRAM, high-quality generation). Choose the one that suits your system hardware. After downloading, save the file inside the "ComfyUI/models/diffusion_models" folder.

3. The other model files (VAE, text encoders) can be downloaded from the Comfy-Org Hugging Face repository. Save the text encoders into the "ComfyUI/models/text_encoders" folder and the VAE into the "ComfyUI/models/vae" folder.
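As with the native files, the Hugging Face CLI can grab a single quant directly into the right folder. A minimal sketch for a mid-range Dev quant; the repo ID and exact filename are assumptions, so check the file list on the repository page first:

# repo ID and filename are assumptions - pick the quant that fits your VRAM
huggingface-cli download city96/HiDream-I1-Dev-gguf hidream-i1-dev-Q4_K_M.gguf --local-dir ComfyUI/models/diffusion_models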


Type C: Quantized NF4 Bit


1. Move inside your "ComfyUI/custom_nodes" folder and clone the repository by typing the following command in a command prompt:

git clone https://github.com/lum3on/comfyui_HiDream-Sampler

2. Install the required dependencies. Move into the ComfyUI folder, open a command prompt, and run:

For normal ComfyUI users (run from the ComfyUI folder):
pip install -r custom_nodes/comfyui_HiDream-Sampler/requirements.txt

For portable ComfyUI users (run from the portable root folder, the one containing "python_embeded"):
.\python_embeded\python.exe -m pip install -r .\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\requirements.txt


3. Download the model weights from their Hugging Face repository. Currently, three different model weights are available, listed below.


For users with a minimum of 16 GB VRAM:

Choose whichever suits your use case and hardware. Download all the files and maintain the same folder structure as defined in the respective repository.

For instance, if you want to use the Dev variant, just download all the files (models, text encoders, etc.) manually from its repository and store the files and directories/sub-directories exactly as laid out in the repository.

Tip: If you do not want to download all the files manually, use the Hugging Face CLI, which auto-downloads the required files with the sub-directory structure intact, as sketched below. We already explained how to set up the Hugging Face CLI and clone models locally or in the cloud in a detailed step-by-step tutorial.
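A minimal sketch of a full-repository download; <repo-id> is a placeholder for the repository of the variant you chose, and the CLI mirrors the repo's directory layout into the target folder:

# <repo-id> is a placeholder - use the repo of the variant you picked
huggingface-cli download <repo-id> --local-dir ./HiDream-I1-nf4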



download HiDream full version



(a) HiDream Full (4-bit quantized): for high-quality generation without compromise.


download HiDream dev version


(b) HiDream Dev (4-bit quantized): for balanced use cases.

download HiDream fast version

(c) HiDream Fast (4-bit quantized): for faster generation on lower-VRAM setups.


4. Also download and set up Triton and SageAttention for faster generation, as we already explained in our HunyuanVideo and WAN video tutorials. A minimal install sketch follows below.
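A minimal sketch, assuming a Linux environment with a matching CUDA/PyTorch stack (Windows users typically need a prebuilt Triton wheel instead):

pip install triton
pip install sageattention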

5. Restart ComfyUI and refresh it.

Workflow

1. After installing the HiDream custom nodes, you will find the workflow inside the "custom_nodes/comfyui_HiDream-Sampler/sample_workflow" folder.

For native support, once you have all the files in place, you can load or drag in the workflow from our Hugging Face repository.

For the GGUF workflow, grab the same workflow from our Hugging Face repo and just replace the "Load Diffusion Model" node with the "Unet Loader (GGUF)" node.

2. Drag and drop into ComfyUI.

3. Input your text prompts, select the model version and text encoder, and generate images. The model typically uses the following settings:

Variant   Steps   Shift   CFG   Sampler   Scheduler
Full      50      3.0     5     LCM       Normal
Dev       28      6.0     1     LCM       Normal
Fast      16      3.0     1     LCM       Normal

Now, let's test it with different styles.

Human-centric

Human-centric generation


Prompt: Portrait of an american supermodel looking straight at the viewer, wearing pastel bold geometric patterns, high fashion outfit, artistic background wall, minimalist portrait, high tonal range, lean towards wall, photoshoot in sunny weather, shining funky shades , 8k



Anime style



anime style image generation

Prompt: Vibrant anime portrait of a young woman with messy black hair, large adorable eyes, wearing funky sunglasses. The lighting is dramatic , casting warm. The background is dark with subtle glowing particles and soft plant silhouettes, evoking an ethereal and otherworldly atmosphere, stylized and highly detailed


Textual Writing

textual writing test

Prompt: A vibrant graffiti mural on a brick wall in an urban alley, spray paint art with bold colors and dynamic shapes, the word "STABLE DIFFUSION" in stylized letters, grungy texture, street photography style


Note that these generations are not cherry-picked; we used what we got on the first attempt. Surprisingly, the results are quite satisfactory.

One drawback, though, is that the model uses a lot of VRAM even with the FP8 or 4-bit quantized models.