ComfyUI: Beginner to Advanced Guide

ComfyUI full tutorial

Are you confused by other, more complicated Stable Diffusion WebUIs? No problem, try ComfyUI.

It is a node-based Stable Diffusion web user interface that helps AI artists generate incredible art. You can create multiple nodes for your workflow with a simple drag-and-drop technique.

The best part of ComfyUI is that you don't get the maze of settings found in Automatic1111, which can sometimes be confusing.


ComfyUI Basic Nodes:

We will cover, in detail and with examples, the basic nodes that are widely used by the community for art generation.



load checkpoints nodes

1. CheckpointLoader: This is one of the most common nodes. Add it by right-clicking on the canvas, then
Add Node > loaders > Load Checkpoint

load checkpoints nodes by search

Alternatively, you can do this using the search option: double left-click on the canvas, search for "checkpoint", and select the "Load Checkpoint" option provided.

This node takes a checkpoint as its input. The model checkpoints are stored inside the "ComfyUI_windows_portable\models\checkpoints" folder, so whenever you want to use any Stable Diffusion base model, you should always place it in that directory.
It has three outputs, i.e. "CLIP", "VAE", and "MODEL", which we discuss further below.
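As a quick check that ComfyUI will actually see your models, you can list the contents of that folder. A minimal Python sketch, assuming the default Windows portable layout mentioned above:

```python
# List the model files that the Load Checkpoint node can see.
# The path assumes the default ComfyUI Windows portable install mentioned above.
from pathlib import Path

checkpoints_dir = Path(r"ComfyUI_windows_portable\models\checkpoints")
for f in sorted(checkpoints_dir.glob("*")):
    if f.suffix in (".safetensors", ".ckpt"):
        print(f.name)  # these names appear in the node's ckpt_name dropdown
```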

deprecated nodes

Now, sometimes you will see a checkpoint marked "deprecated", which means it is no longer used by ComfyUI.


primitive nodes

2. Primitive Node: To create this node, right-click on the Checkpoint Loader > Convert ckpt_name to input > drag the new input out onto the canvas > Add Node > utils > Primitive

Basically, this node now drives the value of the checkpoint you have currently selected.
The "control after generate" option is used to change the checkpoint automatically each time you generate an image. However, it's experimental and not used in normal use cases.

Ksampler node

3. CheckpointLoader (MODEL output): Generally, this goes to the KSampler node. To connect it to the KSampler, just click the output, drag it onto the canvas area, and select "KSampler" from the dropdown menu that opens.

loading clip nodes as prompt

4. CheckpointLoader (CLIP output): CLIP (Contrastive Language-Image Pre-Training) connects to the CLIP Text Encode (Prompt) node, where you put your positive/negative prompt.

Actually, CLIP takes the positive/negative prompt and, using a tokenization technique, breaks it into multiple tokens, which are then converted into numbers (the encoded result is what ComfyUI calls conditioning), because machines cannot understand words and only process numbers.
For example, if you input "a man", it gets split into the tokens "a" and "man", and each token is mapped to a numeric ID. This is what tokenization is.
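To see tokenization in action, here is a minimal sketch using the CLIP tokenizer from the Hugging Face transformers library; the model name is just a public CLIP variant, not necessarily the one bundled in your checkpoint:

```python
# Show how a prompt is broken into tokens and mapped to numeric IDs.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
prompt = "a man"
tokens = tokenizer.tokenize(prompt)   # e.g. ['a</w>', 'man</w>']
ids = tokenizer(prompt)["input_ids"]  # numeric IDs, with start/end tokens added
print(tokens, ids)
```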


loading the vae node

5. CheckpointLoader (VAE output): The VAE (Variational Auto-Encoder) connects to the VAE Decode node.

Now, this is optional: you can also load these components as individual nodes by double left-clicking on the canvas and adding Load VAE, Load CLIP, and UNET Loader, which together cover what "Load Checkpoint" does in one node.

loading checkpoints models

So, whenever you want to load your desired Stable Diffusion model with the ".safetensors" or ".ckpt" extension, it needs to be loaded through the "Load Checkpoint" node. These models usually comprise a UNET, a CLIP, and a VAE.
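If you are curious what lives inside such a checkpoint, you can peek at its tensor names with the safetensors library. A minimal sketch, with a hypothetical file name:

```python
# Inspect a .safetensors checkpoint to see its UNET, CLIP and VAE tensor groups.
from collections import Counter
from safetensors import safe_open

with safe_open("sd_v1-5.safetensors", framework="pt") as f:
    prefixes = Counter(key.split(".")[0] for key in f.keys())
print(prefixes)
# SD 1.x checkpoints typically show groups such as 'model' (UNET),
# 'cond_stage_model' (CLIP) and 'first_stage_model' (VAE).
```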


safetensors file

This means the Load Checkpoint node is responsible for splitting out all three components (UNET, CLIP, VAE) for the workflow pipeline.

Let's say we have an image in pixel form, with RGB values such as (255, 255, 6) for each pixel. But here the situation is that machines can't understand images the way human eyes do.

percentage of loss in image generation

So, the image pixel values get converted into binary numbers. Now, in the machine learning world, when we train a model there is a difference between what we expect and what we actually get. A human can judge that difference by the image's pose, quality, etc., but the model measures it as a percentage of loss.
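As a rough illustration of such a loss value, here is a minimal sketch that measures the pixel-wise mean squared error between an expected image and a generated one (both arrays are just dummy data):

```python
# Mean squared error between an expected image and the model's output,
# expressed as a single "how far off are we" number.
import numpy as np

expected = np.random.rand(512, 512, 3)   # dummy target image, values in [0, 1]
generated = np.random.rand(512, 512, 3)  # dummy model output
mse = np.mean((expected - generated) ** 2)
print(f"loss: {mse:.4f}")
```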

Usually, this is the way any machine learning model is trained. However, for image models, working directly on pixels is not the optimal approach. So, Stability AI built on a solution called latent diffusion, where an image in pixel form gets converted into latent form, or back again, using the VAE (encoder/decoder).
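Here is a minimal sketch of that pixel-to-latent round trip using the diffusers library; the VAE repository name and the dummy input tensor are assumptions for illustration:

```python
# Encode an image into latent space and decode it back with a Stable Diffusion VAE.
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
image = torch.rand(1, 3, 512, 512) * 2 - 1            # dummy image scaled to [-1, 1]
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()  # pixel -> latent
    recon = vae.decode(latents).sample                # latent -> pixel
print(latents.shape)                                  # (1, 4, 64, 64): 8x smaller per side
print((recon - image).abs().mean())                   # the round trip is not lossless
```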

Ksampler node in workflow

Here, one thing we need to mention: in the process of converting from pixel to latent, or vice versa, a minute amount of image data is lost. So if, for example, you are working on a big project with lots of nodes connected to each other, you will lose more and more information across these repeated conversions.

So, you have to balance your actual requirements against how many conversions you add in order to get the optimal result.

basic comfyui workflow

6. KSampler: The KSampler has various inputs and a single output. It is responsible for the sampling process, which runs over a number of steps. At each step the image is denoised further, and each time we get a better image than the one before.
Now, let's discuss the options in the KSampler. Each one is explained in a simple, detailed manner, and a sketch of how these settings look inside a workflow follows the list:
  • Seed: It is the starting point from which the random noise for a particular generated image is produced. After the first generation, if you set its behaviour to fixed, the model will generate the same style of image; this is one technique for achieving consistency. Its default value is 0, the minimum value is 0, and the maximum value is 0xffffffffffffffff in hexadecimal, which is 18,446,744,073,709,551,615 in decimal. You must set your seed value within this range.
  • Steps: This refers to the number of inference steps, i.e., the number of steps the diffusion process takes to refine the intermediate latent image in latent space. At each step, noise is removed and the image gets closer to the final output. The default is 20, the minimum value is 1, and the maximum is 10000.
  • CFG: CFG (Classifier-Free Guidance scale) takes float (decimal) values. Its default value is 8.0, the minimum is 0.0, and the maximum is 100.0. It controls how strongly the diffusion model follows the prompt. If you push it to much higher values, you will get a very different looking image, so a value around 7-8 usually gives the desired results.
  • Sampler Name: The sampling algorithm used to denoise the latent image; the choice affects both quality and speed. Popular algorithms include DPM++, Euler, Euler a, etc.
  • Scheduler: The noise schedule the KSampler follows, i.e., how the noise level is spread across the sampling steps (for example "normal" or "karras").
  • Positive conditioning: The encoded positive prompt, i.e., what we want to see in the generated art.
  • Negative conditioning: The encoded negative prompt, i.e., what we do not want in the generated image.
  • Noise Scheduler: It generally controls how much noise the image should contain at each step.
  • Denoise factor: This is mainly used when you are working in an "image-to-image" workflow. It sets how much of the input image is changed versus kept during the latent round trip. A denoise of 1 means 100% of the image is regenerated and nothing of the original is kept, while 0.5 means 50% of the image is changed and 50% is kept as it is.
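As mentioned above, here is a minimal sketch of how these KSampler settings typically appear in a ComfyUI "API format" workflow JSON, written here as a Python dict. The node ID and the link values (such as ["4", 0]) are hypothetical placeholders.

```python
# Hypothetical KSampler entry from an API-format workflow JSON.
# Each link is [source_node_id, output_index]; the IDs here are made up.
ksampler_node = {
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "seed": 0,                 # 0 .. 0xffffffffffffffff
            "steps": 20,               # number of denoising steps
            "cfg": 8.0,                # classifier-free guidance scale
            "sampler_name": "euler",
            "scheduler": "normal",
            "denoise": 1.0,            # 1.0 for text-to-image, lower for image-to-image
            "model": ["4", 0],         # Load Checkpoint's MODEL output
            "positive": ["6", 0],      # positive CLIP Text Encode (conditioning)
            "negative": ["7", 0],      # negative CLIP Text Encode (conditioning)
            "latent_image": ["5", 0],  # e.g. an Empty Latent Image node
        },
    }
}
```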

7. KSampler Advanced: This node is used less often, for deeper workflows where we need to split the sampling steps between stages of the diffusion process. For example, with the SDXL 1.0 base model we also use the refiner model.

KSampler Advanced node

Here, we use the first KSampler Advanced with the SDXL 1.0 base model and the second one with the refiner model.
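Below is a minimal sketch, again written as API-format JSON in Python dicts, of how such a base/refiner split might look. The step split, link names, and node wiring are assumptions for illustration, not a definitive recipe.

```python
# Hypothetical sketch: two KSamplerAdvanced nodes sharing 25 total steps,
# the base model covering steps 0-20 and the refiner finishing 20-25.
# Link values like ["base_ckpt", 0] stand in for real node IDs.
base_sampler = {
    "class_type": "KSamplerAdvanced",
    "inputs": {
        "add_noise": "enable", "noise_seed": 0, "steps": 25, "cfg": 8.0,
        "sampler_name": "euler", "scheduler": "normal",
        "start_at_step": 0, "end_at_step": 20,
        "return_with_leftover_noise": "enable",   # hand the unfinished latent to the refiner
        "model": ["base_ckpt", 0], "positive": ["base_pos", 0],
        "negative": ["base_neg", 0], "latent_image": ["empty_latent", 0],
    },
}
refiner_sampler = {
    "class_type": "KSamplerAdvanced",
    "inputs": {
        "add_noise": "disable", "noise_seed": 0, "steps": 25, "cfg": 8.0,
        "sampler_name": "euler", "scheduler": "normal",
        "start_at_step": 20, "end_at_step": 25,
        "return_with_leftover_noise": "disable",  # finish denoising completely
        "model": ["refiner_ckpt", 0], "positive": ["refiner_pos", 0],
        "negative": ["refiner_neg", 0], "latent_image": ["base_sampler", 0],
    },
}
```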


Keep in mind that every diffusion model behaves differently depending on its sampler method, steps, CFG scale, positive/negative conditioning, etc.

So, you need to take this into consideration when downloading any diffusion model from Hugging Face or CivitAI. To gather the relevant information, check the model's description section so you can leverage its maximum potential.

We explained the basic workflow and nodes that are normally used to generate images in the text-to-image process. You can look further and download extra workflows from our Hugging Face repository where we share multiple workflows with different nodes enabled. 


ComfyUI Shortcuts:

Trust us, if you work deeply with your workflows, these shortcuts will work wonders. At first you may have trouble recalling them, but with regular use they prove to be a real game changer. All the shortcuts are listed below:

| Sl. No | Shortcut Key | Description |
|---|---|---|
| 1 | Ctrl + Enter | Queue up current graph for generation |
| 2 | Ctrl + Shift + Enter | Queue up current graph as first for generation |
| 3 | Ctrl + Z / Ctrl + Y | Undo/Redo |
| 4 | Ctrl + S | Save workflow |
| 5 | Ctrl + O | Load workflow |
| 6 | Alt + A | Select all nodes |
| 7 | Alt + C | Collapse/uncollapse selected nodes |
| 8 | Ctrl + M | Mute/unmute selected nodes |
| 9 | Ctrl + B | Bypass selected nodes (acts like the node was removed from the graph and the wires reconnected through) |
| 10 | Delete / Backspace | Delete selected nodes |
| 11 | Ctrl + Backspace | Delete the current graph |
| 12 | Space | Move the canvas around when held and moving the cursor |
| 13 | Ctrl/Shift + Click | Add clicked node to selection |
| 14 | Ctrl + C / Ctrl + V | Copy and paste selected nodes (without maintaining connections to outputs of unselected nodes) |
| 15 | Ctrl + C / Ctrl + Shift + V | Copy and paste selected nodes (maintaining connections from outputs of unselected nodes to inputs of pasted nodes) |
| 16 | Shift + Drag | Move multiple selected nodes at the same time |
| 17 | Alt + + (plus) | Canvas zoom in |
| 18 | Alt + - (minus) | Canvas zoom out |
| 19 | Ctrl + Shift + LMB + Vertical drag | Canvas zoom in/out |
| 20 | Q | Toggle visibility of the queue |
| 21 | H | Toggle visibility of history |
| 22 | R | Refresh graph |
| 23 | Double-Click LMB | Open node quick search palette |
| 24 | Ctrl + D | Load default graph workflow |

If you are a Mac user, then use Command in place of Ctrl key.


Important points:

1. While installing new nodes, if you get an error, first update all the nodes and ComfyUI itself from the Manager section.
2. Whenever you load a workflow in ComfyUI and get a red-colored "missing node" error, search for the specific node in the ComfyUI Manager and install it from the list.
3. If you want to download any node, there are basically two options:
   (a) The first is from the search bar of the ComfyUI Manager.
   (b) The second is cloning it from its Hugging Face/GitHub repository.
The latter is used when the node is new and the ComfyUI Manager hasn't updated its list yet.
4. You can embed the ComfyUI workflow metadata inside any generated image. Doing this helps you work with the workflow in a team-based, collaborative environment.
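For example, here is a minimal sketch of reading that embedded metadata back out of a generated PNG with Pillow; the file name is hypothetical, and ComfyUI typically stores the workflow JSON in the image's text chunks (commonly under the "workflow" and "prompt" keys):

```python
# Read the workflow metadata that ComfyUI embeds in a generated PNG.
import json
from PIL import Image

img = Image.open("ComfyUI_00001_.png")     # any image generated by ComfyUI
workflow_json = img.info.get("workflow")   # raw JSON string, if present
if workflow_json:
    workflow = json.loads(workflow_json)
    print(f"Workflow contains {len(workflow.get('nodes', []))} nodes")
else:
    print("No embedded workflow found in this image")
```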