Kolors is a text-to-image diffusion model developed by the Kuaishou Kolors Team, which is also behind the KlingAI project. The model has been trained on billions of text-image pairs.
Source: Kolors Hugging Face Repository
The model supports both English and Chinese prompts and is released under the Apache 2.0 license. It can be used for research and educational purposes; for commercial use, you need to contact their team.
For in-depth information, you can refer to their research paper.
Installation:
1. Install ComfyUI on your local machine.
2. Click "Update All" from the ComfyUI Manager to update ComfyUI.
3. There are two methods to install the custom nodes.
(a) Automatic Method:
Open ComfyUI Manager, search for "ComfyUI-KwaiKolorsWrapper" by author "Kijai", and click the install button to install the custom nodes.
Then restart ComfyUI for the changes to take effect.
(b) Manual method:
Navigate into the "ComfyUI/custom_nodes" folder. Click the folder address bar, type "cmd", and press Enter to open the command prompt in that folder.
Clone the repository by pasting the following command into the command prompt:
git clone https://github.com/kijai/ComfyUI-KwaiKolorsWrapper.git
4. Now, install the requirements. For a regular ComfyUI installation, run this command from inside the cloned "ComfyUI-KwaiKolorsWrapper" folder:
pip install -r requirements.txt
For ComfyUI portable users, run this command from the root folder of the portable installation (the one that contains "python_embeded"):
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-KwaiKolorsWrapper\requirements.txt
The Kolors model (fp16, 16.5 GB) and ChatGLM3 are downloaded automatically into "ComfyUI/models/diffusers/Kolors".
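If you want to double-check that everything landed where the steps above say it should, a small script can look for the folders. This is only a minimal sketch: the function name `check_kolors_install` and the "ComfyUI" root path are our own placeholders, and the folder names are simply the ones mentioned in this guide.

```python
from pathlib import Path

# Folder names taken from the installation steps above,
# relative to the ComfyUI root folder.
WRAPPER_DIR = Path("custom_nodes") / "ComfyUI-KwaiKolorsWrapper"
MODEL_DIR = Path("models") / "diffusers" / "Kolors"


def check_kolors_install(comfy_root):
    """Report which of the expected folders and files exist under comfy_root."""
    root = Path(comfy_root)
    return {
        "wrapper_cloned": (root / WRAPPER_DIR).is_dir(),
        "requirements_file": (root / WRAPPER_DIR / "requirements.txt").is_file(),
        "model_downloaded": (root / MODEL_DIR).is_dir(),
    }


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at your own installation folder.
    for name, found in check_kolors_install("ComfyUI").items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Since the model is downloaded automatically, a "missing" result for the model folder is expected until you have run a workflow at least once.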
Workflow Explanation:
1. Get the workflow from your installation folder by navigating to "ComfyUI-KwaiKolorsWrapper/examples". It includes both workflows:
(a) text-to-image
(b) image-to-image
Just drag and drop the workflow into ComfyUI.
2. Recommended settings we used:
Sampling method: Euler
Steps: 25
CFG: 5
Resolution: 1024 x 1024
3. Load the Kolors model into the "Kolor model" node.
Choose the appropriate options in the "ChatGLM3" node.
Set the text encoder to the FP16 model if you have at least 13 GB of VRAM, use quant8 if you have 8-9 GB, and use quant4 on lower-end cards with around 4 GB of VRAM.
Enter your prompt in the "Kolors Text Encode" node.
4. Click "Queue" to generate images. On the first run, it will take some time to download the dependencies in the background.
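The settings from steps 2 and 3 can also be written down as plain data, which helps if you want to note them alongside a workflow. This is a rough sketch, not part of the custom nodes: the names `RECOMMENDED_SETTINGS` and `pick_text_encoder_precision` are made up for illustration, and treating everything from 8 GB up to just under 13 GB as quant8 is our own assumption, since the guide only mentions 13 GB, 8-9 GB, and 4 GB.

```python
# Recommended settings from step 2 of this guide.
RECOMMENDED_SETTINGS = {
    "sampler": "Euler",
    "steps": 25,
    "cfg": 5,
    "width": 1024,
    "height": 1024,
}


def pick_text_encoder_precision(vram_gb):
    """Map available VRAM (in GB) to a text encoder option from step 3.

    Assumption: anything from 8 GB up to just under 13 GB uses quant8,
    and anything below 8 GB falls back to quant4.
    """
    if vram_gb >= 13:
        return "fp16"
    if vram_gb >= 8:
        return "quant8"
    return "quant4"


if __name__ == "__main__":
    for vram in (24, 12, 8, 4):
        print(f"{vram} GB VRAM -> {pick_text_encoder_precision(vram)}")
```

If generation runs out of memory with the option this suggests, dropping down one level (fp16 to quant8, or quant8 to quant4) is the first thing to try.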
Kolors Output1
Kolors Output2
Prompt used: 3d anime style, portrait photo of a girl, nightlife, raining, uhd, 8k
The output is really impressive. The model seems to be quite good at understanding context: we only entered "raining" in the prompt, and the model added a raincoat to the image.