Traditional diffusion pipelines rely on a collection of separate mechanisms for image modification, such as ControlNet, IP-Adapter, inpainting, face detection, pose estimation, and cropping. OmniGen, released by VectorSpaceLab, bundles these capabilities into a single model.
It accepts arbitrary multi-modal instructions, much as you would prompt ChatGPT with natural language. Interested readers can refer to the research paper for an in-depth understanding. It also comes with fine-tuning capability, which is great news for the community. The team is working on a more optimized model, which is expected to be available from their Hugging Face repository.
Installation
1. Install ComfyUI if you have not installed it yet.
2. Open ComfyUI Manager, select "Custom Nodes Manager", search for "ComfyUI-Omnigen" by author "1038lab", and click the Install button.
Alternative:
You can also install the node manually. Navigate to your "ComfyUI/custom_nodes" directory.
Open a command prompt by typing "cmd" in the folder's address bar, then clone the repository using the command below:
git clone https://github.com/1038lab/ComfyUI-OmniGen.git
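If you reinstall often, the manual clone can be wrapped in a small guard so it does not run twice. The following is only a sketch for a POSIX shell (for example Git Bash on Windows), not something shipped with the node pack: the function name install_omnigen_node and its single argument (the path to your ComfyUI folder) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of ComfyUI or ComfyUI-OmniGen):
# clone the node into custom_nodes unless it is already there.
install_omnigen_node() {
    comfy_root="$1"                          # path to your ComfyUI folder
    nodes_dir="$comfy_root/custom_nodes"
    target="$nodes_dir/ComfyUI-OmniGen"

    # Refuse to continue if this does not look like a ComfyUI install.
    if [ ! -d "$nodes_dir" ]; then
        echo "custom_nodes not found under: $comfy_root" >&2
        return 1
    fi

    # Skip the clone when the node folder already exists.
    if [ -d "$target" ]; then
        echo "already installed: $target"
        return 0
    fi

    git clone https://github.com/1038lab/ComfyUI-OmniGen.git "$target"
}

# Usage (adjust the path to your setup):
#   install_omnigen_node /path/to/ComfyUI
```

The guard only checks that the folder exists; it does not verify that an earlier clone completed successfully.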
3. The required model is downloaded automatically in the background the first time you run the basic workflow. You can monitor the download progress in the ComfyUI terminal.
Alternative:
Manually download the model (model.safetensors, 15.5GB) from the Hugging Face repository.
After downloading, rename it to something recognizable, such as "omnigen-all-in-one.safetensors", then save it inside the "ComfyUI/models/LLM/OmniGen-v1" folder.
4. Restart ComfyUI for the changes to take effect.
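For the manual model download in step 3, the rename-and-move can be done in one go from a POSIX shell. This is a minimal sketch under stated assumptions: place_omnigen_model is a hypothetical helper name, the first argument is wherever your browser saved model.safetensors, and the second is your ComfyUI folder. The target folder and file name are the ones mentioned in step 3.

```shell
#!/bin/sh
# Hypothetical helper: move a manually downloaded model.safetensors into
# ComfyUI/models/LLM/OmniGen-v1 under the name suggested in step 3.
place_omnigen_model() {
    src="$1"                                  # downloaded model file
    comfy_root="$2"                           # path to your ComfyUI folder
    dest_dir="$comfy_root/models/LLM/OmniGen-v1"
    dest="$dest_dir/omnigen-all-in-one.safetensors"

    if [ ! -f "$src" ]; then
        echo "model file not found: $src" >&2
        return 1
    fi

    # Create the target folder if it does not exist yet, then move the file.
    mkdir -p "$dest_dir" || return 1
    mv "$src" "$dest" || return 1

    # Print the final location so it is easy to verify.
    echo "$dest"
}

# Usage (adjust both paths to your setup):
#   place_omnigen_model ~/Downloads/model.safetensors /path/to/ComfyUI
```

Since the file is about 15.5GB, moving it within the same drive is much faster than copying it; if the download folder is on a different drive, expect the move to take a while.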
Workflow
OmniGen image 1 node
OmniGen image 2 node
OmniGen image combined node