Mochi 1: Generate High-Quality, Consistent Videos

Mochi 1 text-to-video model

Genmo has released Mochi 1, an open-source text-to-video diffusion model. It has 10 billion parameters and is built on the novel Asymmetric Diffusion Transformer (AsymmDiT) architecture, which is also straightforward to fine-tune. The model is capable of generating high-fidelity output with strong prompt adherence.

The model is released under the Apache 2.0 license, which means it can be used for research, educational, and commercial purposes.

Currently it needs a minimum of four H100 GPUs, which is out of reach for most individuals, but Genmo is also inviting the community to release quantized models so that it becomes accessible to users with lower-end hardware.

It can be run in ComfyUI, where it consumes about 20 GB of VRAM during the VAE (Variational Auto-Encoder) decoding stage.


Installation

1. Install ComfyUI on your machine.

2. Navigate to the "ComfyUI/custom_nodes" folder and open a command prompt there (e.g. by typing "cmd" in the address bar on Windows). Clone the repository with the following command:

git clone https://github.com/kijai/ComfyUI-MochiWrapper.git

All the required models are downloaded automatically from Kijai's Hugging Face page the first time you run the workflow.

If you want to work with the raw model, you can access it directly from Genmo's Hugging Face page.

Keep in mind that the model weights are quite large, so be patient while they download. You can track the real-time progress in the terminal where ComfyUI is running in the background.

The models are saved to the "ComfyUI/models/diffusion_models/mochi" folder and the VAE to the "ComfyUI/models/vae/mochi" folder.
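If you prefer to download the weights manually and place them yourself, you can create that folder layout up front. A minimal sketch, assuming the default paths above and that your ComfyUI install sits in the current directory:

```shell
# Create the folders the Mochi wrapper expects (assumed defaults;
# adjust if your ComfyUI install lives elsewhere).
mkdir -p ComfyUI/models/diffusion_models/mochi
mkdir -p ComfyUI/models/vae/mochi

# Verify the layout
ls ComfyUI/models
```

Place the downloaded diffusion weights in the first folder and the VAE in the second, then restart ComfyUI so the loader nodes can pick them up.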


Workflow

1. You can find the example workflow in the "ComfyUI-MochiWrapper/examples" folder.

2. Drag and drop it into ComfyUI.

3. Write a detailed positive prompt for better results.

Mochi 1 generation


We do not have H100 stacks, but we tested the model on an RTX 4090. The video consistency was really impressive compared to CogVideoX, but the model is massive and eats a lot of your VRAM.
 
With torch compile support and GGUF Q8 quantization enabled, generation time drops considerably, to about 40 minutes for 200 steps. We hope there will be better quantization support in the future.

Apart from this, they are also going to add support for image-to-video.