AnimateDiff Installation (ComfyUI/Automatic1111)

AnimateDiff is a powerful text-to-video model that is quickly becoming popular, and the community is already producing impressive videos with it.

Here, we walk through the installation and a workflow, including the detailed settings that improve your generations. This workflow uses Stable Diffusion 1.5 as the checkpoint. For Stable Diffusion XL, follow our AnimateDiff SDXL tutorial.

Installing in ComfyUI:

1. Install ComfyUI on your machine. Open the ComfyUI Manager and click the "Install Custom Nodes" option.

Search for "animatediff" in the search box and install the one labeled "Kosinkadink". Then restart ComfyUI for the change to take effect.


Downloading models

2. Now download the three relevant motion models shown in the image above from the Hugging Face repository.

After downloading, save them inside the "ComfyUI_windows_portable/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models" folder.

You can also download the motion LoRA models, which add camera movements such as panning, zooming, tilting, and rolling. These are advanced features that give you extra control when editing and manipulating the video rendering process.

Save them inside the "ComfyUI_windows_portable/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models/MotionLoRA" folder.


Download VAE models

3. Download the SD VAE model (the vae-ft-mse-840000-ema-pruned.safetensors file) from the Hugging Face repository, put it into the "ComfyUI_windows_portable/ComfyUI/models/vae" folder, and restart ComfyUI.
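
Before restarting, it can help to confirm that the downloaded file landed in the expected folder. The following is only a minimal sketch and not part of ComfyUI: the helper name `missing_files` is hypothetical, and the root path assumes the portable Windows layout used in this guide was unpacked into the current directory.

```python
from pathlib import Path


def missing_files(root, expected):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: ComfyUI_windows_portable sits in the current directory.
    expected = ["ComfyUI/models/vae/vae-ft-mse-840000-ema-pruned.safetensors"]
    for rel in missing_files("ComfyUI_windows_portable", expected):
        print("missing:", rel)
```

The same helper can be pointed at the motion model folder by adding those file names to the `expected` list.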


4. Now download the basic workflows listed in our Hugging Face repository. You can try any of the workflows with multiple frames. We are using the basic Text2Img workflow.


5. Download a Stable Diffusion 1.5 checkpoint. For illustration, we are using the Dreamshaper model (fine-tuned on SD1.5), but you can choose your favorite. The basic recommended settings for Dreamshaper are as follows:

  • Primitive Node: Setting it to "randomize" gives you a different random result each time you generate.
  • Empty Latent Image Node: Sets the dimensions of the generated output. Don't go beyond the dimensions recommended for Stable Diffusion 1.5 (SD1.5); 512 or 768 is a good size to test with.
  • Batch Size: Sets the number of frames to generate. It is constrained by the motion model you are using. Here, we have selected the "mm_sd_v14" motion model in the AnimateDiff Loader node, which allows a maximum of 24 frames, but 15-20 gives much better results.
  • KSampler Node: Steps = 30, CFG = 5 or 6, Sampler = DPMPP_2M_SDE, Scheduler = Karras.
  • AnimateDiff Combine Node: Frame Rate = 8 or 10 (max 16).
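
The limits in the list above can be written down as a quick sanity check to run before queuing a generation. This is an illustrative sketch only: the dictionary keys and the `check_settings` helper are hypothetical names rather than ComfyUI identifiers, and the limits are simply the ones stated above for the "mm_sd_v14" motion model.

```python
# Values recommended above for Dreamshaper with the "mm_sd_v14" motion model.
SETTINGS = {
    "width": 512,
    "height": 512,
    "batch_size": 16,  # number of frames
    "steps": 30,
    "cfg": 6,
    "sampler": "DPMPP_2M_SDE",
    "scheduler": "Karras",
    "frame_rate": 8,
}

MAX_FRAMES = 24      # limit stated for mm_sd_v14
MAX_FRAME_RATE = 16  # limit stated for the AnimateDiff Combine node
MAX_SIDE = 768       # largest test size suggested for SD1.5


def check_settings(s):
    """Return a list of problems; an empty list means the settings fit."""
    problems = []
    if not 1 <= s["batch_size"] <= MAX_FRAMES:
        problems.append(f"batch_size must be 1-{MAX_FRAMES}")
    if not 1 <= s["frame_rate"] <= MAX_FRAME_RATE:
        problems.append(f"frame_rate must be 1-{MAX_FRAME_RATE}")
    if max(s["width"], s["height"]) > MAX_SIDE:
        problems.append(f"width and height should not exceed {MAX_SIDE}")
    return problems
```

Calling `check_settings(SETTINGS)` with the values above returns an empty list, while a batch size of 32 would be flagged.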

Now, just enter your positive and negative prompts. You can also try our Stable Diffusion Prompt Generator to get ideas. Click "Queue" to start the generation. The generation time depends on your workflow settings and machine.

6. After the generation, you will find the resulting frames (as a GIF image) inside the "ComfyUI_windows_portable/ComfyUI/output" folder. The accompanying PNG image stores the workflow settings as metadata.

If you want to work with that workflow again, simply drag and drop the image into ComfyUI and the workflow loads instantly.
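
To see what is stored in the PNG without opening ComfyUI, the metadata can be read with a few lines of standard-library Python. This sketch rests on an assumption worth verifying against your own output files: that the workflow JSON sits in an uncompressed PNG `tEXt` chunk under the keyword `workflow`. The function names are hypothetical.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return {keyword: text} for every tEXt chunk in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 8 + length + 4
    return chunks


def load_workflow(path):
    """Return the embedded workflow as a dict, or None if it is absent."""
    with open(path, "rb") as f:
        text = read_png_text_chunks(f.read()).get("workflow")
    return json.loads(text) if text is not None else None
```

If `load_workflow` returns `None`, the image either was not produced by ComfyUI or stores its metadata differently from what this sketch assumes.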

To get a video in MP4 format, load the generated GIF into any video editing tool and convert it. An online converter is another option.
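
One possible local alternative is the ffmpeg command-line tool, driven here from Python. This is a sketch under assumptions: ffmpeg is not part of this guide's setup and must already be installed and on your PATH, and the function names and file names are hypothetical.

```python
import subprocess


def gif_to_mp4_command(gif_path, mp4_path):
    """Build the ffmpeg argument list for a GIF to MP4 conversion."""
    return [
        "ffmpeg",
        "-i", gif_path,
        # yuv420p output needs even width and height, so round both down.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-pix_fmt", "yuv420p",      # pixel format most players accept
        "-movflags", "+faststart",  # move the index to the front for streaming
        mp4_path,
    ]


def convert(gif_path, mp4_path):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(gif_to_mp4_command(gif_path, mp4_path), check=True)
```

For example, `convert("animation.gif", "animation.mp4")` would write the MP4 next to the GIF in the current directory.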

To get the best results, you will need several attempts, tweaking the settings as you go. For upscaling, you can use the Upscale node (which takes more time) or any third-party editing tool.


Installing in Automatic1111:

1. First, you should install Automatic1111 on your machine.

2. Here, we are using ToonYou (fine-tuned on Stable Diffusion 1.5) from CivitAI as the supporting checkpoint.

Make sure the checkpoint you use is a model fine-tuned on Stable Diffusion 1.5. You can choose your favorite one. Save it inside the "stable-diffusion-webui/models/Stable-diffusion" folder.

3. Back in Automatic1111, head over to the "Extensions" tab, click "Available", then click "Load from".

Search for "animatediff" in the search box; an extension named "sd-webui-animatediff" will appear. Click the "Install" button to start the installation.

Once it is installed, click "Apply and restart UI" to get it to work.

4. Next, download the AnimateDiff models from the Hugging Face repository. Save them inside the "stable-diffusion-webui/extensions/sd-webui-animatediff/model" folder.

5. Back in Automatic1111, a new "AnimateDiff" tab appears above the ControlNet tab. Click the tab to open it.

Put your positive and negative prompts into the prompt box. Apply the settings recommended for your model checkpoint; you can find these on the model's page on CivitAI. For the ToonYou model, we used these recommended settings:

  • VAE: included starting with Alpha2 (840000)
  • Clip skip: 2, CFG scale: 8
  • Sampler: DPM++ SDE Karras
  • Sampling Steps: 30+
  • Upscaler (Hires. fix): R-ESRGAN 4x+ Anime6B, Hires steps: 14, Denoising strength: 0.35, Upscale by: 1.5+
  • Default prompts: Pos: (best quality, masterpiece), Neg: (worst quality, low quality, letterboxed)
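
If you prefer scripting over clicking, the settings listed above map onto the request body of the web UI's txt2img API. Treat this as a hedged sketch: it assumes the web UI was started with the `--api` flag and that the payload is posted to `/sdapi/v1/txt2img`, field names can differ between web UI versions (newer releases may expect the sampler and the Karras schedule as separate fields), and the AnimateDiff extension's own parameters are deliberately left out. The helper name is hypothetical.

```python
import json


def build_txt2img_payload(prompt, negative_prompt):
    """Build a txt2img request body from the ToonYou settings listed above."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "DPM++ SDE Karras",
        "steps": 30,
        "cfg_scale": 8,
        # Hires. fix settings
        "enable_hr": True,
        "hr_upscaler": "R-ESRGAN 4x+ Anime6B",
        "hr_second_pass_steps": 14,
        "denoising_strength": 0.35,
        "hr_scale": 1.5,
        # Clip skip is applied as a per-request settings override.
        "override_settings": {"CLIP_stop_at_last_layers": 2},
    }


if __name__ == "__main__":
    payload = build_txt2img_payload(
        "(best quality, masterpiece)",
        "(worst quality, low quality, letterboxed)",
    )
    print(json.dumps(payload, indent=2))
```

Printing the payload first makes it easy to compare each value against the model page before sending any request.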

6. To animate the result, navigate to the "AnimateDiff" tab and enable it. Set the number of frames to around 16-18 and the FPS to 8-10 (go higher for a longer generation), and tick the check box to save as GIF. Finally, click "Generate".

To generate smoother animation, use a higher number of frames and a higher FPS, at the cost of longer rendering time. The results are generated as a GIF image, so to get an MP4 video you can use any conversion tool.
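
The relationship between frames, FPS, and clip length is simple arithmetic: playback time is the number of frames divided by the FPS. The short sketch below (with a hypothetical helper name) makes the trade-off concrete; for instance, 16 frames at 8 FPS play for 2 seconds.

```python
def clip_seconds(frames, fps):
    """Playback length of an animation in seconds."""
    if frames <= 0 or fps <= 0:
        raise ValueError("frames and fps must be positive")
    return frames / fps


if __name__ == "__main__":
    for frames, fps in [(16, 8), (18, 10), (24, 8)]:
        print(f"{frames} frames at {fps} FPS: {clip_seconds(frames, fps):.1f} s")
```

Raising only the FPS makes motion smoother but shortens the clip, so the frame count has to grow with it to keep the same duration.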



Conclusion:

AnimateDiff is a text-to-video model that can be used in Stable Diffusion WebUIs to create stunning AI-generated videos.