Wan 2.1 Fun ControlNet is a cutting-edge AI model developed by Alibaba PAI, designed for video generation with instant style transfer. It builds upon the Wan 2.1 framework and introduces two powerful models: Fun Control and Fun Inpaint. These models enable precise motion control, style transfer across frames, and frame-by-frame enhancements, all while being optimized for local PCs.
The Fun Control model takes motion cues from reference videos (like Instagram/TikTok dance clips) and mimics them in your AI-generated content.
Meanwhile, the Inpaint model lets you modify existing video frames with pinpoint accuracy, making it perfect for refining details or adding creative touches.
This gives you precise motion control, consistent style transfer, and detailed customization, all at relatively low resource usage.
Installation
1. Set up the Wan model and workflow (the official ComfyUI setup) if you haven't done so yet.
2. Update ComfyUI from the Manager by selecting the "Update ComfyUI" option. If the update fails, run "git pull" from the command line inside your ComfyUI folder.
3. (a) Download the Wan 2.1 Fun 1.3B model (diffusion_pytorch_model.safetensors) from the Wan Fun Hugging Face repository. After downloading, save the model inside your "ComfyUI/models/diffusion_models" folder (a scripted download sketch is shown after this list).
The other required models (text encoder, VAE) are already present in ComfyUI if you are using Wan's Text-to-Video/Image-to-Video workflows, so you do not need to download them again.
You can also use the Wan 14B model for more consistent generation, with higher quality and less morphing, but make sure you have a higher-end GPU.
(b) Kijai's Wan setup is another good alternative. Use Kijai's basic Wan workflow and download the Wan Fun variant from Kijai's repository.
(c) You can also download and use City96's Wan Fun 14B GGUF variant for faster generation, but make sure you are using the GGUF setup explained in our Wan GGUF tutorial section.
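If you prefer to script the download from step 3(a), here is a minimal sketch using the huggingface_hub package. The repo ID is an assumption based on the Alibaba PAI Fun releases; verify the exact repository you are downloading from and adjust the local path to your ComfyUI installation.

```python
from huggingface_hub import hf_hub_download

# Assumed repo ID for the Wan 2.1 Fun 1.3B Control model; verify before use.
model_path = hf_hub_download(
    repo_id="alibaba-pai/Wan2.1-Fun-1.3B-Control",
    filename="diffusion_pytorch_model.safetensors",
    local_dir="ComfyUI/models/diffusion_models",  # adjust to your ComfyUI path
)
print("Saved to:", model_path)
```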
Workflow
The Wan Fun workflow (Wan2.1-fun-controlnet-workflow.json) can be downloaded from our Hugging Face repository.
Here, you can use depth map, line art, or DW Pose preprocessors to control your subject and its surroundings, but there are a few things you need to know.
With line art you get better consistency for your generated subject, whereas with DW Pose you only control the body skeleton and the rest stays flexible and creative. Which one to use depends on your use case.
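Whichever you pick, the preprocessor simply turns each reference frame into a control map (a line-art drawing, a depth map, or a pose skeleton) that the Fun Control model follows. ComfyUI provides these as preprocessor nodes; the sketch below shows the same idea standalone with the controlnet_aux Python package on a single extracted frame (the filenames are just examples, and DW Pose needs extra ONNX model setup, so it is omitted here).

```python
from PIL import Image
from controlnet_aux import LineartDetector, MidasDetector

# Load one frame extracted from the reference video (example filename).
frame = Image.open("frame_0001.png").convert("RGB")

# Line art keeps the subject's outlines, giving tighter consistency.
lineart = LineartDetector.from_pretrained("lllyasviel/Annotators")
lineart(frame).save("control_lineart_0001.png")

# A depth map constrains the overall layout instead of fine outlines.
depth = MidasDetector.from_pretrained("lllyasviel/Annotators")
depth(frame).save("control_depth_0001.png")
```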
(a) Upload your reference video into the Load Video node. If you do not have a high-end GPU, use a short clip (a trimming sketch is shown after these steps).
(b) Upload the target image you want to animate and set the image resolution.
(c) Set up the Wan Fun Control and KSampler settings:
CFG: around 5.0-6.0
Steps: 20
Sampler: euler or dpmpp_2m (DPM++ 2M)
(d) Start generation by clicking the Run button.
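As noted in step (a), long reference clips can exhaust VRAM on mid-range GPUs. Below is a minimal sketch, assuming the imageio and imageio-ffmpeg packages and example filenames, that trims a reference video to a fixed number of frames before you load it into ComfyUI.

```python
import imageio.v2 as imageio

MAX_FRAMES = 81  # Wan works with short clips; 81 frames is a common length, adjust as needed

reader = imageio.get_reader("reference_dance.mp4")   # example input file
fps = reader.get_meta_data()["fps"]
writer = imageio.get_writer("reference_short.mp4", fps=fps)

for i, frame in enumerate(reader):
    if i >= MAX_FRAMES:
        break
    writer.append_data(frame)

writer.close()
reader.close()
```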