Installing Stable Diffusion 3.5 Locally


So, it's finally here. StabilityAI released Stable Diffusion 3.5 on October 22nd, 2024. After heavy community backlash over Stable Diffusion 3, they are back with an improved version.

And yes, this is an uncensored model. That said, steps have been taken to prevent misuse by bad actors. You can find more details in their research papers.

Three model variants are openly available:

Stable Diffusion 3.5 Large (Base Model)
  Parameters: 8 billion
  Resolution range: 1 megapixel
  Key features:
  • Most powerful model in the SD family
  • Superior image quality and prompt adherence
  • Supports professional-grade outputs
  Ideal use cases: Professional-grade, high-quality image generation at 1 megapixel resolution

Stable Diffusion 3.5 Large Turbo (Distilled Model)
  Parameters: 8 billion
  Resolution range: 1 megapixel
  Key features:
  • Distilled version of 3.5 Large
  • Faster generation in just 4 steps
  • Maintains high-quality images with exceptional prompt adherence
  Ideal use cases: Faster image generation with high quality for time-sensitive professional projects

Stable Diffusion 3.5 Medium
  Parameters: 2.5 billion
  Resolution range: 0.25 – 2 megapixels
  Key features:
  • Designed for consumer hardware
  • Features improved MMDiT-X architecture
  • Balanced quality with ease of customization
  Ideal use cases: Consumer-grade projects with flexibility and support for resolutions up to 2 megapixels
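
To make the resolution ranges above concrete, here is a minimal Python sketch that converts a width and height into megapixels and checks it against each variant's range. The `Variant` class, the `in_range` helper, and the 10% tolerance are our own illustrative assumptions (the tolerance is there because 1024 by 1024 works out to roughly 1.05 megapixels, slightly above a literal 1.0); they are not part of any StabilityAI tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One model variant, with values copied from the comparison above."""
    name: str
    parameters_billion: float
    min_megapixels: float
    max_megapixels: float


LARGE = Variant("Stable Diffusion 3.5 Large", 8.0, 1.0, 1.0)
LARGE_TURBO = Variant("Stable Diffusion 3.5 Large Turbo", 8.0, 1.0, 1.0)
MEDIUM = Variant("Stable Diffusion 3.5 Medium", 2.5, 0.25, 2.0)


def megapixels(width: int, height: int) -> float:
    """Pixel count expressed in millions of pixels."""
    return width * height / 1_000_000


def in_range(variant: Variant, width: int, height: int,
             tolerance: float = 0.10) -> bool:
    """Check a resolution against a variant's range.

    The tolerance is an assumption of this sketch, not an official limit.
    """
    mp = megapixels(width, height)
    low = variant.min_megapixels * (1 - tolerance)
    high = variant.max_megapixels * (1 + tolerance)
    return low <= mp <= high
```

For example, `in_range(LARGE, 1024, 1024)` passes, while `in_range(LARGE, 2048, 1024)` does not because it is roughly 2.1 megapixels.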

All the model weights are customizable and free to use for non-commercial purposes, and for commercial purposes up to $1M in annual revenue. The models are released under the StabilityAI Community License. Interested AI startups and enterprises can check the license page for details.

Now, let's dive into the installation process in ComfyUI. Currently, there is no update for Automatic1111, but the ForgeUI developers are working on one. You can follow the respective pull request if interested. We will update this guide as soon as support is officially available.

Apart from this, you can also stylize your images with the ControlNets released by StabilityAI.



Installation

If you are new to ComfyUI, you first have to install it on your machine. Existing users need to update ComfyUI by opening the Manager and selecting "Update ComfyUI" and "Update All". We have listed the various releases from the community below. Choose one according to your requirements and system configuration.

TYPE A: Stable Diffusion 3.5 by StabilityAI

Users who have a powerful GPU and just want to get their hands dirty with the raw models can use this official variant.

1. Download the model weights for the variant you want from StabilityAI's Hugging Face:

(a) Stable Diffusion 3.5 Large

(b) Stable Diffusion 3.5 Large Turbo

(c) Stable Diffusion 3.5 Medium


2. When downloading for the first time, you need to accept StabilityAI's license agreement to access the repository.

3. After downloading, save the weights inside the "ComfyUI/models/checkpoints" folder.

4. Now, download the CLIP models (clip_g.safetensors, clip_l.safetensors, and t5xxl_fp16.safetensors) from StabilityAI's Hugging Face and save them inside the "ComfyUI/models/clip" folder.

Stable Diffusion 3.5 uses the same CLIP models as Stable Diffusion 3, so you do not need to download them again if you are a Stable Diffusion 3 user. Simply check that the respective CLIP models are present in the required directory.

5. Restart ComfyUI for the changes to take effect.
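
If you want to double-check the file placement before restarting, a small script can do it for you. The following is a minimal Python sketch: it looks for at least one .safetensors file in ComfyUI's default "models/checkpoints" folder and for the three CLIP files named above in "models/clip". The function name `missing_sd35_files` and the `ComfyUI` root path in the usage note are hypothetical; adjust them to your own install.

```python
from pathlib import Path

# File names taken from the download step above.
CLIP_FILES = (
    "clip_g.safetensors",
    "clip_l.safetensors",
    "t5xxl_fp16.safetensors",
)


def missing_sd35_files(comfy_root: Path) -> list[str]:
    """Return the expected files that are not present under comfy_root."""
    missing = []

    # Any checkpoint counts here, since the exact file name depends on
    # which variant you downloaded.
    checkpoints = comfy_root / "models" / "checkpoints"
    has_checkpoint = checkpoints.is_dir() and any(
        checkpoints.glob("*.safetensors")
    )
    if not has_checkpoint:
        missing.append("models/checkpoints/*.safetensors")

    clip_dir = comfy_root / "models" / "clip"
    for name in CLIP_FILES:
        if not (clip_dir / name).is_file():
            missing.append(f"models/clip/{name}")

    return missing
```

Calling `missing_sd35_files(Path("ComfyUI"))` returns an empty list when everything is in place, and otherwise lists what still needs to be downloaded.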


TYPE B: Stable Diffusion 3.5 GGUF Quantized version

The quantized variants produce good image quality without reducing the output resolution. They are a good fit for Macs with M1/M2 chips.

They use less GPU memory and render in less time. The GGUF loader runs on the GPU to make better use of your VRAM, and the T5 text encoder can be used in quantized form to lower VRAM usage further.

1. Update ComfyUI from Manager by clicking on "Update All".

2. Go to the "ComfyUI/custom_nodes" folder. In the folder's address bar, type "cmd" to open a command prompt there.

GGUF Flux users who already have this repository installed do not need to install it again; just update it. Go to the "ComfyUI/custom_nodes/ComfyUI-GGUF" folder, open a command prompt, and type "git pull". Then download the SD 3.5 quantized model weights as described in step 5 and save them to the respective folder.

3. Otherwise, clone the repository by copying the command below and pasting it into the command prompt:

git clone https://github.com/city96/ComfyUI-GGUF.git

4. Portable users should instead go to the "ComfyUI_windows_portable" folder. In the folder's address bar, type "cmd" to open a command prompt there.

Use these commands to clone the repository and install the dependencies:

git clone https://github.com/city96/ComfyUI-GGUF ComfyUI/custom_nodes/ComfyUI-GGUF
.\python_embeded\python.exe -s -m pip install -r .\ComfyUI\custom_nodes\ComfyUI-GGUF\requirements.txt

5. There are multiple models listed in the respective repositories. Download any one of the pre-quantized models below:

(a) Stable Diffusion 3.5 Large GGUF 

(b) Stable Diffusion 3.5 Large Turbo GGUF 

(c) Stable Diffusion 3.5 Medium GGUF 

Save them to the "ComfyUI/models/unet" directory. The CLIP models are already handled by the CLIP loader, so you do not need to download anything extra for them.

However, if you want, you can download GGUF text encoder models (t5_v1.1-xxl GGUF) from Hugging Face and save them into the "ComfyUI/models/clip" folder.

6. Then restart and refresh ComfyUI for the changes to take effect.
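
The install-or-update decision and the model placement described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper names `gguf_node_command` and `list_gguf_models` are hypothetical, and the sketch only builds the git command rather than running it, so you stay in control of what gets executed.

```python
from pathlib import Path

# Repository URL as given in the clone step above.
GGUF_REPO_URL = "https://github.com/city96/ComfyUI-GGUF.git"


def gguf_node_command(comfy_root: Path) -> tuple[list[str], Path]:
    """Return (command, working_directory) to install or update ComfyUI-GGUF.

    If the custom node folder already exists (for example, for GGUF Flux
    users), updating with "git pull" is enough; otherwise it is cloned.
    """
    custom_nodes = comfy_root / "custom_nodes"
    node_dir = custom_nodes / "ComfyUI-GGUF"
    if node_dir.is_dir():
        return ["git", "pull"], node_dir
    return ["git", "clone", GGUF_REPO_URL], custom_nodes


def list_gguf_models(comfy_root: Path) -> list[str]:
    """List the .gguf files found in ComfyUI/models/unet, sorted by name."""
    unet_dir = comfy_root / "models" / "unet"
    if not unet_dir.is_dir():
        return []
    return sorted(p.name for p in unet_dir.glob("*.gguf"))
```

To actually run the returned command, you could pass it to `subprocess.run(command, cwd=working_directory, check=True)`. An empty result from `list_gguf_models` means no quantized model has been saved to the unet folder yet.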


Workflow

1. You can download all the respective workflows from StabilityAI's Hugging Face.

2. Drag and drop any of the workflows to ComfyUI.

For the GGUF model variants: all the configurations stay the same. You just need to replace the "Load Diffusion Model" node with the "UNet Loader (GGUF)" node, then connect it to the "Model Sampling SD3" node and the "Basic scheduler" node.


3. Load the relevant model checkpoint in the loader node.

4. Set the relevant configuration in the KSampler node.

Recommended settings by StabilityAI:

For Stable Diffusion 3.5 Large:
CFG: 4.5
Steps: 28-40
Sampler: Euler
Scheduler: SGM Uniform

For Stable Diffusion 3.5 Large Turbo:
CFG: 1
Steps: 4
Sampler: Euler
Scheduler: SGM Uniform
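
If you script your workflows, the recommended values above can be kept in one place. Here is a minimal Python sketch that stores them in a lookup table and clamps a requested step count to the recommended range. The `ksampler_settings` helper is hypothetical, and the key names and the lowercase "euler" and "sgm_uniform" identifiers are our assumption of how the KSampler node labels these options; verify them against your own ComfyUI install.

```python
from typing import Optional

# Recommended values copied from the list above.
RECOMMENDED = {
    "large": {"cfg": 4.5, "min_steps": 28, "max_steps": 40},
    "large_turbo": {"cfg": 1.0, "min_steps": 4, "max_steps": 4},
}


def ksampler_settings(variant: str, steps: Optional[int] = None) -> dict:
    """Build KSampler-style settings for a variant.

    When steps is omitted, the lower bound of the recommended range is
    used; otherwise the value is clamped into that range.
    """
    if variant not in RECOMMENDED:
        raise ValueError(f"no recommended settings for {variant!r}")
    rec = RECOMMENDED[variant]
    if steps is None:
        steps = rec["min_steps"]
    steps = max(rec["min_steps"], min(rec["max_steps"], steps))
    return {
        "cfg": rec["cfg"],
        "steps": steps,
        "sampler_name": "euler",      # assumed ComfyUI identifier
        "scheduler": "sgm_uniform",   # assumed ComfyUI identifier
    }
```

For instance, `ksampler_settings("large", 100)` is clamped to 40 steps, while `ksampler_settings("large_turbo")` always yields 4 steps at CFG 1.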



We have tested the official Stable Diffusion 3.5 Large with the recommended settings.

Generated using Stable Diffusion 3.5 Large


Here is the result. It is not cherry-picked; this is the first image we generated. Not bad, but the dress looks a little unrealistic.

Prompt used: 
portrait , high quality creative photoshoot of a model ,  black dress, bluish  hair, red lipstick, wearing funky sunglasses,  Vogue style,  fashion photoshoot, professional makeup, uhd

CFG: 4.5
Resolution: 1024 by 1024
Steps: 28

We used an NVIDIA RTX 4090, and the generation took 22 seconds.

You can get more prompt ideas from our image prompt generator, which is specifically designed for generating images with Stable Diffusion models.

The model understands natural-language prompts through its CLIP-based text encoders. So you can use any LLM, such as the Tipo LLM extension or GPT-4-based LLMs, to turn your ideas into natural, human-like prompts.

Let's try another one: a haunting movie scene, with a different configuration.


Generated using Stable Diffusion 3.5 Large


Prompt used: 
a woman, stands on the roof of a rundown trailer, character with a haunting presence due to her posture, standing in a long dress, gaze locked forward, under ominous weather, realistic, uhd

CFG: 4.5
Resolution: 1024 by 1024
Steps: 40

Here, the result looks quite satisfactory, and the prompt adherence worked really well. In our experience, detailed prompts render quite decent results.

Now, let's try images with human fingers and see how the model performs.

Generated using Stable Diffusion 3.5 Large (three attempts)


Prompt used: a beautiful girl with her fingers on her face, hyper realistic, 8k, ultra detailed
CFG: 4.5
Resolution: 1024 by 1024
Steps: 40

Yes, we tried three times, but the results are still a little weird and the deformation is still there. Not so impressive, but better than Stable Diffusion 3.


This time, let's test Stable Diffusion 3.5 Medium, which is capable of running on low VRAM.

Generated using Stable Diffusion 3.5 Medium

Prompt used: An astronaut floating in space, surrounded by pink flora and planets, a detailed illustration, retro futuristic, children's book illustration style, close-up intensity, hyper-realistic details, a blue sky on a bright day, wide-angle, full-body shot, and bold lines in a pop art style, flat pastel colors
CFG: 4.5
Resolution: 1024 by 1024
Steps: 40


Another try, this time with text in the image.

Generated using Stable Diffusion 3.5 Medium

Prompt used: an image of a woman standing in the middle of busy road holding a sign board with text: "SD 3.5 Medium"
CFG: 4.5
Resolution: 1024 by 1024
Steps: 40

Too many hands. The text is not perfect, and there are lots of artifacts.

Now, many people will argue that Stable Diffusion 3.5 is not as great as Flux. Actually, it depends on who you are and how you use the model.

Well, technically, that is incorrect. You need to take into consideration that, unlike Black Forest Labs, StabilityAI released the base variant of Stable Diffusion 3.5, whereas Flux Pro, the Flux base model, is inaccessible to the open community.

So the community has a wide opportunity to improve Stable Diffusion 3.5, but not Flux, for which we only have the distilled variants (Dev and Schnell).

Conclusion

Not only StabilityAI but also emerging startups like Black Forest Labs and others have joined the AI race. While some AI companies prefer to keep their work private in pursuit of profits, the increasing diversity of players in the market is breaking down monopolistic tendencies.

This shift from concentrated control among a few top companies to a more competitive landscape ultimately benefits end users. The rise of new startups showcasing their innovative AI models creates a win-win situation for the community, driving both innovation and accessibility.