SAM2: Track Objects in Image and Video (ComfyUI)

Install and use Segment Anything Model 2 (SAM2) in ComfyUI

Tracking objects with precision in images and videos is one of the most challenging tasks in computer vision. SAM2 (Segment Anything Model 2) is an open-source model released by Meta AI under the Apache 2.0 license. Compared with the original SAM, it delivers more accurate object segmentation in both images and videos.

Source: MetaAI

The model is trained on roughly 51,000 real-world videos and 600,000 masklets (spatio-temporal masks). You can find more in-depth details in the officially released paper.

This is a helpful technique when you want to do image alteration, real-time video editing, face swaps, masking, building robust computer vision systems, and more. Let's see how to install and work with it in ComfyUI.


Installation:

1. If you are new to ComfyUI, complete the ComfyUI installation setup first.

2. Install the custom nodes by Kijai. Go to the "ComfyUI/custom_nodes" folder and open a command prompt there (type "cmd" in the folder's address bar).

Clone the repository using the following command:

git clone https://github.com/kijai/ComfyUI-segment-anything-2.git

 


Download any of the models from the Hugging Face repository. There are multiple variants to choose from: Tiny, Small, Base, and Large. Save the chosen model inside the "ComfyUI/models/sam2" folder, creating the "sam2" folder if it does not exist.
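If you prefer to script the download, here is a minimal Python sketch using the huggingface_hub package (pip install huggingface_hub). The repository id and file name below are assumptions, so check the actual Hugging Face page for the exact names:

from huggingface_hub import hf_hub_download

# Assumed repo id and file name -- verify them on the Hugging Face page.
# local_dir is the folder the SAM2 loader node scans for models.
hf_hub_download(
    repo_id="Kijai/sam2-safetensors",
    filename="sam2_hiera_base_plus.safetensors",  # pick Tiny/Small/Base/Large
    local_dir="ComfyUI/models/sam2",
)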

Alternative:


Navigate to the ComfyUI Manager and select "Custom Nodes Manager".


Search for the "Segment Anything 2" custom nodes by Kijai and install them. All the models will be downloaded automatically when you run the workflow for the first time.

Remember that this takes some time, as the relevant models are downloaded in the background. You can monitor the progress in the command prompt running behind ComfyUI.

3. Restart ComfyUI for the changes to take effect.


Workflow:

1. Get the workflow from your "ComfyUI-segment-anything-2/examples" folder. Alternatively, you can download it from the GitHub repository. These are the example workflows you get:

(a) florence_segment_2 - This supports detecting individual objects and bounding boxes in a single image with the Florence2 model (a rough sketch of this pipeline follows below).

(b) image_batch_bbox_segment - This is helpful for generating masks from bounding boxes across image batches with the single-image segmentor.

(c) points_segment_video - This segments and tracks objects in videos from point prompts, using positive points to select an object and negative points to refine it.

Choose the one you want to work with and drag and drop it into ComfyUI.
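To give a sense of what the florence_segment_2 workflow does under the hood, the sketch below runs Florence2 object detection with the Hugging Face transformers library; the resulting bounding boxes are what get handed to SAM2 as box prompts. The model id and file names here are illustrative, not the node's exact internals:

from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# "microsoft/Florence-2-base" is the public checkpoint; the ComfyUI node
# may load a different variant.
model_id = "microsoft/Florence-2-base"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("input.jpg")
task = "<OD>"  # Florence2's object-detection task token
inputs = processor(text=task, images=image, return_tensors="pt")
ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
)
raw = processor.batch_decode(ids, skip_special_tokens=False)[0]
parsed = processor.post_process_generation(raw, task=task, image_size=image.size)
print(parsed[task]["bboxes"], parsed[task]["labels"])  # boxes feed SAM2 as prompts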

Workflow 1: With Florence2

Workflow 2: Selecting a specific object from the entire subject

2. Load your image/video into the Load node.

3. Load the relevant SAM2 model in the SAM2 model loader node.

4. Do the segmentation by selecting objects. Object indices start at 0 for the first selection and count up from there, as the sketch below illustrates.
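Under the hood, every selection is a point prompt tied to an object index. Here is a minimal sketch using Meta's official sam2 Python package, conceptually what the node does when you click an object; the checkpoint path, config name, and coordinates are placeholders:

import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Placeholder paths -- point these at the model files you downloaded.
checkpoint = "checkpoints/sam2_hiera_large.pt"
model_cfg = "sam2_hiera_l.yaml"
predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

image = np.array(Image.open("input.jpg").convert("RGB"))

with torch.inference_mode():
    predictor.set_image(image)
    # One positive click (label 1) on the object; label 0 marks a negative point.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[512, 384]]),  # (x, y) pixel coordinates
        point_labels=np.array([1]),
    )

print(masks.shape, scores)  # candidate masks with confidence scores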

Now, you may be wondering why we even need this. It is a great approach when you are working on inpainting or face swaps without losing precision. It also helps with mask creation in video, which supports other workflows such as adding VFX to AI-generated videos.
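For video, the same prompts are propagated across frames, which is how those per-frame masks are produced. A rough sketch with the official video predictor (the method names follow the example in Meta's sam2 repository; the frame directory, config name, and coordinates are placeholders):

import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Placeholder config name and checkpoint path.
predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt")

with torch.inference_mode():
    state = predictor.init_state(video_path="frames/")  # directory of JPEG frames
    # Click the object once on the first frame; obj_id tags it for tracking.
    predictor.add_new_points_or_box(
        state,
        frame_idx=0,
        obj_id=0,
        points=np.array([[512, 384]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),
    )
    # SAM2 propagates the mask through all remaining frames.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        pass  # e.g. threshold mask_logits > 0 and save a mask per frame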


Some Limitations:

1. Loses track of objects in challenging scenarios (viewpoint changes, occlusions, crowded scenes, extended videos)

2. Confuses similar-looking objects in crowded scenes

3. Decreased efficiency when segmenting multiple objects simultaneously

4. Misses fine details in fast-moving objects

5. Lacks temporal smoothness in predictions

6. Masklet quality still has to be verified by human annotators

7. Frames that need correction must be selected manually