Tracking objects precisely in images and videos is a challenging task. SAM2 (Segment Anything Model 2) is an open-source model released by Meta AI under the Apache 2.0 license. It segments objects in images and videos more accurately than the original SAM.
Source: Meta AI
The model is trained on ~51,000 real-world videos and ~600,000 masklets (spatio-temporal masks). You can learn more from the officially released paper.
Installation:
1. First, complete the ComfyUI installation setup if you are new to Comfy.
2. Install the custom nodes by Kijai. Navigate to the "ComfyUI/custom_nodes" folder and open a command prompt by typing "cmd" in the folder's address bar.
Clone the SAM2 repository using the following command:
git clone https://github.com/kijai/ComfyUI-segment-anything-2.git
Download any of the models from the Hugging Face repository. There are multiple variants to choose from: Tiny, Small, Base, and Large.
Save the models inside the "ComfyUI/models/sam2" folder, creating the "sam2" folder if it does not exist.
Alternative:
Navigate to the ComfyUI Manager and select "Custom Nodes Manager".
Search for the "Segment Anything 2" custom nodes by Kijai. All the required models will be downloaded automatically when you run the workflow for the first time.
Remember that this takes some time while the relevant models download in the background; you can check progress in the command prompt running behind ComfyUI.
3. Restart ComfyUI for the changes to take effect.
Workflow:
1. Get a workflow from your "ComfyUI-segment-anything-2/examples" folder. Alternatively, you can download one from the GitHub repository. These are the workflows available:
(a) florence_segment_2 - detects individual objects and bounding boxes in a single image with the Florence model.
(b) image_batch_bbox_segment - helpful for processing batches and masks with the single-image segmentor.
(c) points_segment_video - segments videos with point prompts; it extends negative points in individual mode when there are too few.
Choose the one you want to work with, then drag and drop it into ComfyUI.
Workflow 1: With Florence2
Workflow 2: Selecting a specific object from the entire subject
2. Load your image/video into the Load node.
3. Load the relevant SAM2 model from the SAM2 node.
4. Perform the segmentation by selecting objects. The selection index starts at 0, which denotes the first selection, and increments from there.
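To illustrate the 0-based selection, here is a small sketch of how a selection index could be turned into a binary mask from a per-pixel label map. The `label_map` layout and the `mask_for` helper are hypothetical illustrations, not part of the SAM2 nodes.

```python
# hypothetical label map: each pixel stores the index of the object it belongs to
label_map = [
    [0, 0, 1],
    [0, 2, 1],
    [2, 2, 1],
]

def mask_for(selection, labels):
    """Binary mask for the object at the given 0-based selection index."""
    return [[1 if px == selection else 0 for px in row] for row in labels]
```

Selection 0 picks out the first object, selection 1 the second, and so on, matching the node's 0-based numbering.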
Now, you may be wondering why we even need this. It is a great approach if you are working with inpainting or face swapping without losing precision. It also helps with mask creation in real-time video, which supports other workflows such as adding VFX to AI videos.
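As a rough illustration of why such masks help downstream, here is a pure-Python sketch of using a binary mask to gate which pixels a later step (inpainting, face swapping, a VFX overlay) is allowed to touch. The `apply_mask` helper and the list-of-lists representation are simplifications; real pipelines operate on tensors produced by the SAM2 nodes.

```python
def apply_mask(image, mask, fill):
    """Replace pixels where mask == 1 with `fill`; leave the rest unchanged.

    `image` and `mask` are equally sized 2-D lists of pixel values and 0/1 flags.
    """
    return [
        [fill if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]
```

Because only the masked region changes, the surrounding pixels keep their original precision, which is exactly what inpainting and face-swap workflows rely on.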
Some Limitations:
1. Loses track of objects in challenging scenarios (viewpoint changes, occlusions, crowded scenes, extended videos)
2. Confuses similar-looking objects in crowded scenes
3. Decreased efficiency when segmenting multiple objects simultaneously
4. Misses fine details in fast-moving objects
5. Lacks temporal smoothness in predictions
6. Requires human verification of masklet quality
7. Requires manual selection of frames that need correction