Learn how interactive segmentation uses human-in-the-loop prompts to isolate objects. Discover how to use Ultralytics YOLO26 and the Ultralytics Platform for these tasks.
Interactive segmentation is a highly collaborative approach to computer vision where a human user provides continuous or single-shot input—such as clicks, bounding boxes, or text prompts—to guide an AI model in isolating specific objects within an image. Unlike fully automated methods, this human-in-the-loop technique allows users to define exactly what needs to be segmented, making it especially valuable when dealing with ambiguous visual data, overlapping objects, or unseen classes. Over the past few years, the introduction of foundational models has drastically improved the speed and accuracy of this process, turning it into a vital tool for data annotation and precision imaging.
At its core, the workflow relies on promptable concept segmentation, where the model interprets user guidance to generate a pixel-perfect mask. A user might place a "positive" click on the foreground object they want to select and a "negative" click on background areas they want to exclude. Advanced models like the Segment Anything Model (SAM) and its successors, such as Meta SAM 3, take this further by accepting diverse prompt types [1], including bounding boxes and even text descriptions that ground the visual search. The model calculates the optimal boundary based on these prompts, and the user can iteratively refine the mask with additional clicks until the desired accuracy is achieved.
Interactive segmentation is transforming workflows across numerous industries by blending human expertise with AI efficiency.
While both concepts involve separating objects at the pixel level, they serve different operational purposes. Instance segmentation is typically a fully automated process where a model, like Ultralytics YOLO26, detects and outlines predefined classes (e.g., "car," "person," "dog") without user intervention. You can learn more about how this works in our guide to instance segmentation.
Conversely, interactive segmentation does not strictly rely on predefined classes. It is class-agnostic, meaning it segments whatever the user points to, making it an excellent fit for active learning pipelines where novel objects need to be rapidly annotated and added to custom datasets using tools like the Ultralytics Platform.
You can easily implement interactive segmentation in your own projects using PyTorch and the ultralytics Python package. In this example, we use FastSAM to segment a specific object by providing a bounding box prompt.
from ultralytics import FastSAM
# Load a pretrained FastSAM model
model = FastSAM("FastSAM-s.pt")
# Perform interactive segmentation using a bounding box prompt [x1, y1, x2, y2]
results = model("path/to/image.jpg", bboxes=[100, 100, 300, 300])
# Display the segmented result on screen
results[0].show()
This snippet demonstrates how a simple spatial prompt directly guides the model to isolate the region of interest, streamlining complex image segmentation tasks with minimal code.