
Interactive Segmentation

Learn how interactive segmentation uses human-in-the-loop prompts to isolate objects, and how to use Ultralytics YOLO26 and the Ultralytics Platform for these tasks.

Interactive segmentation is a highly collaborative approach to computer vision where a human user provides continuous or single-shot input—such as clicks, bounding boxes, or text prompts—to guide an AI model in isolating specific objects within an image. Unlike fully automated methods, this human-in-the-loop technique allows users to define exactly what needs to be segmented, making it especially valuable when dealing with ambiguous visual data, overlapping objects, or unseen classes. Over the past few years, the introduction of foundational models has drastically improved the speed and accuracy of this process, turning it into a vital tool for data annotation and precision imaging.

How Interactive Segmentation Works

At its core, the workflow relies on promptable concept segmentation, where the model interprets user guidance to generate a pixel-perfect mask. A user might place a "positive" click on the foreground object they want to select and a "negative" click on background areas they want to exclude. Advanced models like the Segment Anything Model (SAM) and successors such as Meta SAM 3 take this further by accepting diverse gesture types [1], bounding boxes, and even text descriptions to ground the visual search. The model calculates the optimal boundary from these prompts, and the user can iteratively refine the mask with additional clicks until the desired accuracy is achieved.

Real-World Applications

Interactive segmentation is transforming workflows across numerous industries by blending human expertise with AI efficiency.

  • Medical Imaging: In AI in healthcare, doctors and radiologists use interactive tools to isolate tumors, lesions, or specific organs in MRI and CT scans. Research into spatial modeling for medical images [2] shows that interactive clicks allow medical professionals to quickly correct AI predictions, ensuring the rigorous precision required for patient diagnosis.
  • Geospatial and Satellite Mapping: Urban planners and environmental scientists use interactive models to accelerate GIS feature extraction [3]. Instead of manually tracing complex coastlines, agricultural boundaries, or new infrastructure, analysts can place a few strategic clicks to instantly generate accurate geographic polygons.
  • Industrial Defect Detection: For AI in manufacturing, quality control engineers can use interactive prompts to highlight microscopic flaws on production lines, dynamically adapting the system to new types of defects without retraining the entire model.

Interactive Segmentation vs. Instance Segmentation

While both concepts involve separating objects at the pixel level, they serve different operational purposes. Instance segmentation is typically a fully automated process where a model, like Ultralytics YOLO26, detects and outlines predefined classes (e.g., "car," "person," "dog") without user intervention. You can learn more about how this works in our guide to instance segmentation.

Conversely, interactive segmentation does not strictly rely on predefined classes. It is class-agnostic, meaning it segments whatever the user points to, making it an excellent fit for active learning pipelines where novel objects need to be rapidly annotated and added to custom datasets using tools like the Ultralytics Platform.

Example Using Ultralytics

You can easily implement interactive segmentation in your own projects using PyTorch and the ultralytics Python package. In this example, we use FastSAM to segment a specific object by providing a bounding box prompt.

from ultralytics import FastSAM

# Load a pretrained FastSAM model
model = FastSAM("FastSAM-s.pt")

# Perform interactive segmentation using a bounding box prompt [x1, y1, x2, y2]
results = model("path/to/image.jpg", bboxes=[100, 100, 300, 300])

# Display the segmented result on screen
results[0].show()

This snippet demonstrates how a simple spatial prompt directly guides the model to isolate the region of interest, streamlining complex image segmentation tasks with minimal code.
