Learn how instance segmentation refines object detection with pixel-level precision, enabling detailed object masks for AI applications.
Instance segmentation is a sophisticated technique in computer vision (CV) that identifies and delineates each distinct object of interest within an image at the pixel level. While standard object detection localizes items using rectangular bounding boxes, instance segmentation takes the analysis deeper by generating a precise mask for every detected entity. This capability allows artificial intelligence (AI) models to distinguish between individual objects of the same class—such as separating two overlapping people—providing a richer and more detailed understanding of the visual scene compared to simpler classification methods.
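To make the contrast with bounding boxes concrete, here is a minimal pure-Python sketch. The two toy masks and the helper names (`mask_to_box`, `mask_area`, `box_overlap_area`, `shared_pixels`) are illustrative assumptions, not part of any library. It shows two objects of the same class whose bounding boxes overlap while their masks stay separate.

```python
def mask_to_box(mask):
    """Return the tight box (x_min, y_min, x_max, y_max) around a binary mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), min(ys), max(xs), max(ys)


def mask_area(mask):
    """Count the pixels that belong to the object."""
    return sum(sum(row) for row in mask)


def box_overlap_area(a, b):
    """Count the pixels shared by two inclusive pixel boxes (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0]) + 1
    height = min(a[3], b[3]) - max(a[1], b[1]) + 1
    return max(width, 0) * max(height, 0)


def shared_pixels(mask_a, mask_b):
    """Count the pixels claimed by both masks."""
    return sum(a & b for row_a, row_b in zip(mask_a, mask_b) for a, b in zip(row_a, row_b))


# Two hypothetical "person" instances on a 5x6 pixel grid (1 = object pixel).
person_a = [
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
person_b = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
]

box_a = mask_to_box(person_a)  # (0, 0, 2, 3)
box_b = mask_to_box(person_b)  # (2, 1, 4, 4)

print("box overlap:", box_overlap_area(box_a, box_b))    # 3 pixels
print("mask overlap:", shared_pixels(person_a, person_b))  # 0 pixels
print("mask area vs box area:", mask_area(person_a), "vs", 12)
```

In this toy case the boxes share three pixels even though no pixel belongs to both objects, and each box covers 12 pixels while the masks cover only 8 and 7.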
To fully grasp the utility of instance segmentation, it is helpful to differentiate it from other related image processing tasks. Each method offers a different level of granularity depending on the application requirements.
Modern instance segmentation models typically rely on deep learning (DL) architectures, particularly Convolutional Neural Networks (CNNs). These networks extract features from an image to predict both the class of an object and its spatial contour. Historically, two-stage architectures like Mask R-CNN were the standard: they first propose regions of interest and then predict a class, a box, and a mask for each region.
However, recent advancements have led to single-stage models like YOLO26, which perform detection and segmentation in a single pass. This "end-to-end" approach significantly improves inference speed, making it possible to apply high-precision segmentation to live video streams on consumer hardware.
The precise boundaries provided by instance segmentation are critical for industries where understanding the exact shape and position of an object is necessary for decision-making.
Developers can run instance segmentation with the ultralytics library. The following example demonstrates how to load a pre-trained YOLO26 model and generate segmentation masks for an image.
from ultralytics import YOLO
# Load a pre-trained YOLO26 instance segmentation model
# The 'n' suffix denotes the nano version, optimized for speed
model = YOLO("yolo26n-seg.pt")
# Run inference on an image
# This predicts classes, bounding boxes, and masks
results = model("https://ultralytics.com/images/bus.jpg")
# Visualize the results
# Displays the image with overlaid segmentation masks
results[0].show()
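Beyond visualization, the predictions can be read programmatically. The sketch below assumes the attribute layout of the result objects in recent ultralytics releases (`masks.xy` for per-instance polygons in pixel coordinates, `boxes.cls` and `boxes.conf` for class ids and scores, `names` for the id-to-label mapping). The helper `summarize_instances` is a hypothetical name introduced here for illustration.

```python
def summarize_instances(result):
    """Summarize each detected instance in one result object.

    Assumes `result` exposes `masks.xy`, `boxes.cls`, `boxes.conf`, and `names`
    as described above; `masks` is expected to be None when nothing is detected.
    """
    if result.masks is None:
        return []

    summary = []
    for polygon, cls_id, conf in zip(result.masks.xy, result.boxes.cls, result.boxes.conf):
        summary.append(
            {
                "label": result.names[int(cls_id)],  # human-readable class name
                "confidence": float(conf),           # detection score
                "polygon_points": len(polygon),      # vertices in the mask outline
            }
        )
    return summary
```

With the example above, it could be called as `summarize_instances(results[0])`. Each detected object gets its own entry, so two people in the same image appear as two separate items.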
While powerful, instance segmentation is computationally intensive compared to simple bounding box detection. Generating pixel-perfect masks requires significant GPU resources and precise data annotation. Annotating data for these tasks involves drawing tight polygons around every object, which can be time-consuming.
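As a rough illustration of what one polygon annotation looks like on disk, the sketch below converts pixel coordinates into a text label line. It assumes the polygon layout commonly used for YOLO segmentation datasets (a class index followed by x and y pairs normalized to the image size). The function name `polygon_to_label` and the sample values are hypothetical.

```python
def polygon_to_label(class_id, polygon, img_w, img_h):
    """Turn a pixel-space polygon into one normalized label line.

    polygon: iterable of (x, y) pixel coordinates tracing the object outline.
    Returns a string like "<class> x1 y1 x2 y2 ..." with values in [0, 1].
    """
    coords = []
    for x, y in polygon:
        coords.append(f"{x / img_w:.6f}")  # normalize x by image width
        coords.append(f"{y / img_h:.6f}")  # normalize y by image height
    return " ".join([str(class_id), *coords])


# A triangle drawn on a hypothetical 640x480 image, labeled as class 0.
line = polygon_to_label(0, [(0, 0), (320, 0), (320, 240)], 640, 480)
print(line)  # 0 0.000000 0.000000 0.500000 0.000000 0.500000 0.500000
```

Every object in every image needs one such line, and tighter outlines mean more vertices, which is why polygon annotation takes longer than drawing boxes.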
To streamline this process, teams often use tools like the Ultralytics Platform, which offers features for dataset management, auto-annotation, and cloud-based training. This allows developers to fine-tune models on custom data—such as specific industrial parts or biological specimens—and deploy them efficiently to edge AI devices using optimized formats like ONNX or TensorRT.