
Instance Segmentation

Learn how instance segmentation refines object detection down to the pixel, producing detailed object masks for AI applications.

Instance segmentation is a sophisticated technique in computer vision (CV) that identifies and delineates each distinct object of interest within an image at the pixel level. While standard object detection localizes items using rectangular bounding boxes, instance segmentation takes the analysis deeper by generating a precise mask for every detected entity. This capability allows artificial intelligence (AI) models to distinguish between individual objects of the same class—such as separating two overlapping people—providing a richer and more detailed understanding of the visual scene compared to simpler classification methods.

Distinguishing Between Segmentation Types

To fully grasp the utility of instance segmentation, it is helpful to differentiate it from other related image processing tasks. Each method offers a different level of granularity depending on the application requirements; the short sketch after the list below illustrates how the outputs differ in structure.

  • Semantic Segmentation: This approach classifies every pixel in an image into a category (e.g., "road," "sky," "car"). However, it does not distinguish between separate objects of the same category. If three cars are parked next to each other, semantic segmentation views them as a single "car" region.
  • Instance Segmentation: This method treats each object as a unique entity. It detects individual instances and assigns a unique label to the pixels of each one. In the example of parked cars, instance segmentation would create three distinct masks, identifying "Car A," "Car B," and "Car C" separately.
  • Panoptic Segmentation: A hybrid approach that combines the background labeling of semantic segmentation with the countable object identification of instance segmentation.
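
To make the distinction concrete, here is a minimal NumPy sketch (with made-up 4x4 masks) contrasting the output structures: semantic segmentation returns one label map for the whole image, while instance segmentation returns one mask per object.

import numpy as np

# Toy 4x4 scene with two cars. Semantic segmentation produces a single label
# map in which every car pixel shares class ID 1, so the two cars merge.
semantic_map = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])

# Instance segmentation produces one boolean mask per object, keeping
# "Car A" and "Car B" separate even though they share the same class.
car_a = np.zeros((4, 4), dtype=bool)
car_a[:2, :2] = True
car_b = np.zeros((4, 4), dtype=bool)
car_b[2:, 2:] = True
instances = [("car", car_a), ("car", car_b)]

print(np.count_nonzero(semantic_map == 1), "car pixels, but no object count")
print(len(instances), "distinct car instances")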

The Mechanics of Pixel-Level Analysis

Modern instance segmentation models typically rely on advanced deep learning (DL) architectures, particularly Convolutional Neural Networks (CNNs). These networks extract features from an image to predict both the class of an object and its spatial contour. Historically, two-stage architectures like Mask R-CNN were the standard, first proposing regions of interest and then refining them into masks.
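
As an illustration of the two-stage approach (separate from the Ultralytics workflow shown later), the sketch below loads torchvision's pre-trained Mask R-CNN and runs it on a random tensor standing in for a real image.

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Load a pre-trained two-stage Mask R-CNN (region proposals, then mask refinement)
model = maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Random RGB tensor (3 x H x W, values in [0, 1]) stands in for a real photo
image = torch.rand(3, 480, 640)

with torch.no_grad():
    prediction = model([image])[0]

# Each detection carries a class label, a confidence score, and a per-pixel mask
print(prediction["labels"].shape, prediction["scores"].shape, prediction["masks"].shape)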

However, recent advancements have led to single-stage detectors like YOLO26, which perform detection and segmentation simultaneously. This "end-to-end" approach significantly improves real-time inference speeds, making it possible to apply high-precision segmentation to live video streams on consumer hardware.
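
As a rough sketch of what streaming segmentation looks like with the ultralytics API, the snippet below processes a video frame by frame; the file name traffic.mp4 is a placeholder, and a webcam index such as 0 works the same way.

from ultralytics import YOLO

model = YOLO("yolo26n-seg.pt")

# stream=True returns a generator, so frames are processed one at a time
# instead of being accumulated in memory
for result in model("traffic.mp4", stream=True):
    if result.masks is not None:
        print(f"Frame: {len(result.masks.xy)} object masks")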

Real-World Applications

The precise boundaries provided by instance segmentation are critical for industries where understanding the exact shape and position of an object is necessary for decision-making.

  • AI in Healthcare: In medical diagnostics, identifying the exact size and shape of tumors or lesions is vital. Instance segmentation allows models to outline abnormalities in MRI scans with high precision, aiding radiologists in treatment planning and monitoring disease progression.
  • Autonomous Vehicles: Self-driving cars rely on segmentation to navigate complex environments. Utilizing datasets like Cityscapes, vehicles can identify drivable surfaces, recognize lane markings, and separate individual pedestrians in crowded crosswalks to ensure safety.
  • AI in Agriculture: Precision farming uses segmentation to monitor crop health. Robots equipped with vision systems can identify individual fruits for automated harvesting or detect specific weeds for targeted herbicide application, reducing chemical usage and optimizing yield.

Implementing Segmentation with Python

Developers can easily implement instance segmentation using the ultralytics library. The following example demonstrates how to load a pre-trained YOLO26 model and generate segmentation masks for an image.

from ultralytics import YOLO

# Load a pre-trained YOLO26 instance segmentation model
# The 'n' suffix denotes the nano version, optimized for speed
model = YOLO("yolo26n-seg.pt")

# Run inference on an image
# This predicts classes, bounding boxes, and masks
results = model("https://ultralytics.com/images/bus.jpg")

# Visualize the results
# Displays the image with overlaid segmentation masks
results[0].show()
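
Beyond visualization, the predicted masks can be inspected programmatically. Continuing from the example above, this short sketch prints the raw mask tensor, the polygon outlines, and the class index of each detected instance.

# Access the raw prediction data rather than just rendering it
masks = results[0].masks
if masks is not None:
    print(masks.data.shape)      # (num_objects, H, W) binary mask tensor
    print(len(masks.xy))         # one polygon (Nx2 array of pixel coords) per object
    print(results[0].boxes.cls)  # class index for each detected instance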

Challenges and Model Training

While powerful, instance segmentation is computationally intensive compared to simple bounding box detection. Generating pixel-perfect masks requires significant GPU resources and precise data annotation. Annotating data for these tasks involves drawing tight polygons around every object, which can be time-consuming.
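
For reference, Ultralytics YOLO stores segmentation annotations as one text line per object: a class index followed by the polygon's x/y coordinates normalized to the 0-1 range. The label file below is a hypothetical example with two objects and made-up coordinates.

# image_0001.txt, one line per object instance
# <class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
0 0.412 0.330 0.455 0.298 0.501 0.315 0.498 0.402 0.430 0.395
2 0.120 0.710 0.185 0.690 0.210 0.770 0.150 0.790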

To streamline this process, teams often use tools like the Ultralytics Platform, which offers features for dataset management, auto-annotation, and cloud-based training. This allows developers to fine-tune models on custom data—such as specific industrial parts or biological specimens—and deploy them efficiently to edge AI devices using optimized formats like ONNX or TensorRT.
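
A minimal sketch of that workflow, using the small built-in coco8-seg dataset as a stand-in for a custom dataset, looks like this:

from ultralytics import YOLO

model = YOLO("yolo26n-seg.pt")

# Fine-tune on a segmentation dataset; coco8-seg is a tiny built-in sample
# used here as a stand-in for your own annotated data
model.train(data="coco8-seg.yaml", epochs=10, imgsz=640)

# Export the trained model to ONNX for efficient edge deployment
model.export(format="onnx")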
