
Panoptic Segmentation

Discover how panoptic segmentation unifies semantic segmentation and instance segmentation for precise pixel-level scene understanding in AI applications.

Panoptic segmentation is a comprehensive computer vision (CV) task that unifies two distinct forms of image analysis: semantic segmentation and instance segmentation. While traditional methods treat these tasks separately—either classifying background regions like "sky" or "grass" generally, or detecting specific objects like "car" or "person"—panoptic segmentation combines them into a single, cohesive framework. This approach assigns every pixel in an image both a semantic class label and an instance ID, providing a complete scene understanding that distinguishes between countable objects (referred to as "things") and amorphous background regions (referred to as "stuff"). By ensuring that every pixel is accounted for and classified, this technique mimics human visual perception more closely than isolated detection methods.

Core Concepts: Stuff vs. Things

To fully grasp panoptic segmentation, it is helpful to understand the dichotomy of visual information it processes. The task splits the visual world into two primary categories:

  • Stuff Categories: These represent amorphous regions of similar texture or material that are not countable. Examples include roads, water, grass, sky, and walls. In a panoptic analysis, all pixels belonging to a "road" are grouped into a single semantic region because distinguishing between "road segment A" and "road segment B" is generally irrelevant.
  • Things Categories: These are countable objects with defined geometry and boundaries. Examples include pedestrians, vehicles, animals, and tools. Panoptic models must identify each "thing" as a unique entity, ensuring that two people standing side-by-side are recognized as separate instances (e.g., "Person A" and "Person B") rather than a merged blob.

This distinction is crucial for advanced artificial intelligence (AI) systems, allowing them to navigate environments while simultaneously interacting with specific objects.
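To make the dual labeling concrete, panoptic annotations are often stored as a single label map in which one integer answers both questions at once. The short sketch below is a minimal illustration, not the format of any particular dataset: the class IDs are invented, and the class_id * 1000 + instance_id convention is just one common encoding (the Cityscapes panoptic labels use a similar scheme), with "stuff" pixels carrying instance ID 0.

import numpy as np

# Minimal sketch of a panoptic label map (class IDs are hypothetical).
# Convention: pixel value = class_id * 1000 + instance_id, so one integer
# answers both "what class is this pixel?" and "which instance is it?".
ROAD, PERSON = 7, 24  # illustrative class IDs, not from a real dataset

panoptic = np.array([
    [ROAD * 1000, ROAD * 1000],              # "stuff": no instance identity
    [PERSON * 1000 + 1, PERSON * 1000 + 2],  # two distinct person instances
])

print(panoptic // 1000)  # semantic classes: [[ 7  7], [24 24]]
print(panoptic % 1000)   # instance IDs:     [[ 0  0], [ 1  2]]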

How Panoptic Architectures Work

Modern panoptic segmentation architectures typically employ a powerful deep learning (DL) backbone, such as a Convolutional Neural Network (CNN) or a Vision Transformer (ViT), to extract rich feature representations from an image. The network generally splits into two branches or "heads":

  1. Semantic Head: This branch predicts a class label for every pixel, generating a dense map of the "stuff" in the scene.
  2. Instance Head: Simultaneously, this branch uses techniques similar to object detection to localize "things" and generate masks for them.

A fusion module or post-processing step then resolves conflicts between these outputs—for example, deciding if a pixel belongs to a "person" instance or the "background" wall behind them—to produce a final, non-overlapping panoptic segmentation map.
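The sketch below illustrates this fusion logic under simplified assumptions; the function and its inputs are hypothetical rather than part of any specific library. Higher-confidence instance masks claim pixels first, and any pixel left unclaimed keeps the "stuff" label predicted by the semantic head, which guarantees a non-overlapping result.

import numpy as np

def fuse_panoptic(semantic_map, instance_masks, instance_classes, instance_scores):
    """Toy fusion of semantic and instance outputs into one panoptic map.

    semantic_map:     (H, W) int array of per-pixel class IDs
    instance_masks:   (N, H, W) boolean masks from the instance head
    instance_classes: (N,) class ID for each instance
    instance_scores:  (N,) confidence score for each instance
    """
    # Start with "stuff": class_id * 1000, instance ID 0
    panoptic = semantic_map.astype(np.int64) * 1000
    claimed = np.zeros(semantic_map.shape, dtype=bool)

    # Higher-confidence instances win overlap conflicts
    for inst_id, i in enumerate(np.argsort(instance_scores)[::-1], start=1):
        mask = instance_masks[i] & ~claimed  # only still-unclaimed pixels
        panoptic[mask] = instance_classes[i] * 1000 + inst_id
        claimed |= mask

    return panoptic  # every pixel carries exactly one label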

Real-World Applications

The holistic nature of panoptic segmentation makes it indispensable for industries where safety and context are paramount.

  • Autonomous Vehicles: Self-driving cars rely on panoptic perception to navigate safely. The semantic component identifies drivable surfaces (roads) and boundaries (sidewalks), while the instance component tracks dynamic obstacles like pedestrians and other vehicles. This unified view helps the vehicle's planning algorithms make safer decisions in complex traffic management scenarios.
  • Medical Image Analysis: Analyzing tissue samples in digital pathology typically requires segmenting tissue structures (stuff) while simultaneously counting and measuring specific cell types or tumors (things). This granular analysis helps physicians quantify and diagnose disease accurately.
  • Robotics: Service robots operating in unstructured environments such as homes or warehouses must distinguish between the traversable floor (stuff) and the objects (instances) they need to manipulate or avoid.

Implementing Segmentation with Ultralytics

While full panoptic training can be complex, developers can achieve high-precision instance segmentation—a critical component of the panoptic puzzle—using Ultralytics YOLO26. This state-of-the-art model offers real-time performance and is optimized for edge deployment.

The following Python example demonstrates how to load a pretrained segmentation model and run inference to separate individual objects:

from ultralytics import YOLO

# Load the YOLO26 segmentation model
model = YOLO("yolo26n-seg.pt")

# Run inference on an image to segment individual instances
# The model identifies 'things' and generates pixel-perfect masks
results = model("https://ultralytics.com/images/bus.jpg")

# Display the resulting image with overlaid segmentation masks
results[0].show()
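Beyond visualization, the predicted masks can also be inspected programmatically. The attribute names below follow the Ultralytics Results API; note that masks is None when no instances are detected.

# Access the raw predictions for downstream processing
masks = results[0].masks  # None if no instances were found
if masks is not None:
    print(masks.data.shape)      # (num_instances, H, W) binary mask tensor
    print(results[0].boxes.cls)  # predicted class index per instance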

For teams looking to manage training data and automate annotation, Ultralytics provides a suite of tools for dataset management and model training. High-quality data annotation is essential for segmentation tasks, because models need precise pixel-level labels to learn effectively.

Distinguishing Related Terms

Understanding the nuances between segmentation types is vital for selecting the right model for your project:

  • Semantic Segmentation: Focuses only on classifying pixels into categories. It answers "what class is this pixel?" (e.g., tree, sky) but cannot separate individual objects of the same class. If two cars are overlapping, they appear as one large "car" blob.
  • Instance Segmentation: Focuses only on detecting and masking countable objects. It answers "which object is this?" but usually ignores the background context entirely.
  • Panoptic Segmentation: Combines both. Across the entire image, it answers "what class is this pixel?" and "which object instance does it belong to?", ensuring that no pixel is left unclassified.

For further exploration of dataset formats used in these tasks, you can review the COCO dataset documentation, which is a standard benchmark for measuring segmentation performance.
