
Semantic Segmentation

Discover the power of semantic segmentation, which classifies every pixel in an image for precise scene understanding. Explore its applications and tools!

Semantic segmentation is a computer vision task that involves dividing an image into distinct regions by assigning a specific class label to every individual pixel. Unlike simpler tasks like image classification, which assigns a single label to an entire image, or object detection, which draws bounding boxes around objects, semantic segmentation provides a pixel-level understanding of the scene. This granular analysis is crucial for applications where the precise shape and boundary of an object are just as important as its identity. It allows machines to "see" the world more like humans do, distinguishing the exact pixels that make up a road, a pedestrian, or a tumor within a medical scan.

How Semantic Segmentation Works

At its core, semantic segmentation treats an image as a grid of pixels that need to be classified. Deep learning models, particularly Convolutional Neural Networks (CNNs), are the standard architecture for this task. A typical architecture, such as the widely used U-Net, employs an encoder-decoder structure. The encoder compresses the input image to extract high-level features (like textures and shapes), while the decoder upsamples these features back to the original image resolution to generate a precise segmentation mask.
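The encoder-decoder idea can be sketched in miniature with plain NumPy. This is an illustrative toy, not a real network: actual models like U-Net use learned convolutions and skip connections, whereas here the "encoder" is a fixed 2x2 max pool and the "decoder" is nearest-neighbor upsampling.

```python
import numpy as np

def max_pool_2x2(x):
    """Downsample a 2D array by taking the max of each 2x2 block (toy 'encoder' step)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_2x(x):
    """Upsample a 2D array by nearest-neighbor repetition (toy 'decoder' step)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# A 4x4 "feature map" standing in for an input image
features = np.array([
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 6],
    [0, 0, 7, 8],
])

encoded = max_pool_2x2(features)   # 2x2: compact high-level summary, resolution lost
decoded = upsample_2x(encoded)     # back to 4x4: coarse reconstruction at full size
print(encoded.shape, decoded.shape)  # (2, 2) (4, 4)
```

Notice that the round trip recovers the original resolution but loses fine detail; learned decoders and skip connections exist precisely to restore that detail for sharp mask boundaries.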

To achieve this, models are trained on large annotated datasets where human annotators have carefully colored each pixel according to its class. Tools like the Ultralytics Platform facilitate this process by offering auto-annotation features that speed up the creation of high-quality ground truth data. Once trained, the model outputs a mask where every pixel value corresponds to a class ID, effectively "painting" the image with meaning.
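The final "painting" step, turning a class-ID mask into a colored overlay, can be sketched with NumPy. The mask values and palette here are illustrative (the colors loosely follow Cityscapes conventions):

```python
import numpy as np

# Toy class-ID mask as a model might output: 0 = background, 1 = road, 2 = car
mask = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [1, 1, 2, 2],
])

# Palette mapping each class ID to an RGB color
palette = np.array([
    [0, 0, 0],        # background: black
    [128, 64, 128],   # road: purple
    [0, 0, 142],      # car: dark blue
], dtype=np.uint8)

# "Paint" the image: fancy-index the palette with the mask to get an HxWx3 overlay
colored = palette[mask]
print(colored.shape)  # (3, 4, 3)
```

Because every pixel carries a class ID, rendering, per-class pixel counts, and area measurements all reduce to simple array operations on the mask.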

Distinguishing Related Concepts

It is common to confuse semantic segmentation with other pixel-level tasks. Understanding the differences is key to selecting the right approach for a project:

  • Instance Segmentation: While semantic segmentation treats all objects of the same class as a single entity (e.g., all "cars" are colored blue), instance segmentation distinguishes between individual objects (e.g., "Car A" is blue, "Car B" is red).
  • Panoptic Segmentation: This combines both concepts. It assigns a class to every pixel (semantic) while also separating individual instances of countable objects (instance), providing the most comprehensive scene understanding.
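The distinction can be made concrete with toy label maps (all values here are illustrative):

```python
import numpy as np

# Semantic mask: every "car" pixel shares one class ID (2), regardless of which car
semantic = np.array([
    [2, 2, 0, 2, 2],
    [2, 2, 0, 2, 2],
])

# Instance labels for the same pixels: each car gets its own instance ID
instances = np.array([
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
])

# Semantically there is one "car" class; at the instance level there are two objects
num_cars = len(np.unique(instances[instances > 0]))
print(num_cars)  # 2
```

A panoptic representation would keep both pieces of information, e.g. a (class_id, instance_id) pair per pixel.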

Real-World Applications

The ability to parse visual data with pixel-perfect accuracy drives innovation across many high-stakes industries:

  • AI in Automotive: Autonomous vehicles rely heavily on segmentation to navigate safely. By identifying drivable areas versus sidewalks and precisely outlining pedestrians, cars, and obstacles, self-driving systems can make critical decisions in real time.
  • AI in Healthcare: In medical imaging, models segment organs, lesions, or tumors from CT scans and MRIs. This assists radiologists in calculating tumor volume for treatment planning or guiding robotic surgery tools with extreme precision.
  • AI in Agriculture: Farmers use aerial drone imagery and segmentation to monitor crop health. By classifying pixels as "healthy crop," "weed," or "soil," automated systems can target herbicide spraying, reducing chemical usage and optimizing yield.

Implementing Segmentation with Ultralytics

Modern segmentation models need to balance accuracy with speed, especially for real-time inference on edge devices. The Ultralytics YOLO26 model family includes specialized segmentation models (denoted with a -seg suffix) that are natively end-to-end, offering superior performance over older architectures like YOLO11.

The following example shows how to perform segmentation on an image using the ultralytics Python package. This produces binary masks that delineate object boundaries.

from ultralytics import YOLO

# Load a pre-trained YOLO26 segmentation model
model = YOLO("yolo26n-seg.pt")

# Run inference on an image
results = model("https://ultralytics.com/images/bus.jpg")

# Visualize the results
# This will display the image with the segmentation masks overlaid
results[0].show()

Challenges and Future Directions

Despite significant progress, semantic segmentation remains computationally intensive. Generating a classification for every single pixel requires substantial GPU resources and memory. Researchers are actively working on optimizing these models for efficiency, exploring techniques like model quantization to run heavy networks on mobile phones and embedded devices.
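Quantization itself can be sketched with a minimal affine int8 scheme in NumPy. This is a simplified illustration of the idea; production toolchains add per-channel scales, calibration, and quantization-aware training.

```python
import numpy as np

def quantize_int8(weights):
    """Affine-quantize float weights to int8 with a single scale/zero-point (toy scheme)."""
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / 255.0
    zero_point = np.round(-w_min / scale) - 128
    q = np.clip(np.round(weights / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate float weights."""
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale, zp = quantize_int8(weights)

# int8 storage is 4x smaller than float32, at the cost of a small rounding error
error = np.abs(dequantize(q, scale, zp) - weights).max()
print(q.nbytes, weights.nbytes)  # 1000 4000
```

The round-trip error is bounded by the quantization step size, which is why well-calibrated int8 models typically lose little accuracy while shrinking memory and bandwidth needs on embedded hardware.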

Furthermore, the need for massive labeled datasets is a bottleneck. To address this, the industry is moving toward synthetic data generation and self-supervised learning, allowing models to learn from raw images without requiring millions of manual pixel labels. As these technologies mature, we can expect segmentation to become even more ubiquitous in smart cameras, robotics, and augmented reality applications.
