Non-Maximum Suppression (NMS)

Learn about Non-Maximum Suppression (NMS) for object detection. Discover how this technique refines results, improves accuracy, and supports AI applications such as YOLO.

Non-Maximum Suppression (NMS) is a post-processing technique used in object detection to refine the raw predictions made by a model. When an object detection model analyzes an image, it often generates multiple overlapping bounding boxes for a single object, each with an associated confidence score. These redundant predictions occur because the model may detect the same feature at slightly different scales or positions. NMS filters this output by keeping only the highest-confidence bounding box for each object and discarding the boxes that overlap it heavily, so that the final output contains one detection per object rather than a cluster of duplicates.

How Non-Maximum Suppression Works

The NMS algorithm operates on a list of candidate bounding boxes and their corresponding confidence scores. The goal is to select the best box for an object and suppress (remove) any other boxes that overlap significantly with it, as these are likely duplicate detections of the same object. The process typically follows these steps:

  1. Filtering: Eliminate all bounding boxes with confidence scores below a specific threshold (e.g., 0.25) to remove weak predictions immediately.
  2. Sorting: Sort the remaining boxes in descending order based on their confidence scores.
  3. Selection: Pick the box with the highest confidence score as a valid detection.
  4. Comparison: Compare this selected box with all other remaining boxes using Intersection over Union (IoU), a metric that measures the overlap between two boxes.
  5. Suppression: If the IoU between the selected box and another box exceeds a predefined threshold (e.g., 0.45), the lower-scoring box is considered a duplicate and is removed.
  6. Iteration: Repeat the process with the next highest-scoring box that has not yet been suppressed or selected, until all boxes are processed.
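The six steps above can be sketched in plain Python. This is a minimal illustration rather than a production implementation: the function names iou and nms and the parameter names conf_threshold and iou_threshold are chosen here for the example, and the default values simply reuse the 0.25 and 0.45 figures mentioned in the steps.

```python
from typing import List, Sequence

Box = Sequence[float]  # [x1, y1, x2, y2]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    conf_threshold: float = 0.25,  # illustrative default from step 1
    iou_threshold: float = 0.45,   # illustrative default from step 5
) -> List[int]:
    """Return the indices of the boxes that survive greedy NMS."""
    # Step 1: drop weak predictions.
    candidates = [i for i, s in enumerate(scores) if s >= conf_threshold]
    # Step 2: sort by confidence, highest first.
    candidates.sort(key=lambda i: scores[i], reverse=True)

    keep: List[int] = []
    while candidates:
        # Step 3: the best remaining box is a valid detection.
        best = candidates.pop(0)
        keep.append(best)
        # Steps 4-5: suppress boxes that overlap the selected box too much.
        candidates = [
            i for i in candidates
            if iou(boxes[best], boxes[i]) <= iou_threshold
        ]
        # Step 6: the loop repeats with the next highest-scoring survivor.
    return keep


boxes = [
    [100, 100, 200, 200],  # Box A
    [105, 105, 195, 195],  # Box B (IoU with A is 0.81)
    [300, 300, 400, 400],  # Box C (distinct object)
    [10, 10, 50, 50],      # Box D (low confidence)
]
scores = [0.9, 0.8, 0.95, 0.1]

print(nms(boxes, scores))  # [2, 0]
```

Box D is removed by the confidence filter, Box B is suppressed by its overlap with Box A, and the result lists the surviving indices in descending score order.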

Real-World Applications

NMS is essential in scenarios where precision is paramount and duplicate detections can confuse downstream systems.

  • Autonomous Driving: In self-driving car systems, cameras detect pedestrians, other vehicles, and traffic signs. A model might predict three slightly different boxes for a single pedestrian. NMS ensures the vehicle's planning system receives only one coordinate for that pedestrian, preventing erratic braking or path planning errors caused by "ghost" obstacles.
  • Retail Inventory Management: When using computer vision to count products on a shelf, items are often packed closely together. Without NMS, a single soda can might be counted twice due to overlapping predictions, leading to inaccurate stock levels. NMS refines these detections to ensure the inventory count matches reality.

NMS Implementation with PyTorch

While many modern frameworks handle NMS internally, understanding the implementation helps in tuning parameters. The following example demonstrates how to apply NMS using torchvision.ops.nms from the torchvision package, which is part of the PyTorch ecosystem:

import torch
import torchvision.ops as ops

# Example bounding boxes: [x1, y1, x2, y2]
boxes = torch.tensor(
    [
        [100, 100, 200, 200],  # Box A
        [105, 105, 195, 195],  # Box B (High overlap with A)
        [300, 300, 400, 400],  # Box C (Distinct object)
    ],
    dtype=torch.float32,
)

# Confidence scores for each box
scores = torch.tensor([0.9, 0.8, 0.95], dtype=torch.float32)

# Apply NMS with an IoU threshold of 0.5
# A box is suppressed when its IoU with a higher-scoring kept box exceeds 0.5
keep_indices = ops.nms(boxes, scores, iou_threshold=0.5)

print(f"Indices to keep: {keep_indices.tolist()}")
# Prints: Indices to keep: [2, 0]
# Box C (0.95) and Box A (0.9) are kept, in descending score order,
# while Box B (0.8) is suppressed because its IoU with A is 0.81.

NMS vs. End-to-End Detection

Traditionally, NMS has been a mandatory "clean-up" step that sits outside the main neural network, adding inference latency. However, the field is evolving toward end-to-end architectures.

  • Standard NMS: A heuristic process that requires manual tuning of the IoU threshold. If the threshold is too low, distinct objects that sit close together may be treated as duplicates and missed (low recall). If it is too high, duplicates remain (low precision).
  • End-to-End Models: Next-generation models like YOLO26 are designed to be natively end-to-end. They learn to predict exactly one box per object during training, effectively internalizing the NMS process. This eliminates the need for external post-processing, resulting in faster inference speeds and simpler deployment pipelines on the Ultralytics Platform.
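The threshold trade-off described for standard NMS can be seen on a small, made-up scene. The sketch below is an illustration under stated assumptions: the boxes, scores, and the helper names iou and greedy_nms are invented for this example, and the three threshold values are picked only to show each failure mode.

```python
def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (
        (a[2] - a[0]) * (a[3] - a[1])
        + (b[2] - b[0]) * (b[3] - b[1])
        - inter
    )
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Keep a box only if it does not overlap any higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical scene: two real objects standing side by side, plus one
# duplicate prediction of the first object.
boxes = [
    [0, 0, 100, 100],   # index 0: object 1
    [5, 0, 105, 100],   # index 1: duplicate of object 1 (IoU with 0 is about 0.90)
    [60, 0, 160, 100],  # index 2: object 2 (IoU with 0 is 0.25)
]
scores = [0.9, 0.7, 0.8]

print(greedy_nms(boxes, scores, iou_threshold=0.20))  # [0]       object 2 is lost
print(greedy_nms(boxes, scores, iou_threshold=0.50))  # [0, 2]    one box per object
print(greedy_nms(boxes, scores, iou_threshold=0.95))  # [0, 2, 1] duplicate survives
```

No single threshold is right for every scene, which is why the value usually has to be tuned per dataset.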

Related Concepts

  • Soft-NMS: A variation where overlapping boxes are not strictly removed but have their confidence scores reduced. This allows somewhat overlapping objects (like people in a crowd) to still be detected if their scores remain high enough after decay.
  • Anchor Boxes: Predefined box shapes used by many detectors to estimate object size. NMS is applied to the final predictions refined from these anchors.
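  • Intersection over Union (IoU): The metric NMS uses to measure how much two boxes overlap, computed as the area of their intersection divided by the area of their union. NMS compares this value against the suppression threshold to decide whether a box is a duplicate.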
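The Soft-NMS idea listed above can be sketched as follows. This is a minimal illustration of the Gaussian-decay variant, not a reference implementation: the function name soft_nms, the sigma value of 0.5, and the score_threshold cutoff are assumptions made for the example.

```python
import math


def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (
        (a[2] - a[0]) * (a[3] - a[1])
        + (b[2] - b[0]) * (b[3] - b[1])
        - inter
    )
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of deleting boxes.

    Returns the selection order and the decayed scores. sigma and
    score_threshold are illustrative values, not recommended settings.
    """
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    order = []
    while remaining:
        # Pick the box with the highest current (possibly decayed) score.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        order.append(best)
        # Decay neighbours: the larger the overlap, the stronger the penalty.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
        # Drop boxes whose score has decayed to almost nothing.
        remaining = [i for i in remaining if scores[i] >= score_threshold]
    return order, scores


boxes = [
    [100, 100, 200, 200],  # Box A
    [105, 105, 195, 195],  # Box B (IoU with A is 0.81)
    [300, 300, 400, 400],  # Box C (distinct object)
]
scores = [0.9, 0.8, 0.95]

order, decayed = soft_nms(boxes, scores)
print(order)                           # [2, 0, 1]
print([round(s, 3) for s in decayed])  # [0.9, 0.215, 0.95]
```

Box B is not deleted outright; its score drops from 0.8 to roughly 0.215, so whether it appears in the final output depends on the confidence threshold applied afterwards.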
