Bounding Box

Learn how bounding boxes enable object detection, AI, and machine learning systems. Discover their role in computer vision applications!

A bounding box is a rectangular region defined by a set of coordinates that encloses a specific object within an image or video frame. In the field of computer vision (CV), these boxes serve as the fundamental annotations for teaching artificial intelligence (AI) systems how to locate and recognize distinct items. Rather than simply classifying an entire image as "containing a car," a bounding box allows a model to pinpoint the exact location and spatial extent of the car, separating it from the background and other entities. This localization capability is essential for object detection tasks, where the goal is to identify multiple objects simultaneously with high precision.

Core Concepts and Coordinates

To process visual data effectively, machine learning (ML) models rely on specific coordinate systems to represent bounding boxes mathematically. The chosen format often dictates how data is prepared for model training and how the model outputs its predictions.

  • XYXY Coordinates: This format defines a box using the absolute pixel values of the top-left corner and the bottom-right corner. It is intuitive for visualization tools like OpenCV or Matplotlib when drawing rectangles directly onto images.
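  • XYWH Format: This method specifies a reference point followed by the width and height of the box. The reference point depends on the convention: YOLO-style labels and the `xywh` attribute in the ultralytics package use the center of the box, whereas COCO annotations store the top-left corner. Center-based coordinates are a common parameterization for the box regression targets used in loss functions during training.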
  • Normalized Coordinates: To stay independent of image resolution, coordinates are often divided by the image width and height so that they fall between 0 and 1. The same label then remains valid when the image is resized, which helps models handle inputs of varying dimensions.
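
The three representations above differ only by simple arithmetic. The following is a minimal sketch in plain Python, assuming the center-based XYWH convention; the function names are hypothetical helpers written for illustration and are not part of any library.

```python
def xyxy_to_xywh(box):
    """Convert corner format (x1, y1, x2, y2) to center format (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return (x1 + w / 2, y1 + h / 2, w, h)


def xywh_to_xyxy(box):
    """Convert center format (cx, cy, w, h) back to corners (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def normalize_xyxy(box, img_w, img_h):
    """Scale absolute pixel corners into the 0-1 range using the image size."""
    x1, y1, x2, y2 = box
    return (x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h)


# Illustrative values only: a box inside a 400x500 pixel image.
pixel_box = (100, 50, 300, 250)
center_box = xyxy_to_xywh(pixel_box)
round_trip = xywh_to_xyxy(center_box)
normalized = normalize_xyxy(pixel_box, img_w=400, img_h=500)
print(center_box, round_trip, normalized)
```

Converting to center format and back returns the original corners, which is a quick sanity check when mixing datasets that follow different conventions.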

Real-World Applications

Bounding boxes are the building blocks for countless AI solutions across diverse industries. By enabling precise localization, they allow systems to interact intelligently with the physical world.

  • Autonomous Vehicles: Self-driving cars use bounding boxes to detect and track pedestrians, other vehicles, traffic signs, and obstacles in real-time. This spatial awareness is crucial for navigation and safety systems to make split-second decisions.
  • Retail Analytics: In smart stores, bounding boxes help monitor inventory on shelves and track customer interactions with products. This data can automate stock replenishment and provide insights into shopper behavior without manual counting.

Bounding Boxes in Action

When using modern architectures like YOLO26, the model predicts bounding boxes along with a class label and a confidence score. The following example demonstrates how to run inference on an image and access the bounding box coordinates using the ultralytics package.

from ultralytics import YOLO

# Load the YOLO26 model
model = YOLO("yolo26n.pt")

# Run inference on an image
results = model("https://ultralytics.com/images/bus.jpg")

# Access bounding box coordinates (xyxy format) for the first detected object
boxes = results[0].boxes
if len(boxes) > 0:  # guard against images with no detections
    print(boxes.xyxy[0])  # tensor([x1, y1, x2, y2]) in absolute pixels

Related Terms and Differentiation

While bounding boxes are standard for general detection, they are distinct from other annotation types used in more granular tasks.

  • Instance Segmentation: Unlike a rectangular bounding box, segmentation creates a pixel-perfect mask that traces the exact outline of an object. This is useful when the precise shape is more important than the general location.
  • Oriented Bounding Box (OBB): Standard bounding boxes are axis-aligned (upright rectangles). OBBs can rotate to fit objects that are angled, such as ships in satellite imagery or packages on a conveyor belt, providing a tighter fit and reducing background noise.
  • Keypoints: Instead of enclosing an object, keypoints identify specific landmarks, such as joints on a human body for pose estimation.
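
To make the "tighter fit" of an oriented box concrete, the sketch below compares the area of a rotated rectangle with the area of the axis-aligned box that encloses it. It assumes an OBB parameterized by center, width, height, and a rotation angle in radians; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def obb_corners(cx, cy, w, h, angle_rad):
    """Return the four corner points of a rectangle rotated about its center."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    half_offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in half_offsets
    ]


def enclosing_aabb(points):
    """Return the smallest axis-aligned box (x1, y1, x2, y2) containing all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Illustrative object: a long, thin rectangle rotated by 45 degrees.
corners = obb_corners(100, 100, 80, 20, math.radians(45))
aabb = enclosing_aabb(corners)
obb_area = 80 * 20
aabb_area = (aabb[2] - aabb[0]) * (aabb[3] - aabb[1])
print(f"OBB area: {obb_area}, enclosing axis-aligned area: {aabb_area:.1f}")
```

For elongated, angled objects the axis-aligned box covers far more background than the object itself, which is the situation where OBBs pay off.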

Tools for Annotation and Management

Creating high-quality bounding box annotations is a critical step in the ML pipeline. The Ultralytics Platform simplifies this process by offering tools for data annotation and dataset management. Accurate, consistent annotations help models learn to distinguish objects from one another and from the background. On the prediction side, Non-Maximum Suppression (NMS) is commonly applied during inference to remove overlapping boxes, so that only the highest-confidence detection remains for each object.
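
The idea behind Non-Maximum Suppression can be sketched in a few lines. The example below is a simplified, class-agnostic greedy variant in plain Python, not the implementation used by any particular library; the function names, the 0.5 overlap threshold, and the sample boxes are assumptions made for illustration. Overlap is measured with Intersection over Union (IoU), the ratio of the shared area to the combined area of two boxes.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop any that overlap a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes on one object, plus one separate object.
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.75]
kept = nms(boxes, scores)
print(kept)  # indices of the boxes that survive suppression
```

Raising the threshold keeps more overlapping boxes, which can help in crowded scenes at the cost of more duplicate detections.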
