Learn how bounding boxes enable object detection, AI, and machine learning systems. Discover their role in computer vision applications!
A bounding box is a rectangular region defined by a set of coordinates that encloses a specific object within an image or video frame. In the field of computer vision (CV), these boxes serve as the fundamental annotations for teaching artificial intelligence (AI) systems how to locate and recognize distinct items. Rather than simply classifying an entire image as "containing a car," a bounding box allows a model to pinpoint the exact location and spatial extent of the car, separating it from the background and other entities. This localization capability is essential for object detection tasks, where the goal is to identify multiple objects simultaneously with high precision.
To process visual data effectively, machine learning (ML) models rely on specific coordinate systems to represent bounding boxes mathematically. The chosen format often dictates how data is prepared for model training and how the model outputs its predictions.
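As a minimal sketch of what such coordinate systems look like in practice, the snippet below converts one box between three commonly used conventions: corner format (x1, y1, x2, y2), top-left plus size (x, y, w, h), and normalized center format (cx, cy, w, h as fractions of the image size). The function names and the 400x400 image size are illustrative assumptions, not part of any library.

```python
def xyxy_to_xywh(box):
    """Corners (x1, y1, x2, y2) -> top-left corner plus size (x, y, w, h)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


def xyxy_to_normalized_cxcywh(box, img_w, img_h):
    """Pixel corners -> center plus size, each divided by the image dimensions."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    return (cx, cy, (x2 - x1) / img_w, (y2 - y1) / img_h)


def normalized_cxcywh_to_xyxy(box, img_w, img_h):
    """Normalized center format -> pixel corners (inverse of the function above)."""
    cx, cy, w, h = box
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return (x1, y1, x2, y2)


# Hypothetical box inside a hypothetical 400x400 image
box = (100, 50, 300, 250)
print(xyxy_to_xywh(box))                         # (100, 50, 200, 200)
print(xyxy_to_normalized_cxcywh(box, 400, 400))  # (0.5, 0.375, 0.5, 0.5)
```

Normalized coordinates stay valid when an image is resized, which is one reason they are often preferred for training labels, while pixel corners are convenient for drawing and cropping.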
Bounding boxes are the building blocks for countless AI solutions across diverse industries. By enabling precise localization, they allow systems to interact intelligently with the physical world.
When using modern architectures like YOLO26, the model predicts bounding boxes along with a class label and a confidence score. The following example demonstrates how to run inference on an image and access the bounding box coordinates using the ultralytics package.
from ultralytics import YOLO
# Load the YOLO26 model
model = YOLO("yolo26n.pt")
# Run inference on an image
results = model("https://ultralytics.com/images/bus.jpg")
# Access bounding box coordinates (xyxy format) for the first detected object
boxes = results[0].boxes
print(boxes.xyxy[0])  # Output: tensor([x1, y1, x2, y2])
While bounding boxes are standard for general detection, they are distinct from other annotation types used in more granular tasks.
Creating high-quality bounding box annotations is a critical step in the ML pipeline. The Ultralytics Platform simplifies this process by offering tools for data annotation and dataset management. Consistent, tightly fitted annotations help models learn to distinguish objects accurately, reducing errors such as overfitting or background confusion. Techniques like Non-Maximum Suppression (NMS) are used during inference to refine predictions: among boxes that overlap heavily, only the highest-confidence box is kept, so that each object ends up with a single detection.
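The overlap test behind NMS is usually Intersection over Union (IoU). The following is a simplified pure-Python sketch of greedy NMS, assuming boxes in (x1, y1, x2, y2) format and an IoU threshold of 0.5. The function names, the threshold, and the sample boxes are illustrative assumptions. Production libraries use vectorized implementations and typically apply NMS per class.

```python
def iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes on one object, plus one separate box
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

Here the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either and is kept.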