Explore the power of two-stage object detectors, accuracy-focused solutions for precise object detection in complex computer vision tasks.
Two-stage object detectors are a sophisticated class of deep learning (DL) architectures used in computer vision to identify and locate items within an image. Unlike their one-stage counterparts, which perform detection in a single pass, these models divide the task into two distinct phases: region proposal and object classification. This bifurcated approach was pioneered to prioritize high localization accuracy, making these detectors historically significant in the evolution of artificial intelligence (AI). By separating the "where" from the "what," two-stage detectors often achieve superior precision, particularly on small or occluded objects, though this typically comes at the cost of increased computational resources and slower inference latency.
The architecture of a two-stage detector relies on a sequential workflow that mimics how a human might carefully scrutinize a scene.
Prominent examples of this architecture include the R-CNN family, specifically Faster R-CNN and Mask R-CNN, which set the standard for academic benchmarks for several years.
It is helpful to distinguish two-stage models from one-stage object detectors like the Single Shot MultiBox Detector (SSD) and the Ultralytics YOLO series. While two-stage models prioritize accuracy by processing regions separately, one-stage models frame detection as a single regression problem, mapping image pixels directly to bounding box coordinates and class probabilities.
Historically, this created a trade-off: two-stage models were more accurate but slower, while one-stage models were faster but less precise. However, modern advancements have blurred this line. State-of-the-art models like YOLO26 now utilize end-to-end architectures that rival the accuracy of two-stage detectors while maintaining the speed necessary for real-time inference.
Because of their emphasis on precision and recall, two-stage detectors are often preferred in scenarios where safety and detail are more critical than raw processing speed.
While two-stage detectors established the foundation for high-accuracy vision, modern developers often utilize advanced one-stage models that offer comparable performance with significantly easier deployment workflows. The Ultralytics Platform simplifies the training and deployment of these models, managing datasets and compute resources efficiently.
The following Python example demonstrates how to load and run inference using a modern object detection workflow with ultralytics, achieving high-accuracy results similar to traditional two-stage approaches but with greater efficiency:
from ultralytics import YOLO

# Load the YOLO26 model, a modern high-accuracy detector
model = YOLO("yolo26n.pt")

# Run inference on an image to detect objects
results = model("https://ultralytics.com/images/bus.jpg")

# Process results (bounding boxes, classes, and confidence scores)
for result in results:
    result.show()  # Display the detection outcomes
    print(result.boxes.conf)  # Print confidence scores
