
Object Tracking

Learn how object tracking works in computer vision. Discover how to use [YOLO26](https://docs.ultralytics.com/models/yolo26/) to monitor movement, assign unique IDs, and gain real-time insights for AI projects.

Object tracking is a dynamic process in computer vision (CV) that involves identifying specific entities in a video and monitoring their movement across a sequence of frames. Unlike static image analysis, which treats each snapshot in isolation, tracking introduces the dimension of time. This allows artificial intelligence (AI) systems to assign a unique identification number (ID) to each detected item—such as a car, a person, or an animal—and maintain that identity as the object moves, changes orientation, or is temporarily obscured. This capability is the cornerstone of advanced video understanding, enabling machines to analyze behavior, calculate trajectories, and derive actionable insights from raw footage.

How Object Tracking Works

Modern tracking systems generally utilize a "tracking-by-detection" paradigm. This workflow combines powerful detection models with specialized algorithms to associate detections over time. The process typically follows three main stages:

  1. Detection: In every frame, an object detection model, such as the state-of-the-art YOLO26, scans the image to locate objects of interest. The model outputs bounding boxes that define the spatial extent of each object.
  2. Motion Prediction: Algorithms like the Kalman Filter estimate the future position of an object based on its current velocity and trajectory. This prediction reduces the search space for the next frame, making the system more efficient.
  3. Data Association: The system matches new detections to existing tracks using optimization methods like the Hungarian algorithm. This step often relies on metrics like Intersection over Union (IoU) to measure how much a predicted box overlaps with a new detection. Advanced trackers may also use visual feature extraction to re-identify objects that look similar.
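The prediction and association stages above can be sketched in a few lines of plain Python. This is a simplified illustration, not the implementation of any particular tracker: the box format, greedy matching, and the `0.3` threshold are assumptions, and production trackers typically use a Kalman Filter and the Hungarian algorithm instead of the constant-velocity model and greedy loop shown here.

```python
def iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def predict(box, velocity):
    """Constant-velocity motion model: shift the box by (vx, vy) per frame."""
    vx, vy = velocity
    return (box[0] + vx, box[1] + vy, box[2] + vx, box[3] + vy)


def associate(tracks, detections, threshold=0.3):
    """Greedily match each track's predicted box to new detections by IoU."""
    matches, unmatched = {}, set(range(len(detections)))
    for track_id, (box, vel) in tracks.items():
        pred = predict(box, vel)
        best = max(unmatched, key=lambda j: iou(pred, detections[j]), default=None)
        if best is not None and iou(pred, detections[best]) >= threshold:
            matches[track_id] = best  # existing track continues
            unmatched.discard(best)
    return matches, unmatched  # unmatched detections would start new tracks


# A track moving 10 px/frame to the right matches the nearby detection;
# the far-away detection is left unmatched and would get a new ID.
tracks = {42: ((0, 0, 50, 50), (10, 0))}
detections = [(12, 1, 61, 50), (300, 300, 350, 350)]
matches, new = associate(tracks, detections)
print(matches, new)  # {42: 0} {1}
```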

Object Tracking vs. Object Detection

While these terms are closely related, they serve distinct functions within the machine learning (ML) pipeline.

  • Object Detection answers the question, "What is present in this image and where?" It is stateless, meaning it has no memory of previous frames. If a car drives through a video, a detector sees a "car" in frame 1 and a "car" in frame 2, but does not know they are the same vehicle.
  • Object Tracking answers the question, "Where is this specific object going?" It is stateful. It connects the "car" in frame 1 to the "car" in frame 2, allowing the system to log that "Car ID #42" is moving left to right. This is essential for tasks like predictive modeling and counting.
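The stateful/stateless distinction can be made concrete with a toy example. The `CentroidTracker` class below is a hypothetical, deliberately minimal stateful wrapper: it remembers object positions between frames and reuses an ID when a new centroid lands close to a remembered one, which is a simplification of the data association that real trackers perform.

```python
class CentroidTracker:
    """Minimal stateful tracker: keeps IDs across frames by nearest centroid."""

    def __init__(self, max_dist=50):
        self.next_id = 0
        self.objects = {}  # id -> last known (cx, cy)
        self.max_dist = max_dist

    def update(self, centroids):
        assigned = {}
        free = dict(self.objects)  # objects not yet matched this frame
        for c in centroids:
            # Match the new centroid to the closest remembered object
            best = min(
                free,
                key=lambda i: (free[i][0] - c[0]) ** 2 + (free[i][1] - c[1]) ** 2,
                default=None,
            )
            if best is not None and (
                (free[best][0] - c[0]) ** 2 + (free[best][1] - c[1]) ** 2
            ) <= self.max_dist ** 2:
                assigned[best] = c  # same object: ID persists
                del free[best]
            else:
                assigned[self.next_id] = c  # new object: mint a new ID
                self.next_id += 1
        self.objects = assigned
        return assigned


tracker = CentroidTracker()
print(tracker.update([(100, 100)]))  # frame 1: car appears -> gets ID 0
print(tracker.update([(120, 100)]))  # frame 2: same car moved right, keeps ID 0
```

A stateless detector, by contrast, would report an anonymous "car" in each frame with no link between them.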

Real-World Applications

The ability to maintain object identity enables complex real-time inference applications across various industries.

  • Intelligent Transportation Systems: Tracking is vital for autonomous vehicles to navigate safely. By tracking pedestrians and other vehicles, cars can predict potential collisions. Furthermore, traffic engineers use these systems for speed estimation to enforce safety regulations and optimize traffic flow.
  • Retail Analytics: Brick-and-mortar stores use AI in retail to understand customer behavior. Tracking allows store managers to perform object counting to measure foot traffic, analyze dwell times in front of displays using heatmaps, and optimize queue management to reduce wait times.
  • Sports Analysis: In professional sports, coaches use tracking combined with pose estimation to analyze player biomechanics and team formations. This data provides a competitive edge by revealing patterns that are invisible to the naked eye.
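As an example of how tracked positions feed these applications, the speed-estimation use case reduces to simple arithmetic once a track's per-frame centroids are available. The helper below is an illustrative sketch: the pixel-to-metre scale and frame rate are assumed calibration values, and real systems must account for camera perspective.

```python
def estimate_speed_kmh(positions_px, fps=30.0, metres_per_px=0.05):
    """Estimate an object's speed from its tracked centroid positions.

    positions_px: list of (x, y) centroids, one per consecutive frame.
    fps: video frame rate; metres_per_px: assumed camera calibration scale.
    """
    if len(positions_px) < 2:
        return 0.0
    (x0, y0), (x1, y1) = positions_px[0], positions_px[-1]
    dist_m = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 * metres_per_px
    elapsed_s = (len(positions_px) - 1) / fps
    return dist_m / elapsed_s * 3.6  # convert m/s to km/h


# A car whose centroid moves 6 px/frame for 30 frames (1 s of video)
# covers 180 px = 9 m, i.e. about 32.4 km/h at this assumed scale.
print(estimate_speed_kmh([(i * 6, 0) for i in range(31)]))
```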

Implementing Tracking with Python

Ultralytics makes it simple to implement high-performance tracking. The track mode in the library automatically handles detection, motion prediction, and ID assignment. The example below shows how to use the YOLO26 model with the Ultralytics Python package to track objects in a video.

from ultralytics import YOLO

# Load the official YOLO26n model (nano version for speed)
model = YOLO("yolo26n.pt")

# Track objects in a video file, or a webcam stream with source=0
# 'show=True' displays the video with bounding boxes and unique IDs
results = model.track(source="path/to/video.mp4", show=True)

# Access the unique tracking IDs from the results
if results[0].boxes.id is not None:
    print(f"Detected Track IDs: {results[0].boxes.id.cpu().numpy()}")

Related Concepts

To fully understand the ecosystem of tracking, it is helpful to explore instance segmentation, which tracks the precise pixel-level contours of an object rather than just a box. Additionally, Multi-Object Tracking (MOT) challenges often involve widely used benchmarks like MOTChallenge to evaluate how well algorithms handle crowded scenes and occlusions. For deployment in production environments, developers often utilize tools like NVIDIA DeepStream or OpenCV to integrate these models into efficient pipelines.
