Explore Multi-Object Tracking (MOT): track and re-identify objects across video frames with YOLO11, Kalman Filters, appearance matching and modern data-association.
Multi-Object Tracking (MOT) is a computer vision (CV) task in which a system detects, identifies, and follows multiple distinct objects across a sequence of video frames. Standard object detection treats every frame as an isolated image; MOT adds a temporal dimension by linking detections of the same object from one frame to the next. Each detected instance, such as a specific car in traffic or a player on a sports field, is assigned a persistent identification number (ID), which lets the algorithm maintain object identities as they move, interact, and even temporarily disappear behind obstructions. This continuity is the foundation of video understanding and behavioral analysis.
Most contemporary MOT systems, including those powered by the state-of-the-art YOLO26, operate on a "tracking-by-detection" paradigm. This workflow repeats a cycle of detection and association on every frame, with the goal of keeping accuracy high and ID switches to a minimum.
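The detection-and-association cycle can be illustrated with a deliberately simplified sketch. It assumes detections arrive as (x1, y1, x2, y2) boxes and matches them to existing tracks greedily by intersection-over-union (IoU). The class name `GreedyIoUTracker` and its parameters `iou_threshold` and `max_missed` are hypothetical names chosen for this illustration; production trackers such as ByteTrack and BoT-SORT add motion prediction and more careful matching on top of this basic idea.

```python
from itertools import count


def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Toy tracker (hypothetical, for illustration): greedy IoU association."""

    def __init__(self, iou_threshold=0.3, max_missed=5):
        self.iou_threshold = iou_threshold  # minimum overlap to accept a match
        self.max_missed = max_missed        # frames a track may go unmatched
        self.tracks = {}                    # track_id -> {"box": ..., "missed": int}
        self._next_id = count(1)

    def update(self, detections):
        """Associate one frame's detections; return a list of (track_id, box)."""
        # Score every track/detection pair and visit the best overlaps first.
        pairs = sorted(
            (
                (iou(track["box"], det), tid, di)
                for tid, track in self.tracks.items()
                for di, det in enumerate(detections)
            ),
            key=lambda p: p[0],
            reverse=True,
        )
        used_tracks, used_dets, output = set(), set(), []
        for score, tid, di in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to be the same object
            if tid in used_tracks or di in used_dets:
                continue
            self.tracks[tid] = {"box": detections[di], "missed": 0}
            used_tracks.add(tid)
            used_dets.add(di)
            output.append((tid, detections[di]))

        # Unmatched tracks age; they are dropped after too many missed frames.
        for tid in list(self.tracks):
            if tid not in used_tracks:
                self.tracks[tid]["missed"] += 1
                if self.tracks[tid]["missed"] > self.max_missed:
                    del self.tracks[tid]

        # Unmatched detections start new tracks with fresh IDs.
        for di, det in enumerate(detections):
            if di not in used_dets:
                tid = next(self._next_id)
                self.tracks[tid] = {"box": det, "missed": 0}
                output.append((tid, det))
        return output


tracker = GreedyIoUTracker()
tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])   # two new tracks: IDs 1 and 2
tracker.update([(52, 50, 62, 60), (1, 0, 11, 10)])   # both objects move slightly
tracker.update([(2, 0, 12, 10)])                     # object 2 is briefly hidden
final = tracker.update([(3, 0, 13, 10), (54, 50, 64, 60)])
print(sorted(tid for tid, _ in final))               # object 2 keeps its ID
```

Because unmatched tracks are retained for up to `max_missed` frames, the second object recovers its original ID after a one-frame gap instead of being assigned a new one. Greedy matching is used here only for brevity; an optimal assignment method would be a natural substitute.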
Understanding the distinction between MOT and similar machine learning (ML) terms is crucial for selecting the right tool.
The ability to turn video feeds into structured data drives innovation across industries, enabling predictive modeling and automated decision-making.
The ultralytics package provides a seamless interface for MOT, integrating powerful algorithms like BoT-SORT and ByteTrack. The following example demonstrates how to load a model and track objects in a video stream.
```python
from ultralytics import YOLO

# Load a pre-trained YOLO model (YOLO11n is used here, YOLO26n is also supported)
model = YOLO("yolo11n.pt")

# Perform tracking on a video source with the ByteTrack tracker
# 'persist=True' keeps tracker state between successive track() calls
results = model.track(source="https://youtu.be/LNwODJXcvt4", persist=True, tracker="bytetrack.yaml")

# Visualize the first frame's results with IDs drawn
results[0].show()
```
This simple workflow handles detection, association, and ID assignment automatically, allowing developers to focus on higher-level logic like region counting or behavioral triggers. For more details on configuration, refer to the tracking mode documentation.
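As a hedged illustration of the higher-level logic mentioned above, the sketch below counts how many distinct track IDs pass through a rectangular region. It works on plain Python data: each frame is a list of (track_id, box) pairs with boxes in (x1, y1, x2, y2) format. The function names `box_center` and `ids_in_region` are hypothetical helpers written for this example, not part of the ultralytics package. With ultralytics results, such pairs could plausibly be built from each result's `boxes.id` and `boxes.xyxy` attributes, keeping in mind that `boxes.id` is `None` on frames without tracked objects.

```python
def box_center(box):
    """Return the centre point of a box in (x1, y1, x2, y2) format."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def ids_in_region(frames, region):
    """Collect track IDs whose box centre fell inside `region` at least once.

    frames: iterable of per-frame lists of (track_id, box) pairs.
    region: rectangle in (x1, y1, x2, y2) format.
    """
    rx1, ry1, rx2, ry2 = region
    seen = set()
    for tracked in frames:
        for track_id, box in tracked:
            cx, cy = box_center(box)
            if rx1 <= cx <= rx2 and ry1 <= cy <= ry2:
                seen.add(track_id)  # a set counts each ID only once
    return seen


# Three frames of made-up tracking output
frames = [
    [(1, (0, 0, 10, 10)), (2, (100, 100, 110, 110))],
    [(1, (40, 40, 50, 50))],
    [(1, (45, 45, 55, 55)), (3, (48, 48, 52, 52))],
]
visitors = ids_in_region(frames, region=(40, 40, 60, 60))
print(len(visitors))  # number of distinct objects that entered the region
```

Counting by ID rather than by detection is what makes this robust: an object that stays in the region for many frames is still counted once, which is only possible because the tracker keeps its ID stable.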