Ultralytics YOLO26: The new standard for edge-first Vision AI
Learn how Ultralytics YOLO26 sets a new standard for edge-first Vision AI with end-to-end, NMS-free inference, faster CPU performance, and simplified production deployment.

Today, Ultralytics officially launches YOLO26, the most advanced and deployable YOLO model to date. First announced at YOLO Vision 2025 (YV25), YOLO26 represents a fundamental shift in how computer vision models are trained, deployed, and scaled in real-world systems.
Vision AI is rapidly moving to the edge. Increasingly, images and video are processed directly on devices, cameras, robots, and embedded systems, where latency, reliability, and cost matter more than raw cloud compute. YOLO26 is designed for this reality, delivering world-leading performance while running efficiently on CPUs, edge accelerators, and low-power hardware.
While YOLO26 is a significant leap forward, it still maintains the familiar, streamlined Ultralytics YOLO experience developers rely on. It fits seamlessly into existing workflows, supports a wide range of vision tasks, and remains easy to use, making adoption straightforward for both research and production teams.

In this article, we’ll break down everything you need to know about Ultralytics YOLO26 and what a lighter, smaller, and faster YOLO model means for the future of vision AI. Let’s get started!
Ultralytics YOLO26 is built around the idea that impactful vision AI capabilities should be easy to access for everyone. We believe that powerful computer vision tools shouldn’t be locked away or limited to a small group of organizations.
At YV25 in London, our Founder & CEO Glenn Jocher shared his thoughts on this vision, saying, “The most amazing AI technology is behind closed doors. It’s not open. Large companies control new developments, and everyone else has to wait in line for access. We have a different vision at Ultralytics. We want AI to be in everybody’s hands.”
He also explained that this means bringing AI out of the cloud and into real-world environments, adding, “We want the technology to not just hang out in the cloud, but to be drawn down into edge devices, into your phone, your vehicles, and low-power systems. And we want these amazing people who are creating solutions to have access to that.”
YOLO26 reflects this vision in practice: a model designed to run where vision AI is actually deployed, not where it’s easiest to prototype.
Like earlier Ultralytics YOLO models, YOLO26 supports multiple computer vision tasks within a single, unified model family. It is available in five sizes: Nano (n), Small (s), Medium (m), Large (l), and Extra Large (x), allowing teams to balance speed, accuracy, and model size depending on deployment constraints.
Beyond flexibility, YOLO26 raises the performance bar. Compared to YOLO11, the YOLO26 nano model delivers up to 43% faster CPU inference, making it one of the fastest high-accuracy object detection models available for edge and CPU-based deployment.
Here’s a closer look at the computer vision tasks supported by YOLO26: the family covers object detection, instance segmentation, pose estimation, image classification, and oriented bounding box (OBB) detection. All tasks support training, validation, inference, and export within a consistent framework.
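As a minimal sketch of that consistent framework, the snippet below loads pretrained weights for several tasks with the same Ultralytics Python API. The -seg, -pose, and -obb weight filenames follow the usual Ultralytics naming convention and are assumptions here, not filenames confirmed in this article.

```python
from ultralytics import YOLO

# Each task uses the same API; only the pretrained weights differ.
# Weight filenames below are assumed from the usual naming scheme.
detector = YOLO("yolo26n.pt")        # object detection
segmenter = YOLO("yolo26n-seg.pt")   # instance segmentation
poser = YOLO("yolo26n-pose.pt")      # pose estimation
obb_model = YOLO("yolo26n-obb.pt")   # oriented bounding box detection

# The same train / val / predict / export calls apply to every model above.
results = segmenter("path/to/image.jpg")
results[0].show()
```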
Ultralytics YOLO26 introduces several core innovations that improve inference speed, training stability, and deployment simplicity: the removal of Distribution Focal Loss (DFL), native end-to-end NMS-free inference, Progressive Loss Balancing (ProgLoss) with Small-Target-Aware Label Assignment (STAL), and a new hybrid optimizer called MuSGD.
Next, let’s walk through each of these next-generation features that make YOLO26 faster, more efficient, and easier to deploy.
Earlier YOLO models used Distribution Focal Loss (DFL) during training to improve bounding box precision. While effective, DFL introduced additional complexity and imposed fixed regression limits that made export and deployment more challenging, particularly on edge and low-power hardware.
YOLO26 removes DFL entirely, eliminating the fixed bounding box regression limits present in earlier models and improving reliability and accuracy when detecting very large objects.
By simplifying the bounding box prediction process, YOLO26 becomes easier to export and runs more reliably across a wide range of edge and low-power devices.
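To make the difference concrete, here is a small illustrative sketch, not the Ultralytics implementation: a DFL-style head decodes each box edge as the expectation over a fixed set of discrete bins, which caps the maximum predictable offset, while a direct regression head outputs the offset as a single unbounded value.

```python
import numpy as np


def dfl_decode(logits, num_bins=16):
    """DFL-style decoding: each box edge is the expected value of a softmax
    distribution over fixed bins, so the offset can never exceed num_bins - 1
    (the fixed regression limit)."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return float((probs * np.arange(num_bins)).sum())  # bounded by num_bins - 1


def direct_decode(pred):
    """Direct regression: the head outputs the offset itself,
    with no built-in upper bound on box size."""
    return float(pred)


# Illustrative values only (assumed, not measured).
edge_logits = np.random.randn(16)
print(dfl_decode(edge_logits))  # always within [0, 15]
print(direct_decode(27.3))      # can represent arbitrarily large offsets
```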
Traditional object detection pipelines rely on Non-Maximum Suppression (NMS) as a post-processing step to filter overlapping predictions. While effective, NMS adds latency, complexity, and fragility, especially when deploying models across multiple runtimes and hardware targets.
YOLO26 introduces a native end-to-end inference mode, where the model directly outputs final predictions without requiring NMS as a separate post-processing step. Duplicate predictions are handled inside the network itself.
Eliminating NMS reduces latency, simplifies deployment pipelines, and lowers the risk of integration errors, making YOLO26 particularly well-suited for real-time and edge deployments.
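For context, the sketch below is a minimal, simplified version of the greedy IoU-based NMS step that conventional pipelines run after the model. It is illustrative only and not code from YOLO26, which produces final predictions without this stage.

```python
import numpy as np


def iou(box, boxes):
    """IoU of one box against an array of boxes, all as (x1, y1, x2, y2)."""
    inter_w = np.clip(np.minimum(box[2], boxes[:, 2]) - np.maximum(box[0], boxes[:, 0]), 0, None)
    inter_h = np.clip(np.minimum(box[3], boxes[:, 3]) - np.maximum(box[1], boxes[:, 1]), 0, None)
    inter = inter_w * inter_h
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping boxes, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_thresh]
    return keep
```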
On the training side, YOLO26 introduces Progressive Loss Balancing (ProgLoss) and Small-Target-Aware Label Assignment (STAL). These improved loss and label-assignment strategies help stabilize training and improve detection accuracy.
ProgLoss helps the model learn more consistently during training, reducing instability and allowing it to converge more smoothly. Meanwhile, STAL focuses on improving how the model learns from small objects, which are often harder to detect due to limited visual detail.
Together, ProgLoss and STAL lead to more reliable detections, with noticeable improvements in small-object recognition. This is especially important for edge applications such as the Internet of Things (IoT), robotics, and aerial imagery, where objects are often small, distant, or partially visible.
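The exact formulations are internal to the training pipeline, but the sketch below illustrates the general idea of progressive loss balancing under assumed weights and a linear schedule: the relative weight of the loss terms shifts over the course of training rather than being held fixed. It is a conceptual illustration, not the ProgLoss used in YOLO26.

```python
def progressive_weights(epoch, total_epochs, start=(1.0, 0.5), end=(0.5, 1.0)):
    """Illustrative only: linearly shift the balance between two loss terms
    (e.g. classification vs. box regression) as training progresses.
    The actual ProgLoss schedule and weights in YOLO26 are not shown here."""
    t = min(epoch / max(total_epochs - 1, 1), 1.0)
    return tuple(s + (e - s) * t for s, e in zip(start, end))


def total_loss(cls_loss, box_loss, epoch, total_epochs):
    w_cls, w_box = progressive_weights(epoch, total_epochs)
    return w_cls * cls_loss + w_box * box_loss


# Example: early epochs emphasize one term, later epochs the other.
print(progressive_weights(0, 100))   # (1.0, 0.5)
print(progressive_weights(99, 100))  # (0.5, 1.0)
```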
With YOLO26, we adopted a new optimizer called MuSGD, designed to make training more stable and efficient. MuSGD is a hybrid approach that combines the strengths of traditional Stochastic Gradient Descent (SGD) with techniques inspired by Muon, an optimizer used in large language model (LLM) training.
SGD has been a reliable choice in computer vision for a long time, thanks to its simplicity and strong generalization. At the same time, recent advances in LLM training have shown that newer optimization methods can improve stability and speed when applied carefully. MuSGD brings some of these ideas into the computer vision space.
Inspired by the Muon-based training behind Moonshot AI’s Kimi K2, MuSGD incorporates optimization strategies that help the model converge more smoothly during training. This makes it possible for YOLO26 to reach strong performance faster while reducing training instability, especially in larger or more complex training setups.
MuSGD helps YOLO26 train more predictably across model sizes, contributing to both performance gains and training stability.
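MuSGD's internals aren't spelled out here, so the following is only a rough conceptual sketch of the hybrid idea described above: plain momentum SGD for most parameters, with a Muon-style orthogonalized momentum update (via a Newton-Schulz iteration) for 2-D weight matrices. It should not be read as the actual YOLO26 optimizer.

```python
import torch


def newton_schulz_orthogonalize(g, steps=5, eps=1e-7):
    """Approximately orthogonalize a 2-D momentum matrix, in the spirit of
    Muon-style optimizers (illustrative only, not MuSGD itself)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (g.norm() + eps)
    for _ in range(steps):
        xxT = x @ x.T
        x = a * x + (b * xxT + c * xxT @ xxT) @ x
    return x


@torch.no_grad()
def hybrid_step(params, momenta, lr=0.01, beta=0.9):
    """Momentum SGD for non-matrix params, orthogonalized momentum for matrices."""
    for p, m in zip(params, momenta):
        if p.grad is None:
            continue
        m.mul_(beta).add_(p.grad)
        update = newton_schulz_orthogonalize(m) if p.ndim == 2 else m
        p.add_(update, alpha=-lr)
```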
As vision AI continues to move closer to where data is generated, strong edge performance is becoming increasingly crucial. Specifically optimized for edge computing, YOLO26 delivers up to 43% faster CPU inference, ensuring real-time performance on devices without GPUs. This improvement enables responsive, reliable vision systems to run directly on cameras, robots, and embedded hardware, where latency, efficiency, and cost constraints define what’s possible.
Beyond architectural improvements that make object detection more accurate, YOLO26 also includes task-specific optimizations designed to improve performance across computer vision tasks. For instance, it enhances instance segmentation, pose estimation, and oriented bounding box detection with targeted updates that improve accuracy and reliability.

Ultralytics is also introducing YOLOE-26, a new family of open-vocabulary segmentation models built on YOLO26’s architecture and training innovations.
YOLOE-26 is not a new task or feature, but a specialized model family that reuses the existing segmentation task while enabling text prompts, visual prompts, and prompt-free inference. Available across all standard YOLO sizes, YOLOE-26 delivers stronger accuracy and more reliable real-world performance than earlier open-vocabulary segmentation models.
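As a hedged sketch of how this could look with the Ultralytics Python API, the example below follows the text-prompt workflow used by earlier YOLOE models; the yoloe-26s-seg.pt weight name is an assumption based on the usual naming convention, not a confirmed filename.

```python
from ultralytics import YOLOE

# Assumed weight filename following the usual YOLOE naming convention.
model = YOLOE("yoloe-26s-seg.pt")

# Text prompts: restrict predictions to the classes you describe.
names = ["person", "bus"]
model.set_classes(names, model.get_text_pe(names))

# Segment only the prompted classes in the image.
results = model.predict("path/to/bus.jpg")
results[0].show()
```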
From smart cameras to robots and tiny processing chips at the edge, computer vision and AI are increasingly deployed directly on-device for real-time inference. Ultralytics YOLO26 is built specifically for these environments, where low latency, efficiency, and reliable performance are vital.
In practice, this means YOLO26 can be deployed easily across a wide range of hardware. Specifically, through the Ultralytics Python package and its wide range of integrations, models can be exported into formats optimized for different platforms and hardware accelerators.
For example, exporting to TensorRT enables high-performance inference on NVIDIA GPUs, while CoreML supports native deployment on Apple devices, and OpenVINO optimizes performance on Intel hardware. YOLO26 can also be exported to run on multiple dedicated edge accelerators, enabling high-throughput, energy-efficient inference on specialized Edge AI hardware.
These are just a few examples, with many more integrations supported across edge and production environments. This flexibility enables a single YOLO26 model to run across diverse deployment targets. It streamlines production workflows and brings Vision AI closer to the edge.
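As a brief illustration, the snippet below uses the Ultralytics export API to produce some of these formats; which formats you actually need will depend on your target hardware and installed toolchains.

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# Export to deployment formats supported by the Ultralytics package.
model.export(format="engine")    # TensorRT engine for NVIDIA GPUs
model.export(format="coreml")    # Core ML for Apple devices
model.export(format="openvino")  # OpenVINO for Intel hardware
model.export(format="onnx")      # ONNX for broad runtime support
```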
Designed for real-world deployment, YOLO26 can be used across a wide range of computer vision use cases in different industries. Here are some examples of where it can be applied:
Smart cities: Across urban environments, YOLO26 can analyze video streams from traffic and public-space cameras. This enables applications like traffic monitoring, public safety, and infrastructure management at the edge.

Ultralytics YOLO26 can be used through two complementary workflows, depending on how you build and deploy vision AI.
Option 1: Use Ultralytics YOLO26 through the Ultralytics Platform (recommended)
The Ultralytics Platform provides a centralized way to train, deploy, and monitor YOLO26 models in production. It brings datasets, experiments, and deployments together in one place, making it easier to manage vision AI workflows at scale, especially for teams deploying to edge and production environments.
Through the Platform, users can manage the full workflow in one place, from datasets and training runs to deployment and monitoring.
👉 Explore YOLO26 on the Ultralytics Platform: platform.ultralytics.com/ultralytics/yolo26
Option 2: Use Ultralytics YOLO26 via open-source workflows
YOLO26 remains fully accessible through Ultralytics’ open-source ecosystem and can be used with existing Python-based workflows for training, inference, and export.
Developers can install the Ultralytics package, load pretrained YOLO26 models, and deploy them using familiar tools and formats such as ONNX, TensorRT, CoreML, or OpenVINO.
```bash
pip install ultralytics
```

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLO26n model
model = YOLO("yolo26n.pt")

# Run inference with the YOLO26n model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```

For users who prefer hands-on control or custom pipelines, full documentation and guides are available in the Ultralytics docs.
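The same model object also handles training and validation. The short sketch below fine-tunes on coco8.yaml, the small sample dataset config bundled with the Ultralytics package, which stands in here for your own dataset.

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# Fine-tune on a dataset described by a YAML file (coco8.yaml is the
# small sample dataset bundled with the Ultralytics package).
model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Evaluate the trained weights on the validation split.
metrics = model.val()
```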
Ultralytics YOLO26 is designed to meet the needs of tomorrow's vision AI solutions, where models will have to be fast, efficient, and easy to deploy on real hardware. By improving performance, simplifying deployment, and expanding what the model can do, YOLO26 fits naturally into a wide range of real-world applications. YOLO26 sets a new baseline for how vision AI is built, deployed, and scaled. We’re excited to see how the community uses it to ship real-world computer vision systems.
Join our growing community and explore our GitHub repository for hands-on AI resources. To build with Vision AI today, explore our licensing options. Learn how AI in agriculture is transforming farming and how Vision AI in robotics is shaping the future by visiting our solutions pages.