Learn how exporting Ultralytics YOLO11 to the OpenVINO™ format enables lightning-fast inference on Intel® hardware, enhancing speed, scalability, and accuracy.

AI adoption depends on AI solutions being accessible, and a huge part of that is making them easy to deploy on the hardware people already have. Running AI models on GPUs (graphics processing units) is a great option in terms of performance and parallel processing power.
However, the reality is that not everyone has access to high-end GPUs, especially in edge environments or on everyday laptops. That’s why it’s so important to optimize models to run efficiently on more widely available hardware like central processing units (CPUs), integrated GPUs, and neural processing units (NPUs).
Computer vision, for example, is a branch of AI that enables machines to analyze and understand images and video streams in real time. Vision AI models like Ultralytics YOLO11 support key tasks such as object detection and instance segmentation that power applications from retail analytics to medical diagnostics.
To make computer vision more broadly accessible, Ultralytics has released an updated integration with the OpenVINO toolkit, which is an open-source project for optimizing and running AI inference across CPUs, GPUs, and NPUs.
With this integration, it’s easier to export and deploy YOLO11 models with up to 3× faster inference on CPUs and accelerated performance on Intel GPUs and NPUs. In this article, we’ll walk through how to use the Ultralytics Python package to export YOLO11 models to the OpenVINO format and use them for inference. Let’s get started!
Before we dive into the details of the OpenVINO integration supported by Ultralytics, let’s take a closer look at what makes YOLO11 a reliable and impactful computer vision model. YOLO11 is the latest model in the Ultralytics YOLO series, offering significant enhancements in both speed and accuracy.
One of its key highlights is efficiency. For instance, Ultralytics YOLO11m has 22% fewer parameters than Ultralytics YOLOv8m, yet it achieves higher mean average precision (mAP) on the COCO dataset. This means it runs faster and also detects objects more accurately, making it ideal for real-time applications where performance and responsiveness are critical.
Beyond object detection, YOLO11 supports various advanced computer vision tasks such as instance segmentation, pose estimation, image classification, object tracking, and oriented bounding box detection. YOLO11 is also developer-friendly, with the Ultralytics Python package providing a simple and consistent interface for training, evaluating, and deploying models.
In addition to this, the Ultralytics Python package supports various integrations and multiple export formats, including OpenVINO, ONNX, and TorchScript, allowing you to easily integrate YOLO11 into various deployment pipelines. Whether you're targeting cloud infrastructure, edge devices, or embedded systems, the export process is straightforward and adaptable to your hardware needs.
OpenVINO™ (Open Visual Inference and Neural Network Optimization) is an open-source toolkit for optimizing and deploying AI inference across a wide range of hardware. It enables developers to run high-performance inference applications efficiently across various Intel platforms, including CPUs, integrated and discrete GPUs, NPUs, and field-programmable gate arrays (FPGAs).
OpenVINO provides a unified runtime interface that abstracts hardware differences through device-specific plugins. This means developers can write code once and deploy across multiple Intel hardware targets using a consistent API.
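To make the "write once, deploy anywhere" idea concrete, here is a minimal sketch of device selection against OpenVINO's plugin model. The `pick_device` helper and its preference order are illustrative, not part of any library; the commented lines show where the real OpenVINO calls would go.

```python
# Sketch of picking a deployment target from the devices OpenVINO reports.
# The helper itself is pure Python; only the commented lines need OpenVINO.
def pick_device(available, preferred=("NPU", "GPU", "CPU")):
    """Return the first preferred device that the runtime reports as available."""
    for device in preferred:
        if device in available:
            return device
    return "CPU"  # the CPU plugin is the usual fallback


# With OpenVINO installed, `available` comes from the runtime itself:
#   import openvino as ov
#   core = ov.Core()
#   device = pick_device(core.available_devices)
#   compiled = core.compile_model("yolo11n_openvino_model/yolo11n.xml", device)
```

Because the compile call is the same for every target, switching hardware is just a different device string.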
Here are some of the key features that make OpenVINO a great choice for deployment:
- Broad hardware support: the same model can run across Intel CPUs, integrated and discrete GPUs, NPUs, and FPGAs.
- Unified API: device-specific plugins abstract away hardware differences, so code written once can target multiple devices.
- Model optimization: tools for converting and compressing models, such as post-training quantization, help speed up inference.
- Production-ready execution: features like asynchronous inference and load balancing support demanding, real-world workloads.
Now that we’ve explored what OpenVINO is and its significance, let’s discuss how to export YOLO11 models to the OpenVINO format and run efficient inference on Intel hardware.
To export a model to the OpenVINO format, you’ll first need to install the Ultralytics Python package. This package provides everything you need to train, evaluate, and export YOLO models, including YOLO11.
You can install it by running the command "pip install ultralytics" in your terminal or command prompt. If you're working in an interactive environment like Jupyter Notebook or Google Colab, just add an exclamation mark before the command.
Also, if you run into any issues during installation or while exporting, the Ultralytics documentation and troubleshooting guides are great resources to help you get back on track.
Once the Ultralytics package is set up, the next step is to load your YOLO11 model and convert it into a format compatible with OpenVINO.
In the example below, we’re using a pre-trained YOLO11 model (“yolo11n.pt”). Its export method converts it into the OpenVINO format. After running this code, the converted model will be saved in a new directory named “yolo11n_openvino_model”.
from ultralytics import YOLO

# Load the pre-trained YOLO11 nano model
model = YOLO("yolo11n.pt")

# Export it to OpenVINO format; the result is saved to "yolo11n_openvino_model/"
model.export(format="openvino")
Once your YOLO11 model is exported to OpenVINO format, you can run inference in two ways: using the Ultralytics Python package or the native OpenVINO Runtime.
The exported YOLO11 model can be easily deployed using the Ultralytics Python package, as shown in the code snippet below. This method is ideal for quick experimentation and streamlined deployment on Intel hardware.
You can also specify which device to use for inference, such as "intel:cpu", "intel:gpu", or "intel:npu", depending on the Intel hardware available on your system.
from ultralytics import YOLO

# Load the exported OpenVINO model directory
ov_model = YOLO("yolo11n_openvino_model/")

# Run inference on an image, targeting an Intel GPU
results = ov_model("https://ultralytics.com/images/bus.jpg", device="intel:gpu")
After running the above code, the output image will be saved in the "runs/detect/predict" directory.
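Beyond saving annotated images, you’ll often want to work with the detections programmatically. As a framework-agnostic illustration of that kind of post-processing, here is a minimal confidence filter; the `Detection` tuple is a hypothetical stand-in for the per-box fields (class, score, coordinates) a YOLO11 result exposes, not the Ultralytics API itself.

```python
# Illustrative post-processing sketch: filtering detections by confidence.
# `Detection` is a hypothetical stand-in for a detected box, not a real
# Ultralytics class.
from typing import NamedTuple


class Detection(NamedTuple):
    cls: str
    conf: float
    xyxy: tuple  # (x1, y1, x2, y2) in pixels


def keep_confident(detections, threshold=0.5):
    """Drop boxes below the confidence threshold, highest confidence first."""
    return sorted(
        (d for d in detections if d.conf >= threshold),
        key=lambda d: d.conf,
        reverse=True,
    )
```

In an application, a filter like this is where you would apply business rules, for example only acting on high-confidence detections of specific classes.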
If you are looking for a customizable way to run inference, especially in production environments, the OpenVINO Runtime gives you more control over how your model is executed. It supports advanced features such as asynchronous execution (running multiple inference requests in parallel) and load balancing (distributing inference workloads efficiently across Intel hardware).
To use the native runtime, you’ll need the exported model files: a .xml file (which defines the network architecture) and a .bin file (which stores the model’s trained weights). You can also configure additional parameters like input dimensions or preprocessing steps depending on your application.
A typical deployment flow includes initializing the OpenVINO core, loading and compiling the model for a target device, preparing the input, and executing inference. For detailed examples and step-by-step guidance, refer to the official Ultralytics OpenVINO documentation.
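The four steps above can be sketched as follows. This is a minimal illustration, assuming a model already exported to “yolo11n_openvino_model/” (the .xml/.bin pair); the `preprocess` helper is a simplified stand-in for YOLO’s real letterbox preprocessing, and raw outputs would still need decoding into boxes.

```python
# Minimal sketch of the native OpenVINO Runtime flow: initialize the core,
# compile the model for a device, prepare the input, run inference.
import numpy as np


def preprocess(image: np.ndarray, size: int = 640) -> np.ndarray:
    """Simplified stand-in for letterboxing: pad/crop an HWC uint8 image
    to a (1, 3, size, size) float32 tensor scaled to [0, 1]."""
    canvas = np.zeros((size, size, 3), dtype=np.float32)
    h, w = image.shape[:2]
    canvas[: min(h, size), : min(w, size)] = image[:size, :size] / 255.0
    return canvas.transpose(2, 0, 1)[None]  # HWC -> NCHW, add batch dim


def run_inference(xml_path: str, image: np.ndarray, device: str = "CPU"):
    import openvino as ov  # imported lazily so preprocess() works without OpenVINO

    core = ov.Core()                                  # 1. initialize the runtime
    compiled = core.compile_model(xml_path, device)   # 2. load + compile for a device
    input_tensor = preprocess(image)                  # 3. prepare the input
    return compiled(input_tensor)[compiled.output(0)] # 4. execute inference
```

A call like `run_inference("yolo11n_openvino_model/yolo11n.xml", frame, "GPU")` would return the raw output tensor for one frame; production code would typically also use async inference requests for throughput.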
While exploring Ultralytics integrations, you’ll notice that the Ultralytics Python package supports exporting YOLO11 models to a variety of formats such as TorchScript, CoreML, TensorRT, and ONNX. So, why choose the OpenVINO integration?
Here are some reasons why the OpenVINO export format is a great fit for deploying models on Intel hardware:
- Faster CPU inference: exported YOLO11 models can run up to 3× faster on Intel CPUs.
- Acceleration beyond the CPU: the same exported model also benefits from Intel integrated and discrete GPUs and NPUs.
- No retraining or code changes: you can switch deployment targets with a device string rather than modifying your model.
- One API, many devices: OpenVINO’s unified runtime lets a single inference pipeline cover a range of Intel platforms.
You can also evaluate the performance benchmarks for the YOLO11 model across a range of Intel® platforms on the OpenVINO™ Model Hub. The OpenVINO Model Hub is a resource for developers to evaluate AI models on Intel hardware and discover the performance advantage of OpenVINO across Intel CPUs, built-in GPUs, NPUs, and discrete graphics.
With the help of the OpenVINO integration, deploying YOLO11 models across Intel hardware in real-world situations becomes much simpler.
A great example is smart retail, where YOLO11 can help detect empty shelves in real time, track which products are running low, and analyze how customers move through the store. This enables retailers to improve inventory management and optimize store layouts for better shopper engagement.
Similarly, in smart cities, YOLO11 can be used to monitor traffic by counting vehicles, tracking pedestrians, and detecting red-light violations in real time. These insights can support traffic flow optimization, improve road safety, and assist in automated enforcement systems.
Another interesting use case is industrial inspection, where YOLO11 can be deployed on production lines to automatically detect visual defects such as missing components, misalignment, or surface damage. This boosts efficiency, lowers costs, and supports better product quality.
While deploying YOLO11 models with OpenVINO, here are a few important things to keep in mind to get the best results:
- Match the device string ("intel:cpu", "intel:gpu", or "intel:npu") to the hardware actually available on your system.
- For quick experiments, the Ultralytics Python package is the simplest path; for production workloads that need asynchronous execution or load balancing, use the native OpenVINO Runtime.
- Benchmark on your target platform; the OpenVINO Model Hub can help you compare performance across Intel CPUs, GPUs, and NPUs before you commit to a deployment setup.
- If you hit issues during export or inference, the Ultralytics documentation and troubleshooting guides are a good first stop.
Exporting Ultralytics YOLO11 to the OpenVINO format makes it easy to run fast, efficient Vision AI models on Intel hardware. You can deploy across CPUs, GPUs, and NPUs without needing to retrain or change your code. It’s a great way to boost performance while keeping things simple and scalable.
With support built into the Ultralytics Python package, exporting and running inference with OpenVINO is straightforward. In just a few steps, you can optimize your model and run it across a variety of Intel platforms. Whether you're working on smart retail, traffic monitoring, or industrial inspection, this workflow helps you move from development to deployment with speed and confidence.
Join the YOLO community and check out the Ultralytics GitHub repository to learn more about impactful integrations supported by Ultralytics. Also, take a look at the Ultralytics licensing options to get started with computer vision today!
Register for our upcoming webinar to see the Ultralytics × OpenVINO integration in action, and visit the OpenVINO website to explore tools for optimizing and deploying AI at scale.