Discover how ONNX improves the portability and interoperability of AI models, enabling seamless deployment of Ultralytics YOLO models across diverse platforms.
ONNX (Open Neural Network Exchange) is an open-source format designed to represent machine learning models, allowing interoperability between various AI frameworks and tools. It serves as a universal translator for deep learning, enabling developers to build models in one framework—such as PyTorch, TensorFlow, or Scikit-learn—and seamlessly deploy them in another environment optimized for inference. By defining a common set of operators and a standard file format, ONNX eliminates the need for complex, custom conversion scripts that were historically required to move models from research to production. This flexibility is crucial for modern AI workflows, where training might occur on powerful cloud GPUs while deployment targets diverse hardware like edge devices, mobile phones, or web browsers.
In the rapidly evolving landscape of artificial intelligence, researchers and engineers often use different tools for different stages of the development lifecycle. A data scientist might prefer the flexibility of PyTorch for experimentation and training, while a production engineer needs the optimized performance of TensorRT or OpenVINO for deployment. Without a standard exchange format, moving a model between these ecosystems is difficult and error-prone.
ONNX bridges this gap by providing a shared definition of the computation graph. When a model is exported to ONNX, it is serialized into a format that captures the network structure (layers, connections) and parameters (weights, biases) in a framework-agnostic way. This allows inference engines specifically tuned for hardware acceleration—such as the ONNX Runtime—to execute the model efficiently across multiple platforms, including Linux, Windows, macOS, Android, and iOS.
Adopting the Open Neural Network Exchange format offers several strategic advantages for AI projects. Chief among them is hardware flexibility: a single .onnx file can be accelerated on NVIDIA GPUs, Intel CPUs, or mobile NPUs (Neural Processing Units) using tools like OpenVINO or CoreML.
The versatility of ONNX makes it a staple in diverse industries. Here are two concrete examples of its application:
Consider a mobile application designed for real-time crop health monitoring. The model might be trained on a powerful cloud server using a large dataset of plant images. However, the app needs to run offline on a farmer's smartphone. By exporting the trained model to ONNX, developers can integrate it into the mobile app using ONNX Runtime Mobile. This allows the phone's processor to run object detection locally, identifying pests or diseases instantly without needing an internet connection.
In e-commerce, a "virtual try-on" feature might use pose estimation to overlay clothing on a user's webcam feed. Training this model might happen in Python, but the deployment target is a web browser. Using ONNX, the model can be converted and run directly in the user's browser via ONNX Runtime Web. This utilizes the client's device capabilities (WebGL or WebAssembly) to perform computer vision tasks, ensuring a smooth, privacy-preserving experience since video data never leaves the user's computer.
It is helpful to distinguish ONNX from related formats and tools: ONNX itself defines a model format and operator specification, whereas engines such as ONNX Runtime, TensorRT, and OpenVINO load, optimize, and execute models expressed in that format.
The Ultralytics ecosystem simplifies the process of converting state-of-the-art models like YOLO26 into the ONNX format. The export functionality is built directly into the library, handling the complex graph traversal and operator mapping automatically.
The following example demonstrates how to export a pre-trained YOLO26 model to ONNX format for deployment:
from ultralytics import YOLO
# Load the YOLO26n model (Nano version recommended for edge deployment)
model = YOLO("yolo26n.pt")
# Export the model to ONNX format
# The 'dynamic' argument enables variable input sizes
path = model.export(format="onnx", dynamic=True)
print(f"Model exported successfully to: {path}")
Once exported, this .onnx file can be managed in the Ultralytics Platform or deployed directly to edge devices using the ONNX Runtime, making high-performance computer vision accessible in virtually any environment.