Meet YOLO26: next-gen vision AI.
Ultralytics
Ultralytics Deploy

Deploy computer vision models across 43 global regions

Your trained models from browser testing to production endpoints in a few clicks with auto-scaling, real-time monitoring, and 19 export formats.

Ultralytics deployment interface showing model deployment options.

Deploy at global production scale

Move trained models into production with worldwide availability, broad export support, and the usage volume proven by the Ultralytics ecosystem.

0+
Deployment regions
0+
Export formats
0+
Daily usages
Ultralytics deployment card for a trained computer vision model.

Deploy to 43 regions worldwide

Deploy your models to dedicated endpoints across the Americas, Europe, Asia-Pacific, and the Middle East. Each endpoint has its own URL, auto-scaling, and monitoring.

Autoscaling deployment view for Ultralytics inference endpoints.

Auto-scaling that matches your traffic

Dedicated endpoints scale up for traffic spikes and down to zero when idle.

  • Scale to zero by default: No cost when your endpoint isn't receiving requests.
  • No rate limits: Dedicated endpoints have no throughput caps.
  • Configurable resources: Choose CPU (1-8 cores) and memory (1-32 GB) to match your workload.
Model export format selector for ONNX, TensorRT, CoreML, TF Lite, and other formats.

19 export formats. Your model. Any environment.

Ultralytics Platform supports cloud and edge deployment for high-performance. All Ultralytics YOLO models are natively optimized to run efficiently across environments, delivering high accuracy, reliable performance, and compatibility even on edge devices with limited compute resources.

Ultralytics monitoring dashboard for deployed model inference.

Monitor everything in production

Complete real-time visibility into your models' performance. Once your models are live, the deployments dashboard gives you a centralized overview of every running endpoint, with the metrics and toolkit you need to optimize and keep your frameworks running reliably.

  • Request volume: Total requests across all endpoints over the last 24 hours.
  • P95 latency: 95th percentile response time to track real-world use case performance.
  • Error rates: Clear alerts when error rates exceed 5%, with severity-filtered logs to diagnose issues fast.
  • Health checks: Live endpoint monitoring with auto-retry. Latency displayed per check.
Python code snippet for sending an image to a deployment endpoint.

Integrate in minutes

Every deployed endpoint comes with auto-generated code examples in Python, JavaScript, and cURL, pre-populated with your actual endpoint URL and API key. Copy, paste, and start sending inference requests from any application.

Test your model in the browser

Every trained model includes a built-in Predict tab functionality. Upload an image, or open your camera; the bounding boxes appear instantly.

Inference runs the top model performance automatically when you upload an image or change a parameter.
Fine-tune confidence thresholds, IoU settings, and image size to see how they affect predictions in real time.
Detection, instance segmentation, classification, pose estimation, and OBB are clearly depicted for your task.
Cloud endpoint deployment tab in Ultralytics Platform.

Try YOLO26 Inference

Drag and drop an image to see real-time object detection

Learn how to deploy!

Watch how to test a trained model, deploy it to a global endpoint, and monitor performance.

The engine behind the platform. The standard in vision AI.

From Ultralytics YOLOv5 to the groundbreaking YOLO26, Ultralytics builds and maintains the most widely adopted vision AI models on earth, trusted by millions of developers and the world's most ambitious enterprises.

Detect objects in an image or video with axis-aligned bounding boxes.
Create pixel-precise masks highlighting objects within a frame.
Assign one or multiple labels to an entire image.
Detect and label body keypoints for body position monitoring.
Detect rotated objects with angle-aware bounding boxes.
Two warehouse workers wearing yellow safety helmets and reflective vests inspecting wooden pallets.

Explore industry solutions

See how teams apply Ultralytics computer vision across production environments.

Real-time AI tailored to your operation

AI in Agriculture

Bring vision AI to smart agriculture with Ultralytics YOLO models. Power crop monitoring, livestock tracking, and precision farming for higher, smarter yields.

Learn more
Real-time AI that works with your operation

AI in Automotive

Apply computer vision in automotive with Ultralytics YOLO models. Vision AI elevates road safety, driver assistance, and vehicle automation for smarter roads.

Learn more
Real-time AI that works with your team

AI in Healthcare

Build healthcare solutions with Ultralytics YOLO models. Vision AI in healthcare powers faster medical imaging, smarter diagnostics, and patient monitoring.

Learn more
Real-time AI that works with your team

AI in Retail

Reimagine retail with Ultralytics YOLO models. Vision AI powers inventory tracking, shelf monitoring, queue management, and smarter customer insights.

Learn more
Real-time AI that works with your team

AI in Robotics

Power smarter machines with Ultralytics YOLO models. Vision AI in robotics drives autonomous navigation, perception, object tracking, and real-time control.

Learn more
Real-time AI that works with your team

AI in Manufacturing

Optimize manufacturing with Ultralytics YOLO models. Vision AI drives quality control, defect detection, PPE compliance, and assembly line automation.

Learn more
Real-time AI that works with your team

AI in Logistics

Streamline logistics with Ultralytics YOLO models. Vision AI enables package inspection, sorting, vehicle tracking, and real-time warehouse safety monitoring.

Learn more
Real-time AI tailored to your operation

AI in Agriculture

Bring vision AI to smart agriculture with Ultralytics YOLO models. Power crop monitoring, livestock tracking, and precision farming for higher, smarter yields.

Learn more
Real-time AI that works with your operation

AI in Automotive

Apply computer vision in automotive with Ultralytics YOLO models. Vision AI elevates road safety, driver assistance, and vehicle automation for smarter roads.

Learn more
Real-time AI that works with your team

AI in Healthcare

Build healthcare solutions with Ultralytics YOLO models. Vision AI in healthcare powers faster medical imaging, smarter diagnostics, and patient monitoring.

Learn more
Real-time AI that works with your team

AI in Retail

Reimagine retail with Ultralytics YOLO models. Vision AI powers inventory tracking, shelf monitoring, queue management, and smarter customer insights.

Learn more
Real-time AI that works with your team

AI in Robotics

Power smarter machines with Ultralytics YOLO models. Vision AI in robotics drives autonomous navigation, perception, object tracking, and real-time control.

Learn more
Real-time AI that works with your team

AI in Manufacturing

Optimize manufacturing with Ultralytics YOLO models. Vision AI drives quality control, defect detection, PPE compliance, and assembly line automation.

Learn more
Real-time AI that works with your team

AI in Logistics

Streamline logistics with Ultralytics YOLO models. Vision AI enables package inspection, sorting, vehicle tracking, and real-time warehouse safety monitoring.

Learn more
Real-time AI tailored to your operation

AI in Agriculture

Bring vision AI to smart agriculture with Ultralytics YOLO models. Power crop monitoring, livestock tracking, and precision farming for higher, smarter yields.

Learn more
Real-time AI that works with your operation

AI in Automotive

Apply computer vision in automotive with Ultralytics YOLO models. Vision AI elevates road safety, driver assistance, and vehicle automation for smarter roads.

Learn more
Real-time AI that works with your team

AI in Healthcare

Build healthcare solutions with Ultralytics YOLO models. Vision AI in healthcare powers faster medical imaging, smarter diagnostics, and patient monitoring.

Learn more
Real-time AI that works with your team

AI in Retail

Reimagine retail with Ultralytics YOLO models. Vision AI powers inventory tracking, shelf monitoring, queue management, and smarter customer insights.

Learn more
Real-time AI that works with your team

AI in Robotics

Power smarter machines with Ultralytics YOLO models. Vision AI in robotics drives autonomous navigation, perception, object tracking, and real-time control.

Learn more
Real-time AI that works with your team

AI in Manufacturing

Optimize manufacturing with Ultralytics YOLO models. Vision AI drives quality control, defect detection, PPE compliance, and assembly line automation.

Learn more
Real-time AI that works with your team

AI in Logistics

Streamline logistics with Ultralytics YOLO models. Vision AI enables package inspection, sorting, vehicle tracking, and real-time warehouse safety monitoring.

Learn more

Frequently asked questions

  • Yes. Each model can be deployed to multiple regions simultaneously. Your plan determines the total number of endpoints available: 3 for Free, 10 for Pro, and unlimited for Enterprise. This allows you to serve users globally with low-latency endpoints in each region.

  • Dedicated endpoints are billed based on CPU, memory, and request volume. With scale-to-zero enabled by default, you only pay for active inference time. There is no cost when your endpoint isn't receiving requests. Shared inference is included with your platform plan.

  • Shared inference runs on a multi-tenant service across 3 regions and is rate-limited to 20 requests per minute. It's best for development and quick testing. Dedicated endpoints are single-tenant services deployed to any of 43 regions with no rate limits, consistent latency, and configurable resources, built for scalable production workloads.

  • Dedicated endpoint deployment typically takes one to two minutes. This includes container provisioning, startup, and an initial health check to validate the service is ready. Once the endpoint is ready, it begins accepting inference requests immediately.

  • Model deployment is the process of making a trained computer vision model available to receive and process real-world data. Once deployed, computer vision applications can send images and video frames to the model via API and receive predictions, enabling everything from automated quality inspection to real-time object detection in production systems. On Ultralytics Platform, deployment is integrated directly into the end-to-end training workflow. Once your model is trained, you can test it in the browser, deploy it to a dedicated endpoint in any of 43 global regions, and monitor its performance, all from the same workspace.

Start deploying today!

Take your trained models to production across 43 global regions with auto-scaling and real-time monitoring.