Explore how scalability empowers AI systems to handle growth. Learn to optimize MLOps with [Ultralytics YOLO26](https://docs.ultralytics.com/models/yolo26/) and the [Ultralytics Platform](https://platform.ultralytics.com) for high-performance, distributed training and seamless deployment.
Scalability refers to the capability of a system, network, or process to handle a growing amount of work by adding resources. In the context of Artificial Intelligence (AI) and Machine Learning (ML), scalability describes a model's or infrastructure's ability to maintain performance levels as demand increases. This demand typically manifests as larger datasets during training, higher user traffic during inference, or increased complexity in computational tasks. A scalable architecture allows for seamless expansion—whether deploying a computer vision model to a single embedded device or serving millions of API requests via cloud clusters—ensuring that inference latency remains low even under heavy load.
Designing for scalability is a critical component of successful Machine Learning Operations (MLOps). A model that functions perfectly in a controlled research environment may fail when exposed to the high-velocity data streams found in production. Effectively managing Big Data requires systems that can scale horizontally (adding more machines to a cluster) or vertically (adding more power, such as RAM or GPUs, to existing machines).
可扩展人工智能系统的关键优势包括:
创建可扩展的人工智能解决方案需要优化模型架构和部署基础设施。
在推理过程中提升可扩展性的有效方法之一是批量处理输入数据而非顺序处理。 这能最大限度GPU 并提升整体吞吐量。
from ultralytics import YOLO
# Load a scalable YOLO26 model (smaller 'n' version for speed)
model = YOLO("yolo26n.pt")
# Define a batch of images (URLs or local paths)
# Processing multiple images at once leverages parallel computation
batch_images = ["https://ultralytics.com/images/bus.jpg", "https://ultralytics.com/images/zidane.jpg"]
# Run inference on the batch
results = model(batch_images)
# Print the number of detections for the first image
print(f"Detected {len(results[0].boxes)} objects in the first image.")
可扩展性使人工智能技术能够从理论研究过渡到全球工业工具。
尽管常被混为一谈,可扩展性与性能和效率是截然不同的概念。