Explore how scalability empowers AI systems to handle growth. Learn to optimize MLOps with [Ultralytics YOLO26](https://docs.ultralytics.com/models/yolo26/) and the [Ultralytics Platform](https://platform.ultralytics.com) for high-performance, distributed training and seamless deployment.
Scalability refers to the capability of a system, network, or process to handle a growing amount of work by adding resources. In the context of Artificial Intelligence (AI) and Machine Learning (ML), scalability describes a model's or infrastructure's ability to maintain performance levels as demand increases. This demand typically manifests as larger datasets during training, higher user traffic during inference, or increased complexity in computational tasks. A scalable architecture allows for seamless expansion—whether deploying a computer vision model to a single embedded device or serving millions of API requests via cloud clusters—ensuring that inference latency remains low even under heavy load.
Designing for scalability is a critical component of successful Machine Learning Operations (MLOps). A model that functions perfectly in a controlled research environment may fail when exposed to the high-velocity data streams found in production. Effectively managing Big Data requires systems that can scale horizontally (adding more machines to a cluster) or vertically (adding more power, such as RAM or GPUs, to existing machines).
スケーラブルなAIシステムの主な利点には以下が含まれます:
スケーラブルなAIソリューションを構築するには、モデル・アーキテクチャと導入インフラの両方を最適化する必要がある。
推論時のスケーラビリティを向上させる効果的な手法の一つは、入力を順次処理するのではなくバッチ処理することである。 GPU 最大化され、全体のスループットが向上する。
from ultralytics import YOLO
# Load a scalable YOLO26 model (smaller 'n' version for speed)
model = YOLO("yolo26n.pt")
# Define a batch of images (URLs or local paths)
# Processing multiple images at once leverages parallel computation
batch_images = ["https://ultralytics.com/images/bus.jpg", "https://ultralytics.com/images/zidane.jpg"]
# Run inference on the batch
results = model(batch_images)
# Print the number of detections for the first image
print(f"Detected {len(results[0].boxes)} objects in the first image.")
スケーラビリティにより、AI技術は理論研究から世界的な産業ツールへと移行することが可能となる。
頻繁に混同されるが、スケーラビリティはパフォーマンスや効率性とは異なる概念である。