Distributed Training

Explore how distributed training scales AI workloads across multiple GPUs. Learn to accelerate Ultralytics YOLO26 training with DDP for faster, accurate results.

Distributed training is a method in machine learning where the workload of training a model is split across multiple processors or machines. This approach is essential for handling large-scale datasets and complex neural network architectures that would otherwise take an impractical amount of time to train on a single device. By combining the computational power of multiple Graphics Processing Units (GPUs) or Tensor Processing Units (TPUs), distributed training shortens training time, allowing researchers and engineers to iterate faster and, in turn, reach more accurate models sooner.

How Distributed Training Works

The core idea behind distributed training is parallelization. Instead of processing data sequentially on one chip, the task is divided into smaller chunks that are processed simultaneously. There are two primary strategies for achieving this:

  • Data Parallelism: This is the most common approach for tasks like object detection. In this setup, a copy of the entire model is placed on every device. The global training data is split into smaller batches, and each device processes a different batch at the same time. After each step, the gradients (the values used to update the model) are averaged across all devices, so every copy applies the same update and the model weights remain consistent.
  • Model Parallelism: When a neural network (NN) is too large to fit into the memory of a single GPU, the model itself is split across multiple devices. Different layers or components of the model reside on different chips, and data flows between them. This is often necessary for training massive foundation models and Large Language Models (LLMs).
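
The data-parallel step described above can be sketched in plain Python. This is a toy illustration, not how any particular framework implements it: the "workers" are ordinary loops standing in for GPUs, the model is a single scalar weight, and the function names (`local_gradient`, `all_reduce_mean`, `data_parallel_step`) are hypothetical. It assumes the batch divides evenly across workers, in which case averaging the per-worker gradients gives the same update as one device processing the whole batch.

```python
# Toy data parallelism: model y = w * x with a mean-squared-error loss.
# Plain Python loops stand in for GPUs; all names here are illustrative.

def local_gradient(w, shard):
    """Gradient of the MSE loss with respect to w on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker ends up with the same average."""
    return sum(values) / len(values)


def data_parallel_step(w, batch, num_workers, lr):
    """One synchronized step: shard the batch, average gradients, update w."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    size = len(batch) // num_workers
    shards = [batch[i * size:(i + 1) * size] for i in range(num_workers)]
    # On real hardware these gradient computations run at the same time.
    grads = [local_gradient(w, shard) for shard in shards]
    return w - lr * all_reduce_mean(grads)


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
w_parallel = data_parallel_step(0.0, batch, num_workers=2, lr=0.01)
w_single = 0.0 - 0.01 * local_gradient(0.0, batch)  # same step on one device
```

With equal shard sizes, `w_parallel` and `w_single` agree up to floating-point rounding, which is why synchronized data parallelism keeps every model copy identical after each step.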

Real-World Applications

Distributed training has transformed industries by making it possible to solve problems that were previously computationally infeasible.

  • Autonomous Driving: Developing safe autonomous vehicles requires analyzing petabytes of video and sensor data. Automotive engineers use large distributed clusters to train vision models for real-time semantic segmentation and lane detection. Training at this scale helps AI in automotive systems react reliably to diverse road conditions.
  • Medical Imaging: In the healthcare sector, analyzing high-resolution 3D scans like MRIs requires significant memory and processing power. Distributed training enables researchers to build high-performance diagnostic tools for tumor detection and other critical tasks. By using frameworks such as NVIDIA MONAI, hospitals can train models on diverse datasets without hitting memory bottlenecks, which supports better outcomes from AI in healthcare.

Utilizing Distributed Training with Ultralytics

The ultralytics library makes it straightforward to implement Distributed Data Parallel (DDP) training. You can scale your training of state-of-the-art YOLO26 models across multiple GPUs by simply specifying the device indices in your training arguments.

from ultralytics import YOLO

# Load a pre-trained YOLO26 model
model = YOLO("yolo26n.pt")

# Train the model using two GPUs (device 0 and 1)
# The library automatically handles the DDP communication backend
results = model.train(data="coco8.yaml", epochs=100, device=[0, 1])
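
When the number of GPUs is not known in advance, the `device` value can be derived from the GPU count. The sketch below is one way to do that; `choose_device` and `align_batch` are hypothetical helpers, not part of the ultralytics library. It assumes that `device` accepts a list of indices, a single index, or `"cpu"`, and that the total `batch` size should be a multiple of the GPU count in multi-GPU runs. Both assumptions are worth confirming against the Ultralytics documentation for your installed version.

```python
def choose_device(gpu_count):
    """Map a GPU count to a value for the `device` training argument."""
    if gpu_count >= 2:
        return list(range(gpu_count))  # e.g. [0, 1] for multi-GPU DDP
    if gpu_count == 1:
        return 0  # single GPU
    return "cpu"  # no GPU available


def align_batch(batch, gpu_count):
    """Round a total batch size down to a multiple of the GPU count."""
    n = max(gpu_count, 1)
    return max(n, (batch // n) * n)


# Hypothetical usage (needs torch and ultralytics, so it is left as a comment):
#   import torch
#   n = torch.cuda.device_count()
#   model.train(data="coco8.yaml", epochs=100,
#               device=choose_device(n), batch=align_batch(16, n))
```

For two GPUs, `choose_device(2)` gives `[0, 1]`, matching the `device` value in the example above.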

Distributed Training vs. Related Concepts

It is helpful to distinguish distributed training from similar terms in the machine learning ecosystem to understand their specific roles:

  • Vs. Federated Learning: While both involve multiple devices, their goals differ. Distributed training usually centralizes data in a high-performance cluster to maximize speed. In contrast, federated learning keeps data decentralized on user devices (like smartphones) to prioritize data privacy, updating the global model without raw data ever leaving the source.
  • Vs. High-Performance Computing (HPC): HPC is a broad field that includes supercomputing for scientific simulations like weather forecasting. Distributed training is a specific use of HPC techniques for the optimization algorithms behind deep learning. It often relies on specialized communication libraries like NVIDIA NCCL to minimize latency between GPUs.
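
To make the federated learning contrast concrete, here is a minimal sketch of the server-side aggregation step. It uses sample-weighted averaging in the spirit of federated averaging (FedAvg), a technique the text above does not name, so treat that choice as an assumption. The function name and data layout are illustrative. The point is that only model weights and sample counts reach the server, never the raw data.

```python
def federated_average(client_updates):
    """Combine client weights into global weights.

    client_updates is a list of (weights, num_samples) pairs, where weights
    is a list of floats trained locally on a device's private data.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    # Clients with more local samples contribute proportionally more.
    return [sum(w[i] * n for w, n in client_updates) / total for i in range(dim)]


# Two hypothetical clients: one with 10 local samples, one with 30.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
global_weights = federated_average(updates)
```

Data-parallel training, by comparison, synchronizes gradients after every step inside one cluster, while a federated round like this typically aggregates only after each client has trained locally for some time.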

Scaling with Cloud Platforms

Managing the infrastructure for distributed training can be complex. Modern platforms simplify this by offering managed environments. For example, the Ultralytics Platform allows users to manage datasets and initiate training runs that can be deployed to cloud environments or local clusters. This integration streamlines the workflow from data annotation to final model deployment, ensuring that scaling up to multiple GPUs is as seamless as possible. Similarly, cloud providers like Google Cloud Vertex AI and Amazon SageMaker provide robust infrastructure for running distributed training jobs at enterprise scale.
