Mixed Precision
Boost deep learning efficiency with mixed precision training! Achieve faster training, reduced memory usage, and energy savings without sacrificing accuracy.
Mixed precision is a technique used in deep learning to speed up model training and reduce memory consumption. It involves using a combination of lower-precision numerical formats, such as 16-bit floating-point (FP16), and higher-precision formats, such as 32-bit floating-point (FP32), during computation. By strategically using lower-precision numbers for the bulk of the computation, such as matrix multiplications, while keeping critical components like weight updates in higher precision, mixed precision training can significantly accelerate performance on modern GPUs without a substantial loss in model accuracy.
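As a rough illustration of the memory side of this, a tensor stored in FP16 takes half the bytes of the same tensor in FP32; the short PyTorch snippet below (the framework choice is just for illustration) makes the ratio concrete:

```python
import torch

# Compare the memory footprint of the same tensor in FP32 and FP16.
x_fp32 = torch.randn(1024, 1024)   # 32-bit floats: 4 bytes per element
x_fp16 = x_fp32.half()             # 16-bit floats: 2 bytes per element

print(x_fp32.element_size() * x_fp32.nelement())  # 4194304 bytes (~4 MB)
print(x_fp16.element_size() * x_fp16.nelement())  # 2097152 bytes (~2 MB)
```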
How Mixed Precision Works
The core idea behind mixed precision is to leverage the speed and memory efficiency of lower-precision data types. Modern hardware, especially NVIDIA GPUs with Tensor Cores, can perform operations on 16-bit numbers much faster than on 32-bit numbers. The process typically involves three key steps, illustrated in the sketch after this list:
- Casting to Lower Precision: Most of the model's operations, particularly the computationally intensive matrix multiplications and convolutions, are performed using half-precision (FP16) arithmetic. This reduces the memory footprint and speeds up calculations.
- Maintaining a Master Copy of Weights: To preserve model accuracy and numerical stability, a master copy of the model's weights is kept in the standard 32-bit floating-point (FP32) format. This master copy is used to accumulate gradients and apply weight updates during training.
- Loss Scaling: To prevent numerical underflow (where small gradient values become zero when converted to FP16), a technique called loss scaling is used. The loss is multiplied by a scaling factor before backpropagation so that gradient values stay within FP16's representable range. Before the weights are updated, the gradients are divided by the same factor to undo the scaling.
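A minimal, hand-rolled sketch of these three steps in PyTorch is shown below. It assumes a single weight matrix, plain SGD, and a fixed scaling factor; production implementations adjust the scale dynamically and skip updates when gradients overflow:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Step 2: master copy of the weights kept in FP32.
master_w = torch.randn(512, 512, device=device, requires_grad=True)
optimizer = torch.optim.SGD([master_w], lr=1e-3)

scale_factor = 2.0 ** 16  # fixed loss-scaling constant (step 3)

x = torch.randn(64, 512, device=device)
target = torch.randn(64, 512, device=device)

for _ in range(3):
    optimizer.zero_grad()

    # Step 1: run the expensive matrix multiplication in FP16.
    # (FP16 matmul targets GPUs; recent PyTorch builds also support it on CPU.)
    w_fp16 = master_w.half()
    out = (x.half() @ w_fp16).float()
    loss = torch.nn.functional.mse_loss(out, target)

    # Step 3: scale the loss so small gradients stay representable in FP16.
    (loss * scale_factor).backward()

    # Undo the scaling on the gradients before updating the FP32 master weights.
    master_w.grad.div_(scale_factor)
    optimizer.step()
```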
Deep learning frameworks like PyTorch and TensorFlow have built-in support for automatic mixed precision (AMP), which handles the casting and loss scaling automatically, making the technique easy to adopt.
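For instance, a typical PyTorch training loop using its automatic mixed precision utilities (torch.cuda.amp.autocast and GradScaler) looks roughly like the sketch below; the model and data are placeholders, and a CUDA GPU is assumed:

```python
import torch

device = "cuda"  # FP16 autocast targets CUDA GPUs
model = torch.nn.Linear(512, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # handles loss scaling automatically

x = torch.randn(64, 512, device=device)
y = torch.randint(0, 10, (64,), device=device)

for _ in range(3):
    optimizer.zero_grad()
    # autocast runs eligible ops (e.g., matmuls) in FP16 and the rest in FP32.
    with torch.cuda.amp.autocast():
        logits = model(x)
        loss = torch.nn.functional.cross_entropy(logits, y)
    scaler.scale(loss).backward()  # scale the loss before backpropagation
    scaler.step(optimizer)         # unscale gradients, then update FP32 weights
    scaler.update()                # adjust the scale factor for the next step
```

The same pattern is available in TensorFlow via tf.keras.mixed_precision, which likewise wraps the optimizer with loss scaling.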
Applications and Examples
Mixed precision is widely adopted for training large-scale machine learning (ML) models, such as large language models and other transformer-based architectures, where the savings in memory and training time are essential.
Related Concepts
Mixed precision is one of several optimization techniques used to make deep learning models more efficient. It's important to distinguish it from related concepts:
- Model Quantization: Quantization reduces model size and computational cost by converting floating-point numbers (like FP32 or FP16) into lower-bit integer formats, such as INT8. While mixed precision uses different floating-point formats during training, quantization is typically applied after training (post-training quantization) or during it (quantization-aware training) to optimize for inference, especially on edge devices (see the quantization sketch after this list).
- Model Pruning: Pruning is a technique that involves removing redundant or unimportant connections (weights) from a neural network. Unlike mixed precision, which changes the numerical format of weights, pruning alters the model's architecture itself to reduce its size and complexity. These techniques can be used together to achieve even greater performance gains.
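To make the contrast concrete, the sketch below applies post-training dynamic quantization to a small placeholder model in PyTorch, converting the weights of its linear layers to INT8 for inference; unlike mixed precision, nothing here affects how the model was trained:

```python
import torch

# Placeholder FP32 model standing in for an already-trained network.
model_fp32 = torch.nn.Sequential(
    torch.nn.Linear(512, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# Post-training dynamic quantization: Linear weights are stored as INT8.
model_int8 = torch.quantization.quantize_dynamic(
    model_fp32, {torch.nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(model_int8(x).shape)  # torch.Size([1, 10])
```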