
Mixed Precision

Boost the efficiency of deep learning with mixed precision training! Achieve faster speeds, reduced memory usage, and energy savings without sacrificing accuracy.

Mixed precision is a pivotal technique in model optimization used to accelerate the training of deep learning models while reducing memory consumption. By strategically combining different numerical formats—typically 16-bit and 32-bit floating-point types—this method allows machine learning algorithms to perform calculations faster without sacrificing the model's final accuracy. It has become a standard practice in modern AI development, particularly for resource-intensive tasks like training the YOLO26 architecture on massive datasets.

How Mixed Precision Works

In traditional deep learning workflows, models usually perform calculations using single-precision floating-point format (FP32). Each number in FP32 requires 32 bits of memory. While highly precise, this format can be computationally expensive and memory-hungry.

Mixed precision introduces the use of half precision (FP16), which uses only 16 bits. However, using only FP16 can lead to numerical instability due to a smaller dynamic range. To solve this, mixed precision methods maintain a "master copy" of the model weights in FP32 for stability while using FP16 for the heavy lifting of mathematical operations, such as convolutions and matrix multiplications.

The process generally involves three key steps, sketched in code after this list:

  1. Casting: Converting the model's inputs and activations to FP16 to speed up execution on compatible hardware, such as NVIDIA Tensor Cores.
  2. Loss Scaling: Amplifying the loss function values to prevent "underflow," where small gradient updates become zeros in FP16.
  3. Accumulation: Performing the arithmetic operations in FP16 but accumulating the results in FP32 to preserve necessary information before updating the master weights.
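To make these steps concrete, here is a minimal PyTorch sketch using torch.cuda.amp. The linear model, random data, and optimizer settings are placeholders chosen purely for illustration (a CUDA device is assumed); this is not part of any Ultralytics training loop:

import torch
from torch import nn

# Master weights stay in FP32; autocast and GradScaler handle FP16 math and loss scaling.
model = nn.Linear(128, 10).cuda()  # placeholder model, FP32 parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()  # manages loss scaling

for _ in range(10):
    inputs = torch.randn(32, 128, device="cuda")
    targets = torch.randint(0, 10, (32,), device="cuda")

    optimizer.zero_grad()
    # 1. Casting: operations inside autocast run in FP16 where it is safe to do so
    with torch.cuda.amp.autocast():
        loss = nn.functional.cross_entropy(model(inputs), targets)

    # 2. Loss scaling: scale the loss before backward to prevent gradient underflow
    scaler.scale(loss).backward()

    # 3. Update: gradients are unscaled and applied to the FP32 master weights
    scaler.step(optimizer)
    scaler.update()

In practice, frameworks wrap this loop for you, which is why enabling a single flag such as amp=True is usually all that is needed.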

Benefits in AI Training

Adopting mixed precision offers significant advantages for developers and researchers who need to make effective use of computational resources:

  • Faster Training Speed: Operations in FP16 require less memory bandwidth and are processed faster by modern GPUs. This can reduce the time required for an epoch by substantial margins.
  • Reduced Memory Usage: Since FP16 tensors take up half the memory of FP32, developers can essentially double their batch size (see the quick check after this list). Larger batch sizes often lead to more stable gradient estimates and faster convergence.
  • Energy Efficiency: Reduced computational load translates to lower energy consumption, which is crucial for large-scale cloud training operations.
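The memory saving in the second point is easy to verify directly. Here is a quick PyTorch check; the tensor shape is arbitrary and chosen only for illustration:

import torch

# An FP32 element occupies 4 bytes; the same tensor cast to FP16 occupies 2 bytes,
# so activations and gradients stored in FP16 take roughly half the memory.
x_fp32 = torch.randn(1024, 1024)  # default dtype is torch.float32
x_fp16 = x_fp32.half()

print(x_fp32.element_size() * x_fp32.nelement())  # 4194304 bytes
print(x_fp16.element_size() * x_fp16.nelement())  # 2097152 bytes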

Real-World Applications

Mixed precision is utilized across various industries to handle complex models and large datasets efficiently.

Autonomous Driving

In the development of autonomous vehicles, engineers must train object detection models on millions of high-resolution video frames. Using mixed precision allows them to train state-of-the-art models like YOLO26 efficiently. The reduced memory footprint enables the processing of higher-resolution inputs, which is critical for detecting small objects like traffic signs or pedestrians at a distance.

Medical Image Analysis

Medical image analysis often involves 3D volumetric data from MRI or CT scans, which are extremely memory-intensive. Training segmentation models on this data in full FP32 precision often leads to "Out of Memory" (OOM) errors. Mixed precision enables researchers to fit these heavy models into GPU memory, facilitating the development of AI that can assist doctors in diagnosing diseases earlier.

Mixed Precision Implementation in Ultralytics

Modern frameworks such as PyTorch handle the complexities of mixed precision automatically via a feature called Automatic Mixed Precision (AMP). The ultralytics package enables AMP by default during training to ensure optimal performance.

Here is a concise example of how to initiate training with YOLO26, where mixed precision is active by default (controllable via the amp argument):

from ultralytics import YOLO

# Load the latest YOLO26 model
model = YOLO("yolo26n.pt")

# Train the model on the COCO8 dataset
# amp=True is the default setting for mixed precision training
results = model.train(data="coco8.yaml", epochs=5, imgsz=640, amp=True)

Mixed Precision vs. Related Concepts

It is helpful to distinguish mixed precision from similar terms in the glossary to avoid confusion:

  • Model Quantization: While mixed precision uses lower precision floating-point numbers (FP16) during training, quantization typically converts weights to integers (like INT8) after training for deployment. Quantization is primarily focused on inference latency on edge devices, whereas mixed precision focuses on training speed and stability.
  • Half Precision: This refers specifically to the FP16 data format itself. Mixed precision is the technique of using both FP16 and FP32 together. Using pure half precision without the "mixed" FP32 master copy often results in models that fail to converge due to numerical errors. The sketch after this list contrasts the two conversions.
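The contrast can also be seen in code. The following is a minimal PyTorch sketch comparing an FP16 cast with post-training dynamic INT8 quantization; the small Sequential model is a placeholder for illustration, not a YOLO model:

import copy

import torch
from torch import nn

# Placeholder model for illustration only; any small module works.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# Half precision: cast the weights to FP16 floating point, the format that
# mixed precision uses alongside FP32 during training.
fp16_model = copy.deepcopy(model).half()
print(next(fp16_model.parameters()).dtype)  # torch.float16

# Quantization: convert the Linear weights to INT8 integers after training,
# a post-training step aimed at inference on resource-constrained hardware.
int8_model = torch.ao.quantization.quantize_dynamic(
    copy.deepcopy(model), {nn.Linear}, dtype=torch.qint8
)
print(int8_model[0])  # prints a dynamically quantized Linear module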

Conclusion

Mixed precision has revolutionized how neural networks are trained, acting as a critical enabler for the massive foundation models and vision systems we see today. By balancing the need for mathematical precision with the constraints of hardware speed and memory, it allows developers to iterate faster and build more capable AI solutions.

For those looking to manage datasets and train optimized models seamlessly, the Ultralytics Platform offers a comprehensive environment that leverages these modern optimization techniques automatically.
