
Batch Size

Explore the impact of batch size on deep learning. Optimize training speed, memory usage, and model performance efficiently.

In machine learning, and particularly deep learning, Batch Size refers to the number of training examples used in one iteration of model training. Rather than feeding the entire training dataset into the neural network at once, which is often computationally infeasible due to memory constraints, the dataset is divided into smaller subsets called batches. The model processes one batch, calculates the error, and updates its internal weights via backpropagation before moving on to the next batch. This hyperparameter plays a pivotal role in determining both the speed of training and the stability of the learning process.
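
To make this loop concrete, here is a minimal PyTorch sketch. The data, model, and hyperparameters are toy placeholders; the point is simply that batch_size=32 controls how many samples each weight update sees.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data: 1,000 samples with 10 features each (placeholder values).
X, y = torch.randn(1000, 10), torch.randn(1000, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for xb, yb in loader:              # one iteration per batch of 32
    loss = loss_fn(model(xb), yb)  # forward pass and error calculation
    optimizer.zero_grad()
    loss.backward()                # backpropagation on this batch only
    optimizer.step()               # weight update before the next batch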

The Dynamics of Training with Batches

The choice of batch size fundamentally alters how the optimization algorithm, typically a variant of stochastic gradient descent, navigates the loss landscape.

  • Small Batch Sizes: Using a small number (e.g., 8 or 16) results in "noisy" updates. While the gradient estimate is less accurate for the dataset as a whole, this noise can sometimes help the model escape local minima, potentially leading to better generalization. However, smaller batches require more updates per epoch, which can make training slower in wall-clock time due to overhead. The sketch after this list illustrates the noise effect.
  • Large Batch Sizes: A larger batch (e.g., 128 or 256) provides a more accurate estimate of the gradient, leading to smoother convergence of the loss function. It also allows for massive parallelization on modern hardware, significantly speeding up computation. However, if the batch is too large, the model may settle into sharp, suboptimal minima, leading to overfitting and a reduced ability to generalize to new data.
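
The noise trade-off is easy to observe directly. The sketch below uses hypothetical linear-regression data (X, y, and the weights w are synthetic) and measures how much the mini-batch gradient estimate varies across 100 random batches of each size; smaller batches produce a visibly larger spread.

import torch

torch.manual_seed(0)
# Hypothetical data used only to probe gradient noise.
X, y = torch.randn(10_000, 5), torch.randn(10_000)
w = torch.zeros(5)

def batch_gradient(size: int) -> torch.Tensor:
    """MSE gradient estimated from a random mini-batch of the given size."""
    idx = torch.randint(0, len(X), (size,))
    xb, yb = X[idx], y[idx]
    return 2 * xb.T @ (xb @ w - yb) / size

for size in (8, 256):
    grads = torch.stack([batch_gradient(size) for _ in range(100)])
    spread = grads.std(dim=0).mean().item()
    print(f"batch={size:4d}  mean per-dim gradient std: {spread:.4f}")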

Hardware and Memory Implications

Practitioners must often select a batch size based on hardware limitations rather than purely theoretical preference. Deep learning models, especially large architectures like transformers or advanced convolutional networks, are stored in the VRAM of a GPU.

When utilizing NVIDIA CUDA for acceleration, the VRAM must hold the model parameters, the batch of input data, and the intermediate activation outputs needed for gradient calculation. If the batch size exceeds the available memory, the training will crash with an "Out of Memory" (OOM) error. Techniques like mixed precision training are often employed to reduce memory usage, allowing for larger batch sizes on the same hardware.
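
A minimal sketch of mixed precision with PyTorch's torch.cuda.amp is shown below; it assumes a CUDA-capable GPU, and the model and tensor shapes are illustrative only. Storing activations in half precision roughly halves their memory footprint, which is what frees room for a larger batch.

import torch
from torch import nn

device = "cuda"  # assumes an NVIDIA GPU is available
model = nn.Linear(512, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()

xb = torch.randn(256, 512, device=device)   # a larger batch fits in FP16
yb = torch.randint(0, 10, (256,), device=device)

optimizer.zero_grad()
with torch.cuda.amp.autocast():              # activations kept in half precision
    loss = nn.functional.cross_entropy(model(xb), yb)
scaler.scale(loss).backward()                # scale the loss to avoid FP16 underflow
scaler.step(optimizer)
scaler.update()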

Distinguishing Related Concepts

To configure training effectively, it is essential to distinguish batch size from other temporal terms in the training loop.

  • Batch Size vs. Epoch: An epoch represents one complete pass through the entire training dataset. The batch size determines how many chunks the data is split into within that epoch. For example, if you have 1,000 images and a batch size of 100, it will take 10 iterations to complete one epoch.
  • Batch Size vs. Iteration: An iteration (or step) is the act of processing one batch and updating the weights. The total number of iterations in training is the number of batches per epoch multiplied by the total number of epochs, as the arithmetic sketch after this list shows.
  • Batch Size vs. Batch Normalization: While they share a name, Batch Normalization is a specific layer type that normalizes layer inputs based on the mean and variance of the current batch. This technique relies heavily on the batch size; if the batch size is too small (e.g., 2), the statistical estimates become unreliable, potentially degrading performance.
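
The relationship between these quantities is plain arithmetic. The sketch below reuses the hypothetical numbers from the epoch example (1,000 samples, batch size 100) and adds an assumed 5 epochs:

import math

# Hypothetical values from the example above.
num_samples, batch_size, epochs = 1000, 100, 5

iterations_per_epoch = math.ceil(num_samples / batch_size)  # 1000 / 100 = 10
total_iterations = iterations_per_epoch * epochs            # 10 * 5 = 50
print(iterations_per_epoch, total_iterations)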

Real-World Applications

Adjusting the batch size is a routine necessity when deploying computer vision solutions across various industries.

  1. High-Fidelity Medical Imaging: In the field of AI in healthcare, practitioners often work with 3D volumetric data such as MRI or CT scans. These files are incredibly dense and memory-intensive. To perform tasks like medical image analysis or complex image segmentation without crashing the system, engineers often reduce the batch size to a very small number, sometimes even a batch of 1. Here, the priority is processing high-resolution detail rather than raw training speed.
  2. Industrial Quality Control: Conversely, in AI in manufacturing, speed is paramount. Automated systems inspecting products on a conveyor belt need to process thousands of images per hour. During inference, engineers might aggregate incoming camera feeds into larger batches to maximize the utilization of edge AI devices, ensuring high throughput for real-time defect detection; a minimal batching sketch follows this list.
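
The following sketch shows the batching pattern itself: stacking individual frames into one tensor so a single forward pass covers all of them. The resnet18 classifier and random "camera frames" are placeholders; a production line would use its own trained defect-detection model and real preprocessed images.

import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()              # placeholder, untrained model
frames = [torch.rand(3, 224, 224) for _ in range(32)]  # synthetic camera frames

batch = torch.stack(frames)                        # shape: (32, 3, 224, 224)
with torch.inference_mode():
    outputs = model(batch)                         # one forward pass for 32 frames
print(outputs.shape)                               # torch.Size([32, 1000])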

Configuring Batch Size in Python

When using the Ultralytics Python package, setting the batch size is straightforward. You can specify a fixed integer or use the dynamic batch=-1 setting, which invokes the AutoBatch feature to automatically compute the largest batch size your hardware can safely handle.

The following example demonstrates how to train a YOLO26 model, the latest standard in speed and accuracy, using a specific batch setting.

from ultralytics import YOLO

# Load the YOLO26n model (nano version for speed)
model = YOLO("yolo26n.pt")

# Train on the COCO8 dataset
# batch=16 is manually set.
# Alternatively, use batch=-1 for auto-tuning based on available GPU memory.
results = model.train(data="coco8.yaml", epochs=5, batch=16)

For managing large-scale experiments and visualizing how different batch sizes affect your training metrics, tools like the Ultralytics Platform provide a comprehensive environment for logging and comparing runs. Proper hyperparameter tuning of the batch size is often the final step in squeezing the best performance out of your model.
