Discover the impact of batch size in deep learning, and learn how to efficiently optimize training speed, memory usage, and model performance.
In the realm of machine learning, and particularly deep learning, Batch Size refers to the number of training examples utilized in one iteration of model training. Rather than feeding the entire training set into the neural network at once (which is often computationally impossible due to memory constraints), the dataset is divided into smaller subsets called batches. The model processes one batch, calculates the error, and updates its weights via backpropagation before moving on to the next batch. This hyperparameter plays a pivotal role in determining both the speed of training and the stability of the learning process.
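To make this loop concrete, here is a minimal PyTorch sketch (toy data and model, purely illustrative) showing how a DataLoader splits the dataset into batches and how the weights are updated once per batch:
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
# Toy dataset: 1,000 samples with 10 features each
dataset = TensorDataset(torch.randn(1000, 10), torch.randn(1000, 1))
# batch_size controls how many samples are processed per weight update
loader = DataLoader(dataset, batch_size=32, shuffle=True)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
for inputs, targets in loader:  # one iteration per batch
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()   # gradients computed from this batch only
    optimizer.step()  # weights updated before the next batch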
The choice of batch size fundamentally alters how the optimization algorithm, typically a variant of stochastic gradient descent, navigates the loss landscape. Smaller batches produce noisier gradient estimates, which can help the optimizer escape sharp local minima but make individual updates less stable, while larger batches yield smoother, more representative gradients at the cost of more memory per step.
Practitioners must often select a batch size based on hardware limitations rather than purely theoretical preference. Deep learning models, especially large architectures like transformers or deep convolutional networks, must reside in the VRAM of a GPU during training.
When utilizing NVIDIA CUDA for acceleration, the VRAM must hold the model parameters, the batch of input data, and the intermediate activation outputs needed for gradient calculation. If the batch size exceeds the available memory, the training will crash with an "Out of Memory" (OOM) error. Techniques like mixed precision training are often employed to reduce memory usage, allowing for larger batch sizes on the same hardware.
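As a minimal sketch of mixed precision training in PyTorch (toy model and data, a CUDA device is assumed; the torch.cuda.amp API shown here is one common way to enable it), halving activation precision is what frees memory for larger batches:
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
device = "cuda"
dataset = TensorDataset(torch.randn(1000, 10), torch.randn(1000, 1))
loader = DataLoader(dataset, batch_size=64, shuffle=True)
model = nn.Linear(10, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid FP16 underflow
for inputs, targets in loader:
    inputs, targets = inputs.to(device), targets.to(device)
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # run the forward pass in FP16 where safe
        loss = loss_fn(model(inputs), targets)
    scaler.scale(loss).backward()  # backprop through the scaled loss
    scaler.step(optimizer)         # unscales gradients, then updates weights
    scaler.update()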
To configure training effectively, it is essential to distinguish batch size from related terms in the training loop: an iteration is a single weight update computed from one batch, while an epoch is one complete pass over the entire training set. For example, a dataset of 1,000 samples trained with a batch size of 100 requires 10 iterations per epoch.
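When the batch size does not divide the dataset evenly, the final batch is simply smaller; a quick arithmetic check with illustrative numbers:
import math
dataset_size = 1000  # illustrative, not tied to any real dataset
batch_size = 32
# One iteration processes one batch; one epoch processes every batch once
iterations_per_epoch = math.ceil(dataset_size / batch_size)
print(iterations_per_epoch)  # 32 iterations, with only 8 samples in the last batch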
Adjusting the batch size is a routine necessity when deploying computer vision solutions across various industries.
When using the Ultralytics Python package, setting the batch size is straightforward. You can specify a fixed integer or use the dynamic batch=-1 setting, which leverages the auto-batch feature to automatically calculate the largest batch size your hardware can safely handle. The following example shows how to train a YOLO26 model, the current standard for speed and accuracy, using a specific batch setting.
from ultralytics import YOLO
# Load the YOLO26n model (nano version for speed)
model = YOLO("yolo26n.pt")
# Train on the COCO8 dataset
# batch=16 is manually set.
# Alternatively, use batch=-1 for auto-tuning based on available GPU memory.
results = model.train(data="coco8.yaml", epochs=5, batch=16)
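To let the library choose instead, pass batch=-1 and the auto-batch feature estimates the largest batch size that fits in available GPU memory before training begins:
# Same training call, but with auto-batch selecting the batch size
results = model.train(data="coco8.yaml", epochs=5, batch=-1)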
For managing large-scale experiments and visualizing how different batch sizes affect your training metrics, tools like the Ultralytics Platform provide a comprehensive environment for logging and comparing runs. Proper hyperparameter tuning of the batch size is often the final step in squeezing the best performance out of your model.
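As a minimal sketch of such a comparison (assuming the standard Ultralytics train arguments; the name argument tags each run's output directory so logged metrics can be compared side by side):
from ultralytics import YOLO
# Sweep a few batch sizes; each run is saved under its own name for comparison
for bs in (8, 16, 32):
    model = YOLO("yolo26n.pt")
    model.train(data="coco8.yaml", epochs=5, batch=bs, name=f"batch_{bs}")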