Explore how [label smoothing](https://www.ultralytics.com/glossary/label-smoothing) prevents overfitting and improves model generalization. Learn to implement this technique with [YOLO26](https://docs.ultralytics.com/models/yolo26/) for better accuracy.
Label smoothing is a regularization technique widely used in machine learning to improve model generalization and prevent overfitting. When training neural networks, the goal is typically to minimize the error between predictions and ground truth. However, if a model becomes too confident in its predictions—assigning near 100% probability to a single class—it often begins to memorize the specific noise in the training data rather than learning robust patterns. This phenomenon, known as overfitting, degrades performance on new, unseen examples. Label smoothing addresses this by discouraging the model from predicting with absolute certainty, essentially telling the network that there is always a small margin for error.
To understand how label smoothing operates, it helps to contrast it with standard "hard" targets. In traditional supervised learning, classification labels are usually represented via one-hot encoding. For instance, in a task distinguishing between cats and dogs, a "dog" image would have a target vector of [0, 1]. To match this perfectly, the model pushes its internal scores, known as logits, toward infinity, which can lead to unstable gradients and an inability to adapt.
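As a concrete illustration, the sketch below builds a hard one-hot target for the cat/dog example (the class names and index are illustrative, not from any particular dataset):

```python
import numpy as np

# Hypothetical two-class setup: index 0 = "cat", index 1 = "dog"
classes = ["cat", "dog"]
label = 1  # a "dog" image

# One-hot ("hard") target: all probability mass on the true class
hard_target = np.eye(len(classes))[label]
print(hard_target)  # [0. 1.]
```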
Label smoothing replaces these rigid 1s and 0s with "soft" targets. Instead of a target probability of 1.0, the correct class might be assigned 0.9, while the remaining probability mass (0.1) is distributed uniformly across the incorrect classes. This subtle shift modifies the objective of the loss function, such as cross-entropy, preventing the activation function (usually Softmax) from saturating. The result is a model that learns tighter clusters of classes in the feature space and produces better model calibration, meaning the predicted probabilities more accurately reflect the true likelihood of correctness.
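To make the arithmetic concrete, here is a minimal sketch of one common smoothing variant, in which the true class keeps 1 - ε and the incorrect classes share ε uniformly (other implementations instead spread ε over all classes, including the true one):

```python
import numpy as np


def smooth_labels(label: int, num_classes: int, epsilon: float = 0.1) -> np.ndarray:
    """Soft target: true class gets 1 - epsilon, incorrect classes share epsilon uniformly."""
    target = np.full(num_classes, epsilon / (num_classes - 1))
    target[label] = 1.0 - epsilon
    return target


print(smooth_labels(label=1, num_classes=2))  # [0.1 0.9]
```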
This technique is particularly critical in domains where data ambiguity is inherent or datasets are prone to labeling errors.
Modern deep learning frameworks simplify the application of this technique. Using the ultralytics package, you can easily integrate label smoothing into your training pipeline for image classification or detection tasks. This is often done to squeeze extra performance out of state-of-the-art models like YOLO26.
The following example demonstrates how to train a classification model with label smoothing enabled:
```python
from ultralytics import YOLO

# Load a pre-trained YOLO26 classification model
model = YOLO("yolo26n-cls.pt")

# Train with label_smoothing set to 0.1
# The target for the correct class becomes 1.0 - 0.5 * 0.1 = 0.95 (depending on implementation specifics)
model.train(data="mnist", epochs=5, label_smoothing=0.1)
```
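The 0.95 value in the comment above follows the BCE-style smoothing convention used in some YOLO implementations, where the positive and negative targets are each offset by half of ε rather than the full amount; a minimal sketch of that convention:

```python
def smooth_bce(epsilon: float = 0.1) -> tuple[float, float]:
    """BCE-style label smoothing: return (positive, negative) target values."""
    return 1.0 - 0.5 * epsilon, 0.5 * epsilon


pos, neg = smooth_bce(0.1)
print(pos, neg)  # 0.95 0.05
```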
It is helpful to distinguish label smoothing from other regularization strategies to understand when to use it. Unlike dropout or weight decay, which act on the network's activations and parameters, label smoothing regularizes the targets themselves, so the techniques are complementary rather than interchangeable, as the sketch below illustrates.
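Because label smoothing operates on a different axis than weight-based regularizers, the two can be combined in a single training run. The sketch below assumes the standard ultralytics training arguments (weight_decay is the usual hyperparameter name, but verify it against your installed version):

```python
from ultralytics import YOLO

model = YOLO("yolo26n-cls.pt")

# Combine target-level regularization (label smoothing)
# with parameter-level regularization (weight decay)
model.train(
    data="mnist",
    epochs=5,
    label_smoothing=0.1,  # softens the training targets
    weight_decay=0.0005,  # penalizes large weights
)
```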
By mitigating gradient saturation in the final layers and encouraging the model to learn more robust features, label smoothing remains a staple of modern deep learning training pipelines.