Learn how to identify, prevent, and fix underfitting in machine learning models with expert tips, strategies, and real-world examples.
Underfitting occurs when a machine learning model is too simple or lacks the capacity to capture the underlying trends and patterns within the training data. Conceptually, it is analogous to trying to fit a straight line through data points that form a distinct curve; the model fails to grasp the complexity of the relationship between inputs and outputs. Because the model has not learned the data effectively, it exhibits poor performance not only on the training set but also on unseen validation data, leading to low predictive accuracy. This phenomenon is often a result of high bias in AI, where the algorithm makes overly simplistic assumptions about the target function.
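The straight-line analogy above can be made concrete with a minimal sketch in pure Python: fitting an ordinary least-squares line to data that follows a clear curve (y = x²) leaves a large training error no matter how long you fit, because the model simply lacks the capacity to express the relationship. The helper names and data here are illustrative, not from any particular library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def mse(xs, ys, a, b):
    """Mean squared error of the line y = a*x + b on (xs, ys)."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Curved data: y = x^2 on [-2, 2]. A straight line cannot capture this.
xs = [x / 10 for x in range(-20, 21)]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
train_error = mse(xs, ys, a, b)
print(f"slope={a:.3f}, intercept={b:.3f}, training MSE={train_error:.3f}")
```

Because the data are symmetric, the best line is flat (slope ≈ 0), and the training error stays high: the model underfits by construction, which is exactly the failure mode described above.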
Several factors can lead to an underfitted model. The most common cause is using a model architecture that is not complex enough for the task at hand, such as applying linear regression to non-linear data. Insufficient training duration, where the model is not given enough epochs to converge, also prevents adequate learning. Furthermore, excessive regularization—a technique typically used to prevent the opposite problem, overfitting—can overly constrain the model, stopping it from capturing important features.
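The regularization effect can be sketched with a one-dimensional ridge regression in pure Python: as the (illustrative) penalty strength `lam` grows, the learned slope is pushed toward zero and the training error climbs, even though the data follow a perfect linear relationship.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form slope of y = a*x with an L2 penalty lam * a^2 (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def train_mse(xs, ys, a):
    """Mean squared training error of y = a*x."""
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Perfectly linear data: y = 3x. Only over-regularization can make this fail.
xs = list(range(1, 11))
ys = [3 * x for x in xs]

for lam in (0.0, 10.0, 10000.0):
    a = ridge_slope(xs, ys, lam)
    print(f"lambda={lam:>7}: slope={a:.3f}, training MSE={train_mse(xs, ys, a):.3f}")
```

With no penalty the model recovers the true slope of 3 exactly; with an extreme penalty the slope collapses toward zero and the model underfits data it could otherwise fit perfectly.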
Engineers can identify underfitting by monitoring loss functions during training. If both the training error and the validation error remain high and do not decrease significantly, the model is likely underfitting. Providing too few input features can also starve the model of necessary information; effective feature engineering, by contrast, supplies the signals a model needs to understand the data.
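That monitoring rule can be captured in a small, hypothetical helper: given per-epoch loss histories, it flags underfitting when both curves stay high and barely move, and overfitting when training loss is low but validation loss is high. The function name and thresholds are illustrative assumptions, not universal rules.

```python
def diagnose(train_losses, val_losses, high=1.0, flat=0.05):
    """Return a rough fit diagnosis from per-epoch loss histories.

    `high` and `flat` are illustrative thresholds: a loss above `high`
    counts as poor, and a total drop below `flat` counts as stagnant.
    """
    t_final, v_final = train_losses[-1], val_losses[-1]
    t_drop = train_losses[0] - t_final
    if t_final > high and v_final > high and t_drop < flat:
        return "underfitting: both losses high and barely decreasing"
    if t_final < high and v_final > high:
        return "overfitting: low training loss but high validation loss"
    return "no obvious fit problem"


# Both losses stuck high -> underfitting.
print(diagnose([2.10, 2.08, 2.07, 2.06], [2.20, 2.15, 2.14, 2.13]))
# Training loss falls but validation loss climbs -> overfitting.
print(diagnose([2.0, 1.0, 0.4, 0.2], [2.0, 1.3, 1.4, 1.6]))
```

In practice the same check is done visually on training curves, but encoding it makes the distinction between the two failure modes explicit.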
It is crucial to distinguish underfitting from its counterpart, overfitting. These two concepts represent the opposite ends of the bias-variance tradeoff.
Finding the "sweet spot" between these two extremes is the primary goal of model optimization.
Understanding underfitting is vital for developing reliable AI systems across various industries.
In computer vision, underfitting often happens when using a model variant that is too small for the difficulty of the task (e.g., detecting small objects in high-resolution drone imagery). The following Python example demonstrates how to switch from a smaller model to a larger, more capable one using the ultralytics library to resolve potential underfitting.
from ultralytics import YOLO
# If 'yolo26n.pt' (Nano) is underfitting and yielding low accuracy,
# upgrade to a model with higher capacity like 'yolo26l.pt' (Large).
model = YOLO("yolo26l.pt")
# Train the larger model.
# Increasing epochs also helps the model converge if it was previously underfitting.
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
By moving to a larger Ultralytics YOLO26 model and ensuring adequate training duration, the system gains the parameters necessary to learn complex patterns, effectively mitigating underfitting. To verify your model is no longer underfitting, always evaluate it against a robust test dataset. For managing datasets and tracking experiments to spot underfitting early, the Ultralytics Platform offers comprehensive tools for visualization and analysis.