Glossary

Confidence

Define AI confidence scores. Learn how models gauge prediction certainty, set thresholds for reliability, and distinguish confidence from accuracy.

In machine learning, the confidence score is a numerical value assigned to an individual prediction, indicating the model's certainty that the prediction is correct. Expressed as a percentage or a probability value between 0 and 1, it quantifies the model's "belief" in its own output for a single instance. For example, in an object detection task, a model like Ultralytics YOLO11 might identify a car in an image and assign a confidence score of 0.95 (or 95%), suggesting it is very sure about its finding. This score is a critical output that helps users filter, prioritize, and interpret the model's results in real-world scenarios.
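
As a rough illustration of how these scores surface in practice, the snippet below runs a pretrained YOLO11 model and prints the confidence attached to each detection. It is a minimal sketch that assumes the ultralytics Python package is installed and that the yolo11n.pt checkpoint and a local bus.jpg image are available; attribute names follow the package's current Results API and may differ between versions.

```python
from ultralytics import YOLO

# Load a pretrained YOLO11 detection model (checkpoint name is an assumption).
model = YOLO("yolo11n.pt")

# Run inference; every detection carries its own confidence score.
results = model("bus.jpg")

for box in results[0].boxes:
    class_name = model.names[int(box.cls)]
    confidence = float(box.conf)  # value between 0 and 1
    print(f"{class_name}: {confidence:.2f}")
```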

The confidence score is typically derived from the output of the final layer of a neural network (NN), often a softmax or sigmoid function. This value is instrumental in practical applications, where a confidence threshold is set to discard predictions that fall below a certain level of certainty. By adjusting this threshold, developers can balance the trade-off between capturing all relevant detections and minimizing false positives, a key consideration in model deployment.
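
To make the softmax-and-threshold idea concrete, here is a small sketch that converts hypothetical classifier logits into probabilities, takes the top probability as the confidence score, and discards the prediction if it falls below a cutoff. The logit values, class names, and 0.5 threshold are illustrative assumptions, not outputs of any specific model.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert raw logits into probabilities that sum to 1."""
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exp / exp.sum()

# Hypothetical raw outputs from a classifier's final layer for one image.
logits = np.array([2.1, 0.3, -1.0])
classes = ["car", "truck", "bicycle"]

probs = softmax(logits)
confidence = probs.max()                 # the model's confidence in its top class
prediction = classes[int(probs.argmax())]

CONF_THRESHOLD = 0.5  # illustrative cutoff; tuned per application in practice
if confidence >= CONF_THRESHOLD:
    print(f"Keep prediction: {prediction} ({confidence:.2f})")
else:
    print(f"Discard prediction: confidence {confidence:.2f} is below the threshold")
```

Raising the threshold trades missed detections for fewer false positives; lowering it does the opposite, which is exactly the balance described above.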

Real-World Applications

Confidence scores are essential for making AI systems more reliable and actionable. They allow systems to gauge uncertainty and trigger different responses accordingly.

  • Autonomous Vehicles: In self-driving cars, confidence scores are vital for safety. An object detector might identify a pedestrian with 98% confidence, a clear signal for the vehicle to slow down or stop. Conversely, if it detects an object with only 30% confidence, the system might flag it as uncertain and use other sensors to verify its nature before taking action. This helps prevent accidents by focusing on high-certainty threats. For more details on this topic, you can read about the role of AI in self-driving cars.
  • Medical Image Analysis: When an AI model analyzes medical scans for signs of disease, such as detecting tumors in medical imaging, the confidence score is invaluable. A detection with 99% confidence can be immediately flagged for a radiologist's review. A finding with 60% confidence might be marked as "ambiguous" or "needs further review," ensuring that uncertain cases receive human scrutiny without overwhelming experts with false alarms. The FDA provides guidance on AI/ML in medical devices.

Confidence vs. Other Metrics

It's important not to confuse the confidence score of an individual prediction with overall model evaluation metrics. While related, they measure different aspects of performance:

  • Accuracy: Measures the overall percentage of correct predictions across the entire dataset. It provides a general sense of model performance but doesn't reflect the certainty of individual predictions. A model can have high accuracy but still make some predictions with low confidence.
  • Precision: Indicates the proportion of positive predictions that were actually correct. High precision means fewer false alarms. Confidence reflects the model's belief in its prediction, which might or might not align with correctness.
  • Recall (Sensitivity): Measures the proportion of actual positive instances that the model correctly identified. High recall means fewer missed detections. Confidence doesn't directly relate to how many actual positives were found.
  • F1-Score: The harmonic mean of Precision and Recall, providing a single metric that balances both. Confidence remains a prediction-level score, not an aggregate measure of model performance.
  • Mean Average Precision (mAP): A common metric in object detection that summarizes the precision-recall curve across different confidence thresholds and classes. While mAP calculation involves confidence thresholds, the confidence score itself applies to each individual detection.
  • Calibration: Refers to how well the confidence scores align with the actual probability of correctness. A well-calibrated model's predictions with 80% confidence should be correct about 80% of the time. Confidence scores from modern neural networks are not always inherently well-calibrated, as discussed in research on model calibration; the sketch after this list gives a rough numerical feel for the contrast between per-prediction confidence and aggregate metrics.
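
The following sketch computes accuracy, precision, recall, and F1 from a small set of hypothetical binary predictions, then compares mean confidence against observed accuracy as a crude calibration check. All labels and confidence values are made up purely for illustration.

```python
import numpy as np

# Hypothetical ground-truth labels, predicted labels, and per-prediction confidences.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
conf = np.array([0.92, 0.85, 0.77, 0.55, 0.90, 0.81, 0.60, 0.70])

# Aggregate metrics summarize performance over the whole set of predictions.
tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

accuracy = np.mean(y_pred == y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Crude calibration check: a well-calibrated model's mean confidence
# should roughly match the fraction of predictions that are correct.
print(f"mean confidence={conf.mean():.2f} vs. observed accuracy={accuracy:.2f}")
```

Note that each value in conf belongs to a single prediction, while accuracy, precision, recall, and F1 describe the model as a whole; that is the distinction the list above draws.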

In summary, confidence is a valuable output for assessing the certainty of individual AI predictions, enabling better filtering, prioritization, and decision-making in real-world applications. It complements, but is distinct from, metrics that evaluate the overall performance of a model, such as those you can track and analyze using tools like Ultralytics HUB.
