Recall

Learn what Recall is in machine learning, why it matters, and how it ensures AI models capture critical positive instances effectively.

Recall, also known as sensitivity or the true positive rate, is a fundamental performance metric in machine learning that measures the ability of a model to identify all relevant instances within a dataset. In the context of object detection or classification, it answers the question: "Out of all the actual positive cases, how many did the model correctly find?" Achieving high recall is critical in scenarios where missing a positive instance, known as a false negative, carries significant consequences. Unlike accuracy, which can be misleading on imbalanced data, recall provides a focused view of the model's effectiveness at "capturing" the target class.
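Formally, recall is the number of true positives (TP) divided by the sum of true positives and false negatives (FN): Recall = TP / (TP + FN). The following minimal Python sketch illustrates the calculation with invented binary labels; it is not part of any library API:

# Recall = TP / (TP + FN), computed from hypothetical binary labels
y_true = [1, 1, 1, 0, 1, 0, 1]  # ground truth (1 = positive class)
y_pred = [1, 0, 1, 0, 1, 1, 0]  # model predictions

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

recall = tp / (tp + fn)
print(f"Recall: {recall:.2f}")  # finds 3 of 5 actual positives -> 0.60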

The Importance of High Recall

In many artificial intelligence applications, the cost of failing to detect an object is far higher than the cost of a false alarm. A model optimized for recall minimizes false negatives, ensuring that the system casts a wide enough net to catch potential threats, anomalies, or critical conditions. This often involves a trade-off, as increasing recall can sometimes lead to a lower precision score, meaning the model might flag more non-relevant items as positive. Understanding this balance is key to developing robust machine learning solutions.

Real-World Applications

Recall is the driving metric behind many safety-critical AI solutions. Here are two prominent examples where sensitivity takes precedence:

  • Medical Diagnostics: In medical image analysis, such as screening X-rays for early signs of disease, high recall is non-negotiable. If an AI system in healthcare is used to detect tumors, it is far better for the system to flag a suspicious shadow that turns out to be benign (a false positive) than to miss a malignant tumor entirely. Doctors rely on these tools to act as a safety net, ensuring no potential health risks are overlooked.
  • Security and Surveillance: For a security alarm system, the primary goal is to detect every intrusion attempt. A system optimized for high recall ensures that if a person enters a restricted zone, the alarm triggers. While this might lead to occasional false alarms caused by wildlife, this is preferable to the system failing to detect an actual intruder. Object detection models in these scenarios are tuned to ensure maximum sensitivity to potential threats.

Recall vs. Precision

It is essential to distinguish recall from its counterpart, precision. While recall measures the quantity of relevant cases found (completeness), precision measures the quality of the positive predictions (exactness).

  • Recall: Focuses on avoiding missed detections. "Did we find all the apples?"
  • Precision: Focuses on minimizing false alarms. "Are all the things we called apples actually apples?"

These two metrics often share an inverse relationship, visualized through a Precision-Recall curve. To evaluate the overall balance between them, developers often look at the F1-score, which is the harmonic mean of both. In imbalanced datasets, looking at recall alongside the confusion matrix gives a much clearer picture of performance than accuracy alone.
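To make the relationship concrete, the short sketch below computes precision, recall, and the F1-score from hypothetical confusion-matrix counts; the numbers are invented for illustration:

# Hypothetical confusion-matrix counts for a single "apple" class
tp = 80  # apples correctly detected
fp = 20  # non-apples mistakenly flagged as apples
fn = 10  # apples the model missed

precision = tp / (tp + fp)                          # exactness: 0.80
recall = tp / (tp + fn)                             # completeness: ~0.89
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean: ~0.84

print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1:.2f}")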

Measuring Recall with Ultralytics YOLO

When training models like the cutting-edge YOLO26, recall is automatically computed during the validation phase. The framework reports per-class recall alongside the mean Average Precision (mAP), helping developers gauge how well the model finds objects.

You can easily validate a trained model and view its recall metrics using Python. This snippet demonstrates how to load a model and check its performance on a standard dataset:

from ultralytics import YOLO

# Load a pretrained YOLO26 model
model = YOLO("yolo26n.pt")

# Validate the model on the COCO8 dataset
# The results object contains metrics like Precision, Recall, and mAP
metrics = model.val(data="coco8.yaml")

# Access and print the mean recall score for box detection
print(f"Mean Recall: {metrics.results_dict['metrics/recall(B)']:.4f}")

This code utilizes the Ultralytics API to run validation. If the recall is lower than required for your project, you might consider techniques like data augmentation to create more varied training examples or hyperparameter tuning to adjust the model's sensitivity. Using the Ultralytics Platform can also streamline the process of managing datasets and tracking these metrics over multiple training runs.
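If augmentation is the route you choose, a minimal sketch using the Ultralytics training API might look like the following; the mosaic and mixup values are illustrative assumptions, not tuned recommendations:

from ultralytics import YOLO

# Load the same pretrained model used in the validation example
model = YOLO("yolo26n.pt")

# Train with stronger augmentation to expose the model to more varied
# examples; these hyperparameter values are illustrative only
model.train(
    data="coco8.yaml",
    epochs=50,
    mosaic=1.0,  # compose four images into one for varied context
    mixup=0.1,   # blend pairs of images and labels
)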

Improving Model Recall

To boost a model's recall, data scientists often adjust the confidence threshold used during inference. Lowering the threshold makes the model more "optimistic," accepting more predictions as positive, which increases recall but may decrease precision. Additionally, collecting more diverse training data helps the model learn to recognize difficult and partially obscured positive instances. For complex tasks, employing advanced architectures like Transformer blocks or exploring ensemble methods can also improve the system's ability to detect subtle features that simpler models might miss.
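As a sketch of the threshold adjustment described above, the Ultralytics predict call accepts a conf argument; the 0.10 value and the image path here are illustrative assumptions:

from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# Lowering the confidence threshold from its default keeps more
# low-confidence predictions, typically raising recall at the cost
# of precision; 0.10 is an arbitrary example value
results = model.predict("image.jpg", conf=0.10)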
