Recall

Learn what Recall is in machine learning, why it matters, and how it ensures AI models capture critical positive instances effectively.

Recall, also known as sensitivity or the true positive rate, is a fundamental evaluation metric used to measure the ability of a machine learning (ML) model to identify all relevant instances within a dataset. In essence, recall answers the specific question: "Out of all the actual positive cases, how many did the model successfully detect?" This metric is particularly focused on minimizing false negatives, ensuring that critical events or objects are not overlooked. While accuracy provides a general overview of performance, recall becomes the primary indicator of success in scenarios where missing a target carries a higher cost than a false alarm.
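
Formally, recall is computed from confusion-matrix counts as Recall = TP / (TP + FN), where TP is the number of true positives and FN the number of false negatives. The toy Python snippet below, using made-up counts purely for illustration, shows the calculation:

# Recall from confusion-matrix counts (illustrative numbers only)
true_positives = 90   # actual positives the model detected
false_negatives = 10  # actual positives the model missed

recall = true_positives / (true_positives + false_negatives)
print(f"Recall: {recall:.2f}")  # 0.90: the model found 90% of all positives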

The Importance of Recall in AI

In many computer vision (CV) and data analysis tasks, the cost of errors is not uniform. Failing to detect a positive case (a Type II error) can sometimes be dangerous or expensive. High recall ensures that the system casts a wide net to capture as many true positives as possible. This is often achieved by adjusting the confidence threshold during inference; lowering the threshold generally increases recall but may result in more false positives.

Engineers often analyze the precision-recall curve to understand the trade-offs inherent in their models. A model with 100% recall has found every single target object, though it might have also incorrectly labeled some background noise as targets.
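
To make the threshold trade-off concrete, the sketch below uses scikit-learn's precision_recall_curve on invented labels and confidence scores (both are assumptions for demonstration):

from sklearn.metrics import precision_recall_curve

# Hypothetical ground-truth labels (1 = positive) and model confidence scores
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_scores = [0.95, 0.80, 0.75, 0.60, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10]

# Precision and recall at each candidate confidence threshold
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")

The lowest thresholds at the top of the printout accept nearly every detection, so recall is highest there while precision generally suffers; raising the threshold reverses the balance.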

Real-World Applications

Recall is the driving metric behind many safety-critical AI solutions. Here are two prominent examples where recall takes precedence:

  • Medical Diagnostics: In the field of medical image analysis, such as screening for diseases via X-rays or MRIs, high recall is non-negotiable. If an AI model for tumor detection is analyzing scans, it is far better for the system to flag a suspicious shadow that turns out to be benign (a false positive) than to miss a malignant tumor entirely (a false negative). Clinicians rely on these AI-in-healthcare tools as a safety net that ensures no potential health risk is ignored.
  • Security and Surveillance: For a security alarm system, the primary goal is to detect every intrusion attempt. A system optimized for high recall ensures that if a person enters a restricted zone, the alarm triggers. While this might lead to occasional false alarms caused by animals or shadows, this is preferable to the system failing to detect an actual intruder. Object detection models in these scenarios are tuned to ensure maximum sensitivity to potential threats.

Recall vs. Precision and Accuracy

Understanding the difference between recall and related metrics is crucial for interpreting model evaluation insights.

  • Recall vs. Precision: While recall measures the quantity of true positives found, precision measures the quality or reliability of those positive predictions. Precision asks, "Of all the items labeled as positive, how many were actually positive?" There is often a trade-off; increasing recall by accepting lower-confidence detections usually lowers precision. The F1-score is a metric that combines both to provide a balanced view.
  • Recall vs. Accuracy: Accuracy measures the overall percentage of correct predictions (both positive and negative). However, on imbalanced datasets, such as a manufacturing line where 99% of parts are good and only 1% are defective, a model could simply predict "good" every time and achieve 99% accuracy while having 0% recall for defects. In such anomaly detection tasks, recall is a much more honest metric than accuracy, as the sketch after this list demonstrates.
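
A minimal sketch of this accuracy trap, assuming invented labels and scikit-learn's standard metric functions:

from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced line: 99 good parts (0) and 1 defective part (1)
y_true = [0] * 99 + [1]

# A trivial "model" that predicts "good" for every part
y_pred = [0] * 100

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")       # 0.99
print(f"Defect recall: {recall_score(y_true, y_pred):.2f}")    # 0.00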

Measuring Recall with Ultralytics YOLO

When you develop models with the Ultralytics YOLO11 architecture, recall is calculated automatically during validation. The framework reports per-class recall alongside precision and the mean Average Precision (mAP), helping you gauge how well the model finds objects.

You can easily validate a trained model and view its recall metrics using Python:

from ultralytics import YOLO

# Load a pretrained YOLO11 model
model = YOLO("yolo11n.pt")

# Validate the model on a standard dataset like COCO8
# The results will include Precision (P), Recall (R), and mAP
metrics = model.val(data="coco8.yaml")

# Access the mean recall score from the results ("B" denotes box metrics)
print(f"Mean Recall: {metrics.results_dict['metrics/recall(B)']}")

This code snippet loads a YOLO11 model and runs validation on the COCO8 dataset. The output provides a comprehensive breakdown of performance, letting you assess whether the model meets the recall requirements of your specific application. If recall is too low, consider techniques such as data augmentation or hyperparameter tuning to improve sensitivity.
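
As one illustrative starting point, the sketch below retrains with a few of Ultralytics' standard augmentation hyperparameters (fliplr, mosaic, scale); the specific values shown are assumptions for demonstration, not tuned recommendations.

from ultralytics import YOLO

# Load the same pretrained model used above
model = YOLO("yolo11n.pt")

# Retrain with explicit augmentation settings; fliplr, mosaic, and scale
# are standard Ultralytics training arguments, and these example values
# are illustrative starting points only
model.train(data="coco8.yaml", epochs=50, fliplr=0.5, mosaic=1.0, scale=0.5)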
