
Confusion Matrix

Understand model performance through the confusion matrix. Explore key metrics, real-world applications, and tools for optimizing AI classification accuracy.

A confusion matrix is a performance measurement tool for machine learning classification problems where the output can be two or more classes. It is a table with four different combinations of predicted and actual values, serving as a foundational element for data visualization in model evaluation. Unlike simple accuracy, which can be misleading if the dataset is unbalanced, a confusion matrix provides a granular breakdown of where a computer vision (CV) model is making mistakes. By comparing the predictions against the ground truth labels, developers can determine if the system is confusing two specific classes or failing to detect an object entirely.

Core Components of the Matrix

The matrix itself is typically divided into four quadrants for binary classification, though it expands for multi-class problems like those handled by Ultralytics YOLO26. These four components represent the intersection of what the model predicted versus what actually exists in the image.

  • True Positives (TP): The model correctly predicts the positive class. For example, in an object detection task, the model successfully draws a bounding box around a person who is actually in the frame.
  • True Negatives (TN): The model correctly predicts the negative class. This is crucial in scenarios like anomaly detection, where the system correctly identifies that a manufactured part has no defects.
  • False Positives (FP): The model incorrectly predicts the positive class. Often called a "Type I error," this occurs when the system detects an object that isn't there, such as a security camera flagging a shadow as an intruder.
  • False Negatives (FN): The model incorrectly predicts the negative class. Known as a "Type II error," this happens when the model fails to detect an object that is present, essentially "missing" the target.
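
These counts are simple tallies over a validation set. As a minimal sketch, assuming a binary classifier and a handful of made-up labels purely for illustration, the four quadrants can be computed and arranged into a 2x2 matrix as follows:

# Minimal sketch: tally the four quadrants for a binary classifier.
# The labels below are invented for illustration (1 = positive, 0 = negative).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # ground truth
predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # model output

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

# One common layout: rows = actual class, columns = predicted class.
matrix = [[tp, fn],
          [fp, tn]]
print(matrix)  # [[3, 1], [1, 3]]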

Derived Metrics and Significance

The raw numbers in a confusion matrix are used to calculate more advanced metrics that describe model performance. Understanding these derivatives is essential for optimizing neural networks.

  • Precision: Calculated as TP / (TP + FP), this metric reveals how accurate the positive predictions are. High precision means fewer false alarms.
  • Recall (Sensitivity): Calculated as TP / (TP + FN), this measures the ability of the model to find all positive instances. High recall is vital when missing an object has severe consequences.
  • F1 Score: The harmonic mean of precision and recall. It provides a single score that balances the trade-off between the two, useful for comparing different YOLO26 models.
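
Each of these metrics follows directly from the four counts. Continuing the illustrative tallies from the sketch above (tp = 3, fp = 1, fn = 1), the arithmetic looks like this:

# Derived metrics from the illustrative counts above.
tp, fp, fn = 3, 1, 1

precision = tp / (tp + fp)                            # 3 / 4 = 0.75
recall = tp / (tp + fn)                               # 3 / 4 = 0.75
f1 = 2 * precision * recall / (precision + recall)    # harmonic mean = 0.75

print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1:.2f}")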

Real-World Applications

The relative cost of each error type, which the confusion matrix makes explicit, dictates how models are tuned for different industries.

In the field of AI in healthcare, the confusion matrix is a matter of safety. When training a model for medical image analysis to detect tumors, a False Negative (missing a tumor) is far worse than a False Positive (flagging a benign spot for doctor review). Therefore, engineers prioritize Recall over Precision in these matrices to ensure no potential health risks are overlooked.

Conversely, in manufacturing quality control, efficiency is key. If a system classifying assembly line parts generates too many False Positives (flagging good parts as defective), it causes unnecessary waste and slows down production. Here, the confusion matrix helps engineers tune the model to maximize Precision, ensuring that what is rejected is truly defective, streamlining automated machine learning workflows.

Generating a Confusion Matrix with YOLO26

When using modern frameworks, generating this matrix is often part of the standard validation pipeline. The example below demonstrates how to validate a YOLO26 model and access the confusion matrix data using the ultralytics package.

from ultralytics import YOLO

# Load a pre-trained YOLO26 model
model = YOLO("yolo26n.pt")

# Validate the model on the COCO8 dataset
# This automatically generates and plots the confusion matrix
metrics = model.val(data="coco8.yaml")

# Access the confusion matrix object directly
print(metrics.confusion_matrix.matrix)
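
Once the raw array is available, per-class precision and recall can be recovered from it with a few lines of NumPy. The sketch below uses an invented 2-class matrix and assumes rows hold predicted classes and columns hold ground-truth classes; check the orientation of your own matrix before relying on it, since conventions differ between tools.

import numpy as np

# Invented 2-class matrix purely for illustration.
# Assumed layout: rows = predicted class, columns = ground-truth class.
cm = np.array([[50, 5],
               [10, 35]], dtype=float)

tp = np.diag(cm)                  # correct predictions per class
precision = tp / cm.sum(axis=1)   # TP / all predictions of that class (row sums)
recall = tp / cm.sum(axis=0)      # TP / all ground-truth instances (column sums)

for i, (p, r) in enumerate(zip(precision, recall)):
    print(f"class {i}: precision={p:.2f}, recall={r:.2f}")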

Distinguishing Related Concepts

It is important to distinguish the confusion matrix from similar evaluation terms.

  • Vs. Accuracy: Accuracy is simply the ratio of correct predictions to total predictions. While useful, accuracy can be highly deceptive in imbalanced datasets. For instance, if 95% of emails are not spam, a model that predicts "not spam" for every email has 95% accuracy but is useless. The confusion matrix reveals this flaw by showing zero True Positives for the spam class.
  • Vs. ROC Curve: The confusion matrix provides a snapshot of performance at a single specific confidence threshold. In contrast, the Receiver Operating Characteristic (ROC) curve visualizes how the True Positive Rate and False Positive Rate change as that threshold is varied. Tools like the Ultralytics Platform allow users to explore both visualizations to choose the optimal operating point for their deployment.
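
To make the single-threshold point concrete, the sketch below sweeps a decision threshold over a handful of invented confidence scores. Each threshold yields its own confusion matrix and therefore one (FPR, TPR) point; the ROC curve is simply the trace of those points as the threshold moves from high to low.

# Invented scores and labels purely for illustration (1 = positive class).
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.25, 0.50, 0.75):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    tpr = tp / (tp + fn)  # true positive rate (recall)
    fpr = fp / (fp + tn)  # false positive rate
    print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")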
