One-Shot Learning

Discover how One-Shot Learning enables AI to recognize objects from a single example. Learn about Siamese networks, embeddings, and real-world apps using YOLO26.

One-Shot Learning is a specialized classification technique in machine learning (ML) designed to learn information about object categories from a single training example. Unlike traditional deep learning (DL) algorithms, which require massive datasets containing thousands of annotated images to generalize effectively, One-Shot Learning mimics the human cognitive ability to grasp a new concept instantly. For instance, a person can usually recognize a specific exotic bird after seeing it just once; this methodology attempts to replicate that efficiency in artificial intelligence (AI) systems. It is particularly valuable in scenarios where data labeling is expensive, data is scarce, or new categories must be added dynamically without retraining the entire model.

Mechanisms Behind the Concept

The core principle of One-Shot Learning involves shifting the objective from standard classification to similarity evaluation. Instead of training a neural network (NN) to output a specific class label (e.g., "dog" or "cat"), the model learns a distance function. A common architecture employed for this is the Siamese neural network, which consists of two identical sub-networks that share the same model weights.
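To make the weight sharing concrete, here is a minimal sketch in plain Python. It assumes a tiny hypothetical linear map (`SHARED_WEIGHTS`) in place of a trained sub-network, and it uses the contrastive loss, one common training objective for Siamese pairs that the text above does not specifically name. Both inputs pass through the same `embed()` function, which is what "two identical sub-networks" amounts to in practice.

```python
import math

# Hypothetical shared weights: a tiny 2x3 linear map standing in for a trained sub-network.
SHARED_WEIGHTS = [
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5],
]


def embed(x):
    """Map a 3-value input to a 2-value embedding using the shared weights."""
    return [sum(w * v for w, v in zip(row, x)) for row in SHARED_WEIGHTS]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def contrastive_loss(x1, x2, same_class, margin=1.0):
    """Contrastive loss for one pair; both inputs go through the same embed()."""
    d = euclidean(embed(x1), embed(x2))
    if same_class:
        # Same-class pairs are penalized for being far apart.
        return d ** 2
    # Different-class pairs are penalized only when closer than the margin.
    return max(0.0, margin - d) ** 2


identical_same = contrastive_loss([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], same_class=True)
identical_diff = contrastive_loss([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], same_class=False)
```

Minimizing this loss over many pairs pulls same-class embeddings together and pushes different-class embeddings at least `margin` apart, which is how the network comes to encode a useful distance function.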

During operation, the network performs feature extraction to convert input images into compact numerical vectors known as embeddings. The system then compares the embedding of a new query image against the embedding of the single reference "shot." The comparison typically uses Euclidean distance or cosine similarity: if the distance falls below a chosen threshold (or, equivalently, the cosine similarity rises above one), the images are judged to belong to the same class. This allows the model to verify identity or classify objects based on their proximity in the learned feature space.
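The threshold decision can be sketched as follows. The function names and threshold values (`max_distance=0.5`, `min_similarity=0.9`) are illustrative assumptions; in practice a threshold would be tuned on validation pairs. Note that the two measures point in opposite directions: a match means a small distance but a large cosine similarity.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine similarity in [-1, 1]; assumes neither vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def same_class_by_distance(shot, query, max_distance=0.5):
    """Match when the query lies within max_distance of the reference shot."""
    return euclidean_distance(shot, query) <= max_distance


def same_class_by_cosine(shot, query, min_similarity=0.9):
    """Match when the cosine similarity reaches min_similarity."""
    return cosine_similarity(shot, query) >= min_similarity


shot = [1.0, 0.0]        # embedding of the single reference image
near_query = [0.9, 0.1]  # embedding close to the reference
far_query = [0.0, 1.0]   # embedding orthogonal to the reference

print(same_class_by_distance(shot, near_query), same_class_by_cosine(shot, near_query))
print(same_class_by_distance(shot, far_query), same_class_by_cosine(shot, far_query))
```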

The following Python code demonstrates how to extract embeddings and calculate similarity using a YOLO26 classification model from the ultralytics package.

import numpy as np
from ultralytics import YOLO

# Load a pre-trained YOLO26 classification model for feature extraction
model = YOLO("yolo26n-cls.pt")

# Extract embeddings for a reference 'shot' and a query image
# embed() returns a list with one feature tensor per image; convert each to NumPy
shot_vec = model.embed("reference_img.jpg")[0].cpu().numpy()
query_vec = model.embed("query_img.jpg")[0].cpu().numpy()

# Calculate cosine similarity (ranges from -1 to 1; higher means more similar)
similarity = np.dot(shot_vec, query_vec) / (np.linalg.norm(shot_vec) * np.linalg.norm(query_vec))

print(f"Similarity Score: {similarity:.4f}")
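Once embeddings are available, adding a new category needs no retraining: storing one more reference vector is enough. The sketch below illustrates this with a hypothetical `OneShotGallery` class (not part of ultralytics) and hand-written toy vectors in place of real model embeddings. The `min_similarity` cutoff is an assumed value that would need tuning.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity in [-1, 1]; assumes neither vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


class OneShotGallery:
    """Hypothetical store holding exactly one reference embedding per class."""

    def __init__(self, min_similarity=0.8):
        self.references = {}
        self.min_similarity = min_similarity

    def enroll(self, label, embedding):
        """Register (or overwrite) a class using a single reference embedding."""
        self.references[label] = embedding

    def classify(self, query):
        """Return (label, score) for the best match, or (None, score) if too weak."""
        best_label, best_score = None, -1.0
        for label, reference in self.references.items():
            score = cosine_similarity(reference, query)
            if score > best_score:
                best_label, best_score = label, score
        if best_score < self.min_similarity:
            return None, best_score
        return best_label, best_score


gallery = OneShotGallery()
gallery.enroll("kiwi", [1.0, 0.0, 0.0])
gallery.enroll("toucan", [0.0, 1.0, 0.0])

known_label, _ = gallery.classify([0.9, 0.1, 0.0])
unknown_label, _ = gallery.classify([0.0, 0.0, 1.0])

# A new class is added by storing one more vector; nothing is retrained.
gallery.enroll("parrot", [0.0, 0.0, 1.0])
new_label, new_score = gallery.classify([0.0, 0.0, 1.0])
```

Returning `None` for weak matches keeps the gallery from forcing every query into one of the enrolled classes.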

Distinguishing Related Paradigms

It is important to differentiate One-Shot Learning from other data-efficient learning techniques, as they solve similar problems through different constraints:

  • Few-Shot Learning (FSL): This is the broader category encompassing One-Shot Learning. In FSL, the model is provided with a small "support set" of examples, typically ranging from two to five images per class. One-Shot Learning is simply the extreme case where the support set size is exactly one.
  • Zero-Shot Learning (ZSL): ZSL deals with recognizing categories the model has never seen visually. Instead of a reference image, ZSL relies on semantic attributes or text descriptions (e.g., identifying a "zebra" by associating visual features with the text description "striped horse") via natural language processing (NLP).
  • Transfer Learning: This involves taking a model pre-trained on a large database like ImageNet and fine-tuning it on a new task. While transfer learning powers the feature extractors used in One-Shot Learning, standard transfer learning usually requires more than one example to update weights effectively without overfitting.
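The relationship between few-shot and one-shot classification can be illustrated with class prototypes. This is a sketch assuming a prototype-averaging scheme, one common few-shot approach that the text above does not prescribe. Each class is represented by the mean of its support embeddings, and with a support set of size one the prototype is simply the single shot. The function names and toy vectors are hypothetical.

```python
def class_prototype(support_embeddings):
    """Average k support embeddings element-wise into a single prototype."""
    k = len(support_embeddings)
    return [sum(values) / k for values in zip(*support_embeddings)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_prototype(query, support_set):
    """Return the label whose prototype lies closest to the query embedding."""
    prototypes = {label: class_prototype(embs) for label, embs in support_set.items()}
    return min(prototypes, key=lambda label: squared_distance(prototypes[label], query))


# One-shot: each class has exactly one support embedding.
one_shot_support = {"cat": [[1.0, 0.0]], "dog": [[0.0, 1.0]]}

# Few-shot: the same classes with several support embeddings each.
few_shot_support = {
    "cat": [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]],
    "dog": [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9]],
}

query = [0.9, 0.2]
print(nearest_prototype(query, one_shot_support))
print(nearest_prototype(query, few_shot_support))
```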

Real-World Applications

One-Shot Learning has opened up new possibilities in fields where collecting large volumes of training data is impractical.

Facial Recognition and Security

The most ubiquitous application of One-Shot Learning is in biometric security. When setting up Face ID on a smartphone or enrolling in an employee access system, the device captures a single mathematical representation of the user's face. During daily use, the facial recognition system compares the live camera feed against this stored "one shot" to verify identity. This relies on robust embedding techniques, such as those discussed in the foundational FaceNet research, to ensure that changes in lighting or angle do not break the similarity match.
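FaceNet trains its embeddings with a triplet loss, which helps explain why a well-trained embedding tolerates lighting and pose changes: images of the same identity are pulled together while other identities are pushed away by at least a margin. The sketch below computes that loss for a single triplet. The toy two-value vectors and the `margin` default are illustrative assumptions, and real systems compute this over large batches of learned embeddings.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero when the negative is farther from the anchor than the positive by the margin."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


anchor = [0.0, 0.0]         # enrolled face
positive = [0.1, 0.0]       # same person, different lighting
easy_negative = [1.0, 0.0]  # different person, already far away
hard_negative = [0.1, 0.1]  # different person, too close to the anchor

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
```

Only the hard triplet produces a non-zero loss, so training effort concentrates on the look-alike cases that would otherwise cause false matches.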

Industrial Quality Control

In AI in manufacturing, creating a balanced dataset of "defective" parts is difficult because defects are rare and inconsistent. One-Shot Learning allows computer vision (CV) systems to learn the representation of a single "perfect" reference part. Any item on the assembly line whose embedding lies significantly far from this reference is flagged as an anomaly. This enables immediate quality assurance without needing thousands of images of broken parts. Such inspection workflows can be managed and deployed via the Ultralytics Platform.
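A reference-based check of this kind might look like the sketch below. The part IDs, the toy embeddings, and the `max_distance` tolerance are all hypothetical; a real line would obtain embeddings from a trained feature extractor and calibrate the tolerance on known-good parts.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def flag_anomalies(reference, items, max_distance):
    """Return IDs of items whose embedding is farther than max_distance from the reference."""
    return [
        item_id
        for item_id, embedding in items.items()
        if euclidean_distance(reference, embedding) > max_distance
    ]


reference_part = [1.0, 1.0]  # embedding of the single "perfect" part
line_items = {
    "part-001": [1.0, 1.05],
    "part-002": [0.2, 1.9],  # far from the reference
    "part-003": [0.98, 1.0],
}

flagged = flag_anomalies(reference_part, line_items, max_distance=0.2)
print(flagged)
```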

Challenges and Future Outlook

While powerful, One-Shot Learning is susceptible to noise; if the single reference image is blurry, obstructed, or unrepresentative, the model's ability to recognize that class degrades significantly. Researchers often employ meta-learning, or "learning to learn," to improve model stability and generalization. As architectures evolve, newer models like YOLO26 are incorporating more robust feature extractors that make one-shot inference faster and more accurate, paving the way for more adaptive and intelligent edge AI devices.
