
Token

Explore how tokens act as the atomic units of AI processing. Learn how the [Ultralytics Platform](https://platform.ultralytics.com) uses tokens for NLP and computer vision.

In modern artificial intelligence, a token is the fundamental, atomic unit of information that a model processes. Before an algorithm can interpret a sentence, analyze source code, or recognize objects in an image, the raw input must be broken down into these discrete, standardized elements. This segmentation is a key step in data preprocessing: it turns unstructured input into a numerical format that neural networks can compute on efficiently. Humans perceive language as a continuous stream of thought and images as seamless scenes, but computational models need these granular building blocks to perform operations such as pattern recognition and semantic analysis.

Token vs. Tokenization

To grasp the mechanics of machine learning, it is essential to distinguish between the data unit and the process used to create it. This differentiation prevents confusion when designing data pipelines and preparing training material on the Ultralytics Platform.

  • Tokenization: This is the algorithmic process (the verb) of splitting raw data into pieces. For text, this might involve using libraries like the Natural Language Toolkit (NLTK) to determine where one unit ends and another begins.
  • Token: This is the resulting output (the noun). It is the actual chunk of data—such as a word, a subword, or an image patch—that is eventually mapped to a numerical vector known as an embedding.
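The distinction above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular library tokenizes: the regular expression, the `tokenize` and `build_vocab` helper names, and the sample sentence are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Tokenization (the process): split raw text into word and punctuation tokens."""
    # \w+ matches runs of word characters; [^\w\s] matches single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


text = "Tokens are units. Tokenization creates tokens!"
tokens = tokenize(text)               # the tokens (the output)
vocab = build_vocab(tokens)           # hypothetical toy vocabulary
ids = [vocab[t] for t in tokens]      # integer IDs a model could look up

print(tokens)
print(ids)
```

In a real model, each integer ID would index a row of a learned embedding table, which is the step where a token becomes a numerical vector.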

Tokens in Different AI Domains

The nature of a token varies significantly depending on the modality of the data being processed, particularly between textual and visual domains.

Text Tokens in NLP

In Natural Language Processing (NLP), tokens are the inputs to Large Language Models (LLMs). Early approaches mapped each token strictly to a whole word, but modern architectures use subword algorithms such as Byte Pair Encoding (BPE). BPE lets a model handle rare words by splitting them into smaller, frequently occurring subword units (which need not align with syllables), balancing vocabulary size against semantic coverage. For instance, the word "unhappiness" might be tokenized into "un", "happi", and "ness".
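To make the subword idea concrete, here is a toy sketch of BPE merge learning. It is a simplified illustration under stated assumptions: the five-word corpus, the number of merges, and the helper names are hypothetical, and production tokenizers add details (byte-level handling, word-boundary markers, pre-tokenization) that are omitted here. The exact split it produces depends entirely on the corpus, so it will not necessarily match the "un", "happi", "ness" example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the joined symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules, most frequent adjacent pair first."""
    # Start with every word split into single characters.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def bpe_tokenize(word, merges):
    """Apply the learned merges, in order, to a new word."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Hypothetical toy corpus; real tokenizers train on very large text collections.
corpus = ["unhappy", "unhappiness", "happiness", "kindness", "unkind"]
merges = learn_bpe_merges(corpus, num_merges=10)
subwords = bpe_tokenize("unhappiness", merges)
print(subwords)
```

Because merges are learned from pair frequencies, common fragments end up as single tokens while rare words fall back to smaller pieces, which is how the vocabulary stays bounded.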

Visual Tokens in Computer Vision

The concept of tokenization has expanded into computer vision with the advent of the Vision Transformer (ViT). Unlike traditional convolutional networks that process pixels in sliding windows, Transformers divide an image into a grid of fixed-size patches (e.g., 16x16 pixels). Each patch is flattened and treated as a distinct visual token. This approach enables the model to use self-attention mechanisms to understand the relationship between distant parts of an image, similar to how Google Research originally applied transformers to text.
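The patch-splitting step can be sketched without any dependencies. This is an illustrative assumption rather than a ViT implementation: the image is a nested Python list of shape H x W x C, the `image_to_patch_tokens` helper is hypothetical, and a real model would use tensor operations followed by a learned linear projection and position embeddings.

```python
def image_to_patch_tokens(image, patch_size=16):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    tokens = []
    # Walk the patch grid row by row, left to right.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            token = [
                channel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel in pixel
            ]
            tokens.append(token)
    return tokens


# Synthetic 32x32 RGB image: each pixel stores its (row, column, 0) coordinates.
image = [[[y, x, 0] for x in range(32)] for y in range(32)]
patch_tokens = image_to_patch_tokens(image, patch_size=16)

# A 32x32 image with 16x16 patches gives a 2x2 grid of visual tokens,
# each holding 16 * 16 * 3 values.
print(len(patch_tokens), len(patch_tokens[0]))
```

The same arithmetic scales up: a 224x224 image with 16x16 patches would yield a 14x14 grid, or 196 visual tokens.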

Real-World Applications

Across many applications, tokens serve as the bridge between human data and machine intelligence.

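  1. Open-vocabulary object detection: Advanced models such as YOLO-World take a multi-modal approach in which text tokens interact with visual features. A user can enter a custom text prompt (for example, "blue helmet"), which the model tokenizes and matches against objects in the image. This enables zero-shot learning: detecting objects the model was never explicitly trained on.
  2. Generative AI: Text generation systems such as chatbots work by predicting the probability of the next token in a sequence. By repeatedly selecting a likely next token, the system builds coherent sentences and paragraphs, powering tools that range from automated customer support to virtual assistants.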
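The next-token loop described in the second item can be illustrated with a toy bigram model. This is only a sketch under loud assumptions: real LLMs use neural networks over long contexts and often sample rather than always taking the top candidate, and the tiny corpus and helper names here are hypothetical.

```python
from collections import Counter, defaultdict


def build_bigram_model(tokens):
    """Count, for each token, which tokens follow it in the training sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def next_token_probabilities(model, token):
    """Turn follower counts into a probability distribution over the next token."""
    followers = model.get(token)
    if not followers:
        return {}
    total = sum(followers.values())
    return {candidate: count / total for candidate, count in followers.items()}


def generate(model, start, max_new_tokens=5):
    """Greedy decoding: repeatedly append the most probable next token."""
    sequence = [start]
    for _ in range(max_new_tokens):
        probs = next_token_probabilities(model, sequence[-1])
        if not probs:
            break  # no known continuation for this token
        sequence.append(max(probs, key=probs.get))
    return sequence


corpus = "the cat sat on the mat and the cat slept".split()
bigram_model = build_bigram_model(corpus)

print(next_token_probabilities(bigram_model, "the"))
print(generate(bigram_model, "the", max_new_tokens=3))
```

Swapping the greedy `max` for weighted random sampling is the usual way to get more varied output from the same probabilities.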

Python: Detection with Text Tokens

The following snippet demonstrates how the ultralytics package uses text tokens to guide object detection. While the state-of-the-art YOLO26 is recommended for high-speed, fixed-class inference, the YOLO-World architecture uniquely allows users to define classes as text tokens at runtime.

```python
from ultralytics import YOLO

# Load a pre-trained YOLO-World model capable of understanding text tokens
model = YOLO("yolov8s-world.pt")

# Define specific classes; these text strings are tokenized internally
# The model will look specifically for these "tokens" in the visual data
model.set_classes(["bus", "backpack"])

# Run prediction on an image using the defined tokens
results = model.predict("https://ultralytics.com/images/bus.jpg")

# Display the results showing only the tokenized classes
results[0].show()
```

Understanding tokens is fundamental to navigating the landscape of generative AI and advanced analytics. Whether enabling a chatbot to converse fluently or helping a vision system distinguish between subtle object classes, tokens remain the essential currency of machine intelligence used by frameworks like PyTorch and TensorFlow.
