
CPU

Explore the CPU's vital role in AI & Machine Learning. Learn about its use in data prep, inference, and how it compares to GPUs/TPUs.

A Central Processing Unit (CPU) is the primary component of a computer that acts as its "brain," responsible for interpreting and executing program instructions. In the context of artificial intelligence (AI), the CPU plays a fundamental role in data handling, system orchestration, and running inference, particularly on edge devices where power efficiency is critical. While specialized hardware such as GPUs is often associated with the heavy lifting of training deep learning models, the CPU remains indispensable to the overall machine learning (ML) pipeline.

The Role of CPUs in AI Workflows

Although GPUs are celebrated for their massive parallelism during training, the CPU is the workhorse for many essential stages of the computer vision (CV) lifecycle. Its architecture, typically based on x86 (Intel, AMD) or ARM designs, is optimized for sequential processing and complex logic control.

  • Data Preprocessing: Before a neural network can learn, data must be prepared. CPUs excel at tasks such as file loading, data cleaning, and complex transformations using libraries like NumPy and OpenCV.
  • Edge Inference: For real-world deployment, running models on massive servers isn't always feasible. CPUs allow for efficient model deployment on consumer hardware, such as running Ultralytics YOLO26 on a laptop or a Raspberry Pi.
  • Post-Processing: After a model outputs raw probabilities, the CPU often handles the final logic, such as Non-Maximum Suppression (NMS) in object detection, to filter out duplicate predictions and refine results.
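To make the post-processing step concrete, here is a minimal sketch of greedy Non-Maximum Suppression in NumPy. It is simplified (single class, boxes in xyxy format); the NMS used inside production detection pipelines is more elaborate, but the core idea is the same: keep the highest-scoring box and discard boxes that overlap it too much.

```python
import numpy as np


def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.5) -> list[int]:
    """Greedy NMS: keep the highest-scoring box, drop heavily overlapping ones."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the current winner with the remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Keep only boxes whose overlap with the winner is below the threshold
        order = order[1:][iou < iou_thresh]
    return keep


# Two near-duplicate boxes and one distinct box
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # → [0, 2]: the lower-scoring duplicate is suppressed
```

This kind of branch-heavy, small-batch logic is exactly where a CPU is a natural fit.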

CPU vs. GPU vs. TPU

Understanding the hardware landscape is critical for optimizing machine learning operations (MLOps). These processors differ significantly in their architecture and ideal use cases.

  • CPU: Designed for versatility and complex logic. It features a few powerful cores that process tasks sequentially. It is best for data augmentation, pipeline management, and low-latency inference on small batches.
  • GPU (Graphics Processing Unit): Originally for graphics, GPUs have thousands of smaller cores designed for parallel processing. They are the standard for model training because they can perform matrix multiplications much faster than a CPU.
  • TPU (Tensor Processing Unit): An application-specific integrated circuit (ASIC) developed by Google specifically for tensor math, offered through Google Cloud. While highly efficient for specific workloads, it lacks the general-purpose flexibility of a CPU.
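The differences above come down to how each chip handles the matrix multiplications at the heart of deep learning. Even within a single CPU, a vectorized library call (which uses SIMD instructions and optimized BLAS routines) vastly outperforms naive sequential code; GPUs and TPUs push this same idea further with thousands of parallel units. A rough, self-contained illustration:

```python
import time

import numpy as np

n = 120
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Naive triple loop: one scalar multiply-add at a time, purely sequential
t0 = time.perf_counter()
c_loop = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        s = 0.0
        for k in range(n):
            s += a[i, k] * b[k, j]
        c_loop[i, j] = s
t_loop = time.perf_counter() - t0

# Vectorized BLAS call: exploits SIMD lanes and optimized cache blocking
t0 = time.perf_counter()
c_blas = a @ b
t_blas = time.perf_counter() - t0

print(f"loop: {t_loop:.3f}s, BLAS: {t_blas:.5f}s, same result: {np.allclose(c_loop, c_blas)}")
```

The two results match, but the vectorized call is typically orders of magnitude faster; on a GPU the gap widens further for large batches.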

Real-World Applications

CPUs are frequently the hardware of choice for applications where cost, availability, and energy consumption outweigh the need for massive raw throughput.

  1. Smart Security Cameras: In security alarm systems, cameras often process video feeds locally. A CPU-based object detection model can identify a person or vehicle and trigger an alert without sending video to the cloud, preserving bandwidth and user privacy.
  2. Industrial Automation: On factory floors, predictive maintenance systems use CPUs to monitor sensor data from machinery. These systems analyze vibrations or temperature spikes in real-time to predict failures, ensuring smooth manufacturing automation without the need for expensive GPU clusters.
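The security-camera scenario above reduces to a small piece of decision logic running after each inference pass. The sketch below is hypothetical: `detections` stands in for per-frame model output as a list of class-name strings (with an Ultralytics model it could be built from `results[0].boxes`), and the watch list is an assumption for illustration.

```python
# Hypothetical alert logic for a CPU-based security camera.
# COCO-style class names are assumed for the detection labels.
ALERT_CLASSES = {"person", "car", "truck"}


def should_alert(detections: list[str]) -> bool:
    """Trigger an alert if any detected class is on the watch list."""
    return any(label in ALERT_CLASSES for label in detections)


print(should_alert(["person", "bench"]))  # → True
print(should_alert(["dog", "bird"]))      # → False
```

Because this logic runs locally on the camera's CPU, no video ever needs to leave the device.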

Running Inference on CPU with Ultralytics

Developers often test models on CPUs to verify compatibility with serverless computing environments or low-power devices. The Ultralytics API allows you to easily target the CPU, ensuring your application runs anywhere.

The following example demonstrates how to load a lightweight model and run inference specifically on the CPU:

from ultralytics import YOLO

# Load the lightweight YOLO26 nano model
# Smaller models are optimized for faster CPU execution
model = YOLO("yolo26n.pt")

# Run inference on an image, explicitly setting the device to 'cpu'
results = model.predict("https://ultralytics.com/images/bus.jpg", device="cpu")

# Print the detection results (bounding boxes)
print(results[0].boxes.xywh)

To further improve performance on Intel hardware, developers can export their models to the OpenVINO format, which optimizes the network graph for Intel CPUs and other Intel accelerators. For managing datasets and orchestrating these deployments, tools like the Ultralytics Platform simplify the workflow from annotation to edge execution.
