Glossary

GPU (Graphics Processing Unit)

Discover how GPUs revolutionize AI and machine learning by accelerating deep learning, optimizing workflows, and enabling real-world applications.

A Graphics Processing Unit (GPU) is a specialized electronic circuit originally designed to accelerate the creation and rendering of images, video, and animations. While GPUs originated in the gaming and graphic design industries, their architecture has made them indispensable in modern Artificial Intelligence (AI) and Machine Learning (ML). GPUs contain thousands of processing cores that work in parallel, allowing them to perform massive numbers of calculations simultaneously. This capability is exceptionally well suited to the computationally demanding tasks found in deep learning algorithms, enabling faster model training and efficient real-time inference. Understanding the evolution of the GPU provides context for its current role in AI.

Importance in AI and Machine Learning

The parallel processing capabilities of GPUs are a primary catalyst for recent breakthroughs in AI. Training deep neural networks involves processing enormous datasets and performing countless complex mathematical operations, such as matrix multiplications. GPUs excel at these operations, drastically cutting down the time needed to train models compared to traditional Central Processing Units (CPUs). This acceleration empowers researchers and developers in the field of AI development to iterate more quickly, experiment with larger and more complex models, and achieve higher accuracy in tasks like object detection and image segmentation.
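The matrix multiplications mentioned above are exactly the kind of operation a GPU parallelizes well. As a minimal sketch (assuming PyTorch is installed), the snippet below runs a large matrix multiply on the GPU when one is available and falls back to the CPU otherwise:

```python
import torch

# Select the GPU if one is visible to PyTorch; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Two large random matrices, placed directly on the chosen device.
a = torch.randn(2048, 2048, device=device)
b = torch.randn(2048, 2048, device=device)

# On a GPU, this single matrix multiply is spread across thousands of cores.
c = a @ b
print(c.shape)
```

The same code runs unchanged on either device; only the `device` selection differs, which is why frameworks like PyTorch make GPU acceleration largely transparent.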

Here are two examples of GPUs in real-world AI/ML applications:

  1. Real-Time Object Detection: Ultralytics YOLO models, known for their speed and efficiency, rely heavily on GPUs to perform object detection in real time for applications like autonomous driving, security surveillance (enhancing security systems), and robotics. The parallel nature of GPUs allows these models to process video frames quickly and accurately identify multiple objects simultaneously. Explore the diverse YOLO11 applications enabled by GPU acceleration.
  2. Large Model Training: Training large language models (LLMs) or complex computer vision models often requires significant computational power, frequently accessed via cloud computing platforms. Services like Ultralytics HUB Cloud Training leverage powerful GPU clusters from providers like AWS, Google Cloud, and Azure to train models on vast datasets for tasks ranging from natural language processing (NLP) to advanced medical image analysis.
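At the code level, training on a GPU mostly comes down to placing the model's parameters and each data batch on the same device. A minimal, hedged sketch in PyTorch (the tiny linear model and synthetic batch are stand-ins for illustration, not a real training setup):

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A small stand-in model; .to(device) moves its parameters onto the GPU.
model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Synthetic batch; inputs and targets must live on the same device as the model.
inputs = torch.randn(32, 10, device=device)
targets = torch.randn(32, 2, device=device)

for _ in range(5):  # a few gradient steps
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()  # gradients are computed on the same device
    optimizer.step()
```

Cloud GPU training services follow the same pattern, scaled up: larger models, sharded datasets, and many GPUs working on batches in parallel.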

Key Differences From CPUs and TPUs

While GPUs, CPUs, and Tensor Processing Units (TPUs) are all types of processors, they have different strengths and are optimized for different tasks:

  • CPU (Central Processing Unit): Designed for general-purpose computing, excelling at sequential tasks and managing system operations. CPUs have a few powerful cores optimized for low latency. See a CPU vs GPU overview.
  • GPU (Graphics Processing Unit): Optimized for parallel operations with thousands of simpler cores. Ideal for tasks that can be broken down and processed simultaneously, like rendering graphics and training deep learning models. Measured performance often involves metrics like FLOPS.
  • TPU (Tensor Processing Unit): Google's custom-designed Application-Specific Integrated Circuit (ASIC), built specifically to accelerate machine learning workloads using the TensorFlow framework. TPUs are highly optimized for the large-scale matrix operations common in neural networks. Learn more from Google's TPU details.

GPUs strike a balance between high performance for parallel processing tasks and versatility across various applications, making them a popular choice for many AI and high-performance computing (HPC) workloads.

Ecosystem and Usage

The widespread adoption of GPUs in AI is supported by robust software ecosystems. Major manufacturers like NVIDIA and AMD provide GPUs suitable for AI tasks. NVIDIA's CUDA (Compute Unified Device Architecture) platform is a widely used parallel computing platform and programming model for NVIDIA GPUs. Deep learning frameworks such as PyTorch and TensorFlow are optimized to leverage GPU acceleration. Setting up environments for GPU-accelerated development can be streamlined using containerization tools like Docker; refer to the Ultralytics Docker Quickstart guide for setup instructions. Efficient model deployment often involves optimizing models to run effectively on target GPU hardware. Explore various Ultralytics Solutions that leverage GPU power.
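In practice, the first step in any GPU-accelerated environment is checking whether the framework can actually see a GPU. A short sketch using PyTorch's CUDA utilities (output will vary with the hardware available):

```python
import torch

# Report whether PyTorch can see a CUDA-capable GPU in this environment.
if torch.cuda.is_available():
    print(f"GPUs visible: {torch.cuda.device_count()}")
    print(f"Current device: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA GPU visible; computations will run on the CPU.")
```

Running a check like this inside a container is a quick way to confirm that Docker has been configured with GPU passthrough before starting a long training job.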