GPU (Graphics Processing Unit)

Discover how GPUs revolutionize AI and machine learning by accelerating deep learning, optimizing workflows, and enabling real-world applications.

A Graphics Processing Unit (GPU) is a specialized electronic circuit originally designed to accelerate the manipulation and creation of images in a frame buffer for display output. While their roots lie in rendering computer graphics for gaming and professional visualization, GPUs have evolved into the fundamental engine of modern Artificial Intelligence (AI). Unlike a standard processor that uses a few powerful cores to handle tasks sequentially, a GPU architecture is composed of thousands of smaller, efficient cores designed to handle multiple tasks simultaneously. This capability, known as parallel computing, makes them exceptionally efficient for the massive matrix and vector operations that underpin Deep Learning (DL) and complex Neural Networks (NN).
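
To illustrate this parallelism, the minimal PyTorch sketch below runs the same matrix multiplication first on the CPU and then, when one is available, on the GPU; the matrix size is an arbitrary choice for demonstration.

import torch

# The same large matrix multiplication, first on the CPU, then on the GPU
size = 4096
a = torch.randn(size, size)
b = torch.randn(size, size)

# Sequential-style execution on the CPU
cpu_result = a @ b

if torch.cuda.is_available():
    # The same operation spread across thousands of GPU cores
    a_gpu, b_gpu = a.cuda(), b.cuda()
    gpu_result = a_gpu @ b_gpu
    torch.cuda.synchronize()  # GPU kernels launch asynchronously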

Accelerating AI Workloads

The primary reason GPUs are indispensable for Machine Learning (ML) is their ability to perform high-speed matrix multiplications. Deep learning frameworks like PyTorch and TensorFlow are specifically optimized to leverage this hardware acceleration. This results in significantly reduced times for model training, often transforming what would be weeks of computation on a standard processor into hours on a GPU. The computational throughput of these devices is typically measured in FLOPS (Floating Point Operations Per Second), a critical metric for gauging the capability of hardware to handle the rigorous demands of state-of-the-art models like YOLO26.
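
To see this metric in practice, the sketch below times one large matrix multiplication and converts the result into achieved TFLOPS, using the standard estimate that an N x N matrix multiply costs roughly 2 * N**3 floating-point operations. It is an illustrative micro-benchmark, not a rigorous measurement.

import time

import torch

n = 4096
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(n, n, device=device)
b = torch.randn(n, n, device=device)

# Warm-up run so one-time initialization costs are excluded from the timing
_ = a @ b
if device == "cuda":
    torch.cuda.synchronize()

start = time.perf_counter()
_ = a @ b
if device == "cuda":
    torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
elapsed = time.perf_counter() - start

# ~2 * n**3 floating-point operations for an n x n matrix multiplication
tflops = (2 * n**3) / elapsed / 1e12
print(f"~{tflops:.1f} TFLOPS achieved on {device}")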

Hardware Distinctions: GPU vs. CPU vs. TPU

To understand the hardware landscape, it is helpful to distinguish the GPU from other processing units:

  • CPU (Central Processing Unit): The general-purpose "brain" of a computer. CPUs excel at sequential processing and complex logic branching but are less efficient for the massive parallelism required by large-scale AI training.
  • GPU (Graphics Processing Unit): The industry standard for training and inference. Leading manufacturers like NVIDIA utilize software ecosystems such as CUDA to allow developers to program the GPU directly for general-purpose computing; a minimal device-selection sketch follows this list.
  • TPU (Tensor Processing Unit): An Application-Specific Integrated Circuit (ASIC) developed specifically for neural network machine learning. While highly efficient for specific tensor operations, they are less versatile than GPUs for broader computing tasks.
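
To make the CPU/GPU distinction concrete, the following PyTorch sketch selects a CUDA-capable GPU when one is present and falls back to the CPU otherwise:

import torch

# Prefer the first CUDA-capable GPU; otherwise fall back to the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

if device.type == "cuda":
    # Report which GPU was selected
    print(f"GPU name: {torch.cuda.get_device_name(device)}")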

Real-World Applications

The implementation of high-performance GPUs has fueled innovations across diverse industries:

  • Autonomous Vehicles: Self-driving cars must process gigabytes of data from cameras, radar, and LiDAR sensors every second. GPUs enable real-time inference, allowing the vehicle's onboard computer to run Object Detection models that identify pedestrians, traffic signs, and obstacles instantaneously (a GPU inference sketch follows this list).
  • Medical Image Analysis: In healthcare, GPUs accelerate the processing of high-resolution scans such as MRIs and CTs. They enable sophisticated Image Segmentation algorithms to precisely delineate tumors or organs, assisting radiologists in making faster and more accurate diagnoses without relying solely on manual inspection.
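
As a minimal illustration of GPU-accelerated inference, the sketch below runs a YOLO26 detection model on a sample image with device=0; the image URL is a commonly used Ultralytics sample and can be replaced with a local file or camera stream.

from ultralytics import YOLO

# Load the detection model used throughout this article
model = YOLO("yolo26n.pt")

# Run inference on the first GPU; swap the source for your own data
results = model.predict(source="https://ultralytics.com/images/bus.jpg", device=0)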

Training with GPUs

When using the ultralytics package, leveraging a GPU is straightforward and highly recommended for efficient workflows. The library supports automatic device detection, but users can also explicitly specify the device.

The following example demonstrates how to train a YOLO26 model on the first available GPU:

from ultralytics import YOLO

# Load the YOLO26n model
model = YOLO("yolo26n.pt")

# Train the model on the first available GPU (device=0)
# This significantly accelerates training compared to CPU usage
results = model.train(data="coco8.yaml", epochs=5, imgsz=640, device=0)
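
The device argument also accepts other values. Below is a hedged sketch of two common variants, assuming the same model and dataset as above and, for the second call, a machine with two GPUs installed:

from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# Force CPU training (slow, but useful when no GPU is present)
model.train(data="coco8.yaml", epochs=5, imgsz=640, device="cpu")

# Distribute training across GPUs 0 and 1 (requires two installed GPUs)
model.train(data="coco8.yaml", epochs=5, imgsz=640, device=[0, 1])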

Deployment and Optimization

Beyond training, GPUs play a crucial role in Model Deployment. To maximize efficiency during inference, models are often converted to optimized formats like TensorRT, which restructures the neural network to align perfectly with the specific GPU architecture, reducing latency. For developers who do not have access to high-end local hardware, the Ultralytics Platform offers cloud-based solutions to manage datasets and train models on powerful remote GPU clusters. This accessibility drives innovation in Edge AI, allowing complex Computer Vision (CV) tasks to be deployed on smaller, power-efficient devices in the field.
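
For reference, this conversion is exposed in the Ultralytics API through the export method; the sketch below exports to TensorRT (the "engine" format) and then runs inference with the optimized engine, assuming TensorRT and a CUDA-capable GPU are installed.

from ultralytics import YOLO

# Export the trained model to a TensorRT engine for low-latency GPU inference
model = YOLO("yolo26n.pt")
model.export(format="engine")  # produces yolo26n.engine

# Load and run the optimized engine like any other model
trt_model = YOLO("yolo26n.engine")
results = trt_model("https://ultralytics.com/images/bus.jpg")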
