Understand FLOPs in machine learning! Learn how they measure model complexity, impact efficiency, and help guide hardware selection.
FLOPs, or Floating Point Operations, is a standard metric used to measure the computational complexity of a machine learning model. It specifically counts the number of mathematical calculations—primarily additions and multiplications involving decimal numbers—that a neural network must perform to process a single input, such as an image or a sentence. In the world of deep learning, FLOPs serves as a theoretical yardstick for estimating how "heavy" or computationally expensive a model is. A higher FLOPs count generally suggests that a model is more complex and will require more processing power and energy to execute, whereas a lower count indicates a lightweight architecture designed for efficiency.
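To make the counting concrete, here is a minimal sketch of how FLOPs are tallied for a single fully connected layer. The helper function and the 784-to-128 layer sizes are illustrative choices, not part of any library API: each output unit performs one multiply and one add per input, so the cost is commonly approximated as 2 × inputs × outputs.

```python
def dense_layer_flops(in_features: int, out_features: int) -> int:
    """Approximate FLOPs for one fully connected layer.

    Each output unit computes a dot product over in_features inputs:
    in_features multiplications plus roughly in_features additions,
    conventionally rounded to 2 * in_features operations per output.
    """
    return 2 * in_features * out_features


# Example: a 784 -> 128 layer (e.g., a flattened 28x28 image into a hidden layer)
print(dense_layer_flops(784, 128))  # 200704 operations
```

Summing this estimate over every layer (convolutions follow the same multiply-accumulate logic, scaled by kernel size and output resolution) yields the model's total FLOPs.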
When developing artificial intelligence applications, engineers often face a trade-off between accuracy and speed. FLOPs acts as a hardware-independent proxy for inference latency, allowing developers to compare different architectures without needing to benchmark them on every possible device. This metric is essential for choosing the right model for specific deployment scenarios. For instance, a researcher running experiments on powerful cloud computing servers might prioritize accuracy over efficiency, utilizing models with high FLOPs. Conversely, an engineer building for edge AI devices must prioritize low FLOPs to ensure the application runs smoothly within strict power and thermal limits.
The practical implications of FLOPs are evident across various industries where computational resources are a critical factor.
It is important to distinguish between "FLOPs" (plural of FLOP) and "FLOPS" (all caps). While they look nearly identical, they measure different things. FLOPs (small 's') refers to the total quantity of operations required by a model—it is a static measure of complexity. FLOPS (capital 'S') stands for Floating Point Operations Per Second and measures the speed or performance capability of hardware, such as a GPU. You can think of FLOPs as the distance a car needs to travel (the work to be done), while FLOPS is the top speed of the car (the hardware's ability to do the work).
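The car analogy translates directly into a back-of-the-envelope latency estimate: dividing the work (FLOPs) by the speed (FLOPS) gives an ideal lower bound on inference time. The function below is an illustrative sketch with made-up numbers, not a benchmark; real latency is always higher because of memory bandwidth limits and imperfect hardware utilization.

```python
def ideal_latency_ms(model_gflops: float, hardware_tflops: float) -> float:
    """Lower-bound inference time: total work (FLOPs) divided by speed (FLOPS).

    Treat the result as a theoretical floor, not a measured latency.
    """
    total_ops = model_gflops * 1e9            # work for one input
    ops_per_second = hardware_tflops * 1e12   # hardware throughput
    return total_ops / ops_per_second * 1e3   # seconds -> milliseconds


# Hypothetical numbers: a 10 GFLOPs model on a 5 TFLOPS accelerator
print(ideal_latency_ms(10, 5))  # 2.0 ms at perfect utilization
```

This is why comparing FLOPs across candidate architectures is a useful first filter before running real benchmarks on target hardware.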
You can easily calculate the computational cost of an Ultralytics model using Python. This is particularly useful during the model optimization phase to ensure your neural network fits within your hardware budget. The following example demonstrates how to load a YOLO26 model and determine its FLOPs.
from ultralytics import YOLO
from ultralytics.utils.torch_utils import get_flops

# Load a lightweight YOLO26 model
model = YOLO("yolo26n.pt")

# Calculate the model's FLOPs at the default 640-pixel input size.
# get_flops expects the underlying nn.Module, so pass model.model;
# it returns a hardware-independent complexity figure in GFLOPs.
flops = get_flops(model.model, imgsz=640)
print(f"Model FLOPs: {flops:.2f} Billion")
To make models more deployable, researchers use several techniques to reduce FLOPs without significantly sacrificing accuracy. Model pruning involves removing less important connections in the neural network, effectively thinning it out. Another technique is quantization, which reduces the precision of the numbers used in calculations (e.g., from 32-bit floating point to 8-bit integers). Tools available on the Ultralytics Platform help streamline these optimization processes, making it easier to deploy efficient models to targets like TensorRT or OpenVINO. By understanding and optimizing FLOPs, developers can build AI systems that are both powerful and sustainable.
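The arithmetic behind quantization can be sketched in a few lines of plain Python. The functions below are a simplified illustration of symmetric int8 quantization, not any library's API: floats are mapped to 8-bit integers by dividing by a scale factor, which shrinks storage fourfold and lets hardware use cheaper integer math, at the cost of a small rounding error.

```python
def quantize_int8(values, scale):
    """Map 32-bit floats to int8: q = round(x / scale), clamped to [-128, 127]."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Recover approximate floats from int8 values: x ~= q * scale."""
    return [q * scale for q in q_values]


weights = [0.51, -1.24, 0.08, 2.0]
scale = 2.0 / 127  # map the largest magnitude onto the int8 range
q = quantize_int8(weights, scale)
restored = dequantize(q, scale)
print(q)  # [32, -79, 5, 127]
```

Each restored weight differs from the original by at most half a quantization step, which is why well-calibrated int8 models typically lose only a fraction of a percent of accuracy.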
