Explore TinyML and learn to deploy Ultralytics YOLO26 on low-power microcontrollers. Discover how to optimize models for IoT with quantization and the Ultralytics Platform.
Tiny machine learning, commonly referred to as TinyML, represents a specialized subfield of machine learning that focuses on deploying models on ultra-low-power, resource-constrained devices like microcontrollers and small IoT devices. Unlike traditional cloud-based systems that rely on immense computational resources, TinyML operates entirely at the edge. By running intelligent algorithms locally on devices with power constraints often measured in mere milliwatts, this approach minimizes latency, ensures data privacy, and drastically reduces bandwidth usage, a paradigm supported and advanced by communities like the TinyML Foundation.
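To make the bandwidth argument concrete, the back-of-the-envelope sketch below compares uploading raw camera frames for cloud inference against transmitting only on-device detection results. The frame size, frame rate, and per-detection payload are illustrative assumptions, not measurements from any real deployment:

```python
# Illustrative bandwidth comparison: cloud streaming vs. on-device inference.
# All figures below are assumptions chosen to make the arithmetic concrete.

FRAME_W, FRAME_H, CHANNELS = 160, 160, 3  # assumed low-res RGB camera frame
FPS = 1                                   # assumed one frame per second
BYTES_PER_DETECTION = 16                  # assumed: class id + 4 box coords + score
DETECTIONS_PER_FRAME = 4                  # assumed average detection count


def daily_bytes_cloud() -> int:
    """Bytes per day if every raw frame is uploaded for cloud inference."""
    frame_bytes = FRAME_W * FRAME_H * CHANNELS
    return frame_bytes * FPS * 60 * 60 * 24


def daily_bytes_edge() -> int:
    """Bytes per day if only detection results ever leave the device."""
    return BYTES_PER_DETECTION * DETECTIONS_PER_FRAME * FPS * 60 * 60 * 24


cloud = daily_bytes_cloud()  # 76,800 bytes/frame -> roughly 6.6 GB/day
edge = daily_bytes_edge()    # 64 bytes/frame -> roughly 5.5 MB/day
print(f"cloud: {cloud / 1e9:.2f} GB/day, edge: {edge / 1e6:.2f} MB/day")
```

Under these assumptions the edge device sends about three orders of magnitude less data per day, which is the core of the bandwidth and privacy argument for TinyML.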
To successfully fit complex neural network architectures onto highly constrained hardware such as ARM Cortex-M processors, models must undergo rigorous optimization. Techniques such as model quantization—which converts 32-bit floating-point weights to 8-bit integers—and model pruning are used to significantly reduce the overall memory footprint. Today, specialized frameworks like Google's TensorFlow Lite for Microcontrollers and PyTorch's ExecuTorch facilitate these precise compression workflows, bringing advanced visual and auditory intelligence to everyday embedded hardware.
While TinyML is closely related to Edge AI, the primary distinction lies in the hardware scale and power budget. Edge AI is a broader term that encompasses any local execution of AI models, often utilizing single-board computers like a Raspberry Pi or GPU-accelerated embedded modules like an NVIDIA Jetson. In contrast, TinyML specifically targets deeply embedded systems that operate on batteries for months or years, such as Arduino boards or STMicroelectronics chips. These devices typically possess only a few hundred kilobytes of RAM, making aggressive model compression mandatory.
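To see why compression is mandatory on such parts, the sketch below estimates the storage needed for a model's weights at different precisions against a hypothetical 256 KB budget. The parameter count is a made-up figure chosen for illustration, not the size of any particular YOLO model:

```python
# Rough weight-storage estimate at different numeric precisions.
# The parameter count and memory budget are illustrative assumptions.

PARAMS = 250_000  # hypothetical tiny detection model
BUDGET_KB = 256   # hypothetical microcontroller memory budget


def footprint_kb(num_params: int, bytes_per_param: int) -> float:
    """Storage needed for the weights alone, in kilobytes."""
    return num_params * bytes_per_param / 1024


fp32_kb = footprint_kb(PARAMS, 4)  # 32-bit floats
int8_kb = footprint_kb(PARAMS, 1)  # 8-bit integers after quantization

print(f"fp32: {fp32_kb:.0f} KB, int8: {int8_kb:.0f} KB, budget: {BUDGET_KB} KB")
print(f"fits at fp32: {fp32_kb <= BUDGET_KB}, fits at int8: {int8_kb <= BUDGET_KB}")
```

At 32-bit precision even this small model overflows the budget by nearly 4x, while the same weights stored as 8-bit integers fit, which is exactly the gap that quantization closes.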
The ability to deploy intelligence directly onto minimal hardware has unlocked numerous practical use cases across various industries.
Preparing a model for a microcontroller requires strict export formatting. Using Ultralytics YOLO26, developers can build robust object detection pipelines and compress them down for embedded targets. You can manage your dataset and model versioning seamlessly on the Ultralytics Platform before exporting locally. The native TFLite integration allows effortless conversion to the 8-bit integer formats required for microcontrollers, complementing other hardware-specific model deployment options like Apple's CoreML, Google's Edge TPU, and NVIDIA's TensorRT.
The following example demonstrates how to export a lightweight YOLO26 model specifically optimized with INT8 quantization, making it suitable for deployment on TinyML-compatible edge platforms:
from ultralytics import YOLO
# Initialize the lightweight YOLO26 Nano model for edge use cases
model = YOLO("yolo26n.pt")
# Export to TFLite format with INT8 quantization and a reduced image size
# This minimizes the memory footprint and accelerates inference on microcontrollers
model.export(format="tflite", int8=True, imgsz=160)
Begin your journey with the future of machine learning