Synthetic Data

Discover how synthetic data powers AI and machine learning. Learn how to generate high-quality datasets for Ultralytics YOLO26 to improve model accuracy today.

Synthetic data is artificially generated information that mimics the statistical properties, patterns, and structural characteristics of real-world data. In artificial intelligence (AI) and machine learning (ML), it serves as a critical resource when collecting authentic data is expensive, time-consuming, or restricted by privacy regulations. Unlike organic data harvested from real-world events, synthetic data is created algorithmically using techniques such as computer simulations and generative models. Industry analysts at Gartner predict that by 2030 synthetic data will overshadow real data in AI models, shifting how intelligent systems are built and deployed.

The Role of Synthetic Data in AI Development

The main reason to use synthetic datasets is to overcome the limitations of traditional data collection and annotation. Training robust computer vision (CV) models often requires massive datasets containing diverse scenarios. When real-world data is scarce, as in rare disease diagnosis or dangerous edge cases such as traffic accidents, synthetic data can bridge the gap.

Generating this data lets developers create labeled training data on demand. Because the generator knows exactly what it placed in each scene, it can emit precise bounding boxes for object detection or pixel-level masks for semantic segmentation, avoiding much of the human error found in manual labeling. It can also help address bias in AI by letting engineers deliberately balance datasets with underrepresented groups or environmental conditions, which supports fairer model performance.
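As a rough illustration of why generated data arrives already labeled, the sketch below places random rectangles in a virtual scene and writes their annotations in the normalized `class x_center y_center width height` layout used for YOLO detection labels. The function names, scene size, and object sizes are assumptions made for this example, and no pixels are rendered; a real pipeline would draw or simulate the objects as well.

```python
import random


def make_scene(width, height, num_objects, num_classes, rng):
    """Place random axis-aligned rectangles and return their labels.

    Each label is (class_id, x_center, y_center, box_width, box_height),
    with all four box values normalized to the range [0, 1].
    """
    labels = []
    for _ in range(num_objects):
        # Pick a box size, then a top-left corner that keeps it inside the scene.
        w = rng.randint(width // 10, width // 3)
        h = rng.randint(height // 10, height // 3)
        x0 = rng.randint(0, width - w)
        y0 = rng.randint(0, height - h)
        class_id = rng.randrange(num_classes)
        labels.append((
            class_id,
            (x0 + w / 2) / width,
            (y0 + h / 2) / height,
            w / width,
            h / height,
        ))
    return labels


def to_yolo_lines(labels):
    """Format labels as text lines, one object per line."""
    return [f"{c} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}" for c, cx, cy, bw, bh in labels]


rng = random.Random(0)  # fixed seed so the scene is reproducible
labels = make_scene(640, 480, num_objects=3, num_classes=2, rng=rng)
for line in to_yolo_lines(labels):
    print(line)
```

The labels are exact by construction, since they are computed from the same coordinates that define the objects.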

Real-World Applications

Synthetic data is especially valuable in industries where data privacy, safety, and scalability are paramount.

  • Autonomous Driving Simulations: Testing autonomous vehicles solely in the physical world is risky and geographically limited. Companies utilize photorealistic simulators, such as NVIDIA Omniverse, to train their perception systems. These simulators generate billions of virtual miles, exposing the AI to hazardous weather, erratic pedestrian behavior, and complex urban layouts that are difficult to capture consistently in the real world.
  • Healthcare and Medical Imaging: Patient privacy laws like HIPAA and GDPR strictly regulate the sharing of medical records. Synthetic data enables the creation of realistic medical image analysis datasets—such as X-rays or MRI scans—that retain the markers of pathology without containing any personally identifiable information. This allows researchers to train tumor detection models collaboratively without compromising patient confidentiality.

Generating Synthetic Data for Vision AI

Creating high-quality synthetic data often involves two main approaches: simulation engines and generative AI. Simulation engines, like the Unity Engine, use 3D graphics to render scenes with physics-based lighting and textures. Alternatively, generative models, such as Generative Adversarial Networks (GANs) and diffusion models, learn the distribution of real data to synthesize new, photorealistic examples.
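The idea of learning a distribution from real data and then sampling from it can be shown with a deliberately tiny stand-in. The sketch below fits a one-dimensional Gaussian to a handful of measurements and draws new values from it. This is not a GAN or a diffusion model, and the measurement values and function names are made up for illustration; real generative models learn far richer distributions over whole images.

```python
import random
import statistics


def fit_gaussian(samples):
    """Estimate the mean and sample standard deviation of the real data."""
    return statistics.mean(samples), statistics.stdev(samples)


def sample_synthetic(mean, std, n, rng):
    """Draw n new values from the fitted distribution."""
    return [rng.gauss(mean, std) for _ in range(n)]


# Illustrative, made-up "real" measurements (e.g. object widths in pixels).
real_widths_px = [42.0, 45.5, 39.8, 44.1, 41.3, 43.7]

mean, std = fit_gaussian(real_widths_px)
rng = random.Random(0)  # fixed seed for reproducibility
synthetic_widths_px = sample_synthetic(mean, std, 1000, rng)

print(f"real mean={mean:.2f}, synthetic mean={statistics.mean(synthetic_widths_px):.2f}")
```

The synthetic values are new numbers that never appeared in the original list, yet they follow the same estimated statistics, which is the property the definition of synthetic data describes.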

Once a synthetic dataset is generated, it can be used to train high-performance models. The following Python example loads a pretrained model with the ultralytics package and runs inference on an image; the same calls apply to a model trained on synthetic data.

from ultralytics import YOLO

# Load a pretrained YOLO26 nano model (weights file yolo26n.pt)
model = YOLO("yolo26n.pt")

# Run inference on a source image (this could be a synthetic validation image)
results = model("https://ultralytics.com/images/bus.jpg")

# Display the detection results to verify model performance
results[0].show()

Synthetic Data vs. Data Augmentation

It is helpful to distinguish synthetic data from data augmentation, as both techniques aim to expand datasets but function differently.

  • Data Augmentation involves applying transformations—such as flipping, rotation, cropping, or color adjustment—to existing real-world images to create slight variations. It relies on the original data source.
  • Synthetic Data involves the creation of entirely new data instances from scratch using algorithms or simulations. It does not strictly require an original image for every output, allowing for the generation of scenarios that have never been captured by a camera.
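To make the augmentation side of this distinction concrete, the sketch below applies a horizontal flip to a tiny image stored as a list of pixel rows, and applies the matching transform to its normalized `(class, x_center, y_center, width, height)` box. The helper names and the toy 4x3 image are assumptions for this example. Note that the output is derived entirely from an existing sample, whereas a synthetic sample would be created without one.

```python
def hflip_image(image):
    """Mirror an image (list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def hflip_yolo_box(box):
    """Mirror a normalized box; only the x center changes."""
    class_id, cx, cy, w, h = box
    return (class_id, 1.0 - cx, cy, w, h)


# Toy 4x3 image: the object (value 9) fills the top-right 2x2 corner.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
]
box = (0, 0.75, 1 / 3, 0.5, 2 / 3)  # label for that object

flipped_image = hflip_image(image)
flipped_box = hflip_yolo_box(box)

print(flipped_image[0])  # [9, 9, 0, 0]
print(flipped_box[1])    # 0.25
```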

Modern workflows on the Ultralytics Platform often combine both approaches: using synthetic data to fill gaps in the dataset and applying data augmentation during training to maximize the robustness of models like YOLO26.
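One simple way to decide where synthetic data should fill gaps is to count how many real examples each class has and compute the shortfall against the best-represented class. The sketch below does this for a made-up label list; the class names, counts, and the `synthetic_quota` helper are hypothetical and are not part of any Ultralytics API.

```python
from collections import Counter


def synthetic_quota(class_counts, target=None):
    """Return how many synthetic samples each class needs to reach the target.

    If no target is given, the largest existing class count is used.
    """
    if target is None:
        target = max(class_counts.values())
    return {name: max(0, target - n) for name, n in class_counts.items()}


# Illustrative, made-up labels from a small real dataset.
real_labels = ["car"] * 5 + ["pedestrian"] * 2 + ["cyclist"] * 1

counts = Counter(real_labels)
quota = synthetic_quota(counts)
print(quota)  # {'car': 0, 'pedestrian': 3, 'cyclist': 4}
```

The resulting quota can then guide how many scenes of each kind to generate, while augmentation is still applied to the combined dataset during training.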
