Discover how LoRA fine-tunes large AI models like YOLO efficiently, reducing costs and enabling edge deployment with minimal resources.
LoRA, or Low-Rank Adaptation, is a parameter-efficient fine-tuning technique that originated in the field of large language models and extends naturally to other large AI models, including those used in computer vision. In essence, LoRA allows a pre-trained model to be adapted efficiently to specific tasks or datasets without retraining the entire model, which can be computationally expensive and time-consuming.
LoRA is built on the observation that the weight changes required to adapt a pre-trained model to a new task often lie in a low-dimensional subspace. Instead of updating all the parameters of a large model, LoRA freezes the pre-trained weights and injects a small number of new parameters, organized as pairs of "low-rank" matrices, into layers of the Transformer architecture. During fine-tuning, only these newly added low-rank matrices are trained, significantly reducing the number of trainable parameters. This drastically cuts computational cost and memory requirements while achieving performance comparable to full fine-tuning.
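To make the mechanism concrete, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer. The class name, rank, and scaling values are illustrative assumptions for this example, not part of any particular library.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update: W + B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: A projects down to `rank`, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled low-rank update.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


# Only the A and B matrices are trainable: 2 * rank * d parameters instead of d * d.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12288 LoRA parameters vs. 589824 in the full weight matrix
```

Note that B starts at zero, so the adapted layer initially behaves exactly like the pre-trained one, and only A and B receive gradient updates during training.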
This method is especially beneficial for large language models (LLMs) and for large vision models such as Ultralytics YOLO models, where full fine-tuning may be impractical due to the sheer size of the models. By using LoRA, researchers and practitioners can efficiently customize these powerful models for specific applications with limited resources.
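For LLMs, this workflow is implemented by libraries such as Hugging Face's `peft` package. The sketch below assumes the `transformers` and `peft` packages are installed; the checkpoint name, target modules, and hyperparameters are illustrative choices, not recommendations.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load a pre-trained model; the checkpoint name here is just a placeholder.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Inject low-rank adapters into the attention projection layers only.
config = LoraConfig(
    r=8,                                # rank of the update matrices
    lora_alpha=16,                      # scaling factor for the LoRA update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # query/value projections in this model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Only the injected adapter weights receive gradients; the original model weights remain frozen throughout training.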
The primary relevance of LoRA lies in its efficiency. It enables fine-tuning of massive pre-trained models on consumer-grade GPUs or even edge devices, making advanced AI more accessible. This has broad implications across various applications:
- **Personalized Models:** LoRA allows the creation of personalized AI models tailored to individual user preferences or specific needs. For example, in personalized recommendation systems or customized content generation, LoRA can efficiently adapt a general model to individual user data. This is particularly useful for enhancing user experiences with AI-driven virtual assistants or creating bespoke content in creative fields.
- **Efficient Domain Adaptation:** When a pre-trained model needs to be adapted to a very specific domain, such as medical image analysis or specialized industrial applications, LoRA can fine-tune the model efficiently without extensive retraining. For instance, adapting an Ultralytics YOLO object detection model to a specific manufacturing defect detection task can be expedited using LoRA. This efficiency is crucial for rapid deployment and iteration in specialized fields.
- **Edge Deployment:** Because only the small adapter matrices need to be stored and shipped for each task, and because they can be merged back into the base weights for inference (see the sketch after this list), LoRA adaptation suits edge computing devices with limited computational resources, such as smartphones or embedded systems. This facilitates real-time inference and on-device AI processing, opening possibilities for applications like real-time object detection on resource-constrained hardware or efficient mobile applications.
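As an illustration of that deployment path, the snippet below shows how a trained LoRA update can be folded back into the original weight matrix. The `merge_lora` helper, shapes, and scaling value are hypothetical, used only to demonstrate the idea.

```python
import torch


def merge_lora(base_weight: torch.Tensor, lora_A: torch.Tensor,
               lora_B: torch.Tensor, scaling: float) -> torch.Tensor:
    """Fold a LoRA update back into the frozen weight: W' = W + scaling * (B @ A)."""
    return base_weight + scaling * (lora_B @ lora_A)


# Shapes are illustrative: a 768x768 layer adapted with rank-8 matrices.
W = torch.randn(768, 768)   # frozen pre-trained weight
A = torch.randn(8, 768)     # trained low-rank factor A
B = torch.zeros(768, 8)     # trained low-rank factor B
W_merged = merge_lora(W, A, B, scaling=16 / 8)
print(W_merged.shape)  # torch.Size([768, 768])
```

After merging, the deployed layer runs as an ordinary dense layer with no extra inference latency, while each task only adds the small A and B matrices to storage.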
Traditional fine-tuning updates all the parameters of a pre-trained model. While this can yield excellent results, it is computationally expensive and requires storing a full copy of the model for every fine-tuned variant. LoRA offers a compelling alternative: the pre-trained weights stay frozen and only the small low-rank matrices are trained, so compute, memory usage, and per-task storage all drop sharply while performance remains comparable.
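The storage argument is easy to see with rough numbers. The figures below (a 7B-parameter base model and roughly 4M adapter parameters per task) are hypothetical, chosen only to show the order of magnitude.

```python
# Illustrative storage comparison: one base model adapted to many tasks.
base_params = 7_000_000_000       # e.g. a 7B-parameter model (hypothetical)
lora_params_per_task = 4_000_000  # adapter size for one rank/target choice (hypothetical)
num_tasks = 10

full_finetune_storage = num_tasks * base_params                 # a full copy per task
lora_storage = base_params + num_tasks * lora_params_per_task   # one base + small adapters

print(full_finetune_storage / 1e9)  # 70.0 billion parameters stored
print(lora_storage / 1e9)           # ~7.04 billion parameters stored
```

Ten fully fine-tuned copies would require storing ten times the base model, whereas ten LoRA adapters add less than one percent to the storage of a single base model.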
While full fine-tuning might still be preferred for achieving the absolute highest possible accuracy in some cases, LoRA provides a powerful and practical approach for efficient adaptation, striking a balance between performance and resource utilization, and making advanced AI techniques more broadly accessible. Tools like Ultralytics HUB can further streamline the process of managing and deploying LoRA-adapted models, providing a user-friendly platform for leveraging this efficient fine-tuning technique.