Learn how task vectors enable efficient model merging and behavior steering. Discover how to manipulate Ultralytics YOLO26 weights for zero-shot multi-tasking.
Task vectors represent the specific changes made to a neural network's weights during fine-tuning to achieve a new capability. By subtracting the parameters of a foundational base model from those of a fine-tuned model, researchers can isolate a directional vector in the weight space that encapsulates the learned behavior for that specific task. This approach allows developers to apply simple arithmetic operations on model parameters to steer, modify, or merge model behaviors without requiring additional training compute.
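In its simplest form, this arithmetic can be sketched with plain PyTorch tensors. The single-layer weights below are purely illustrative, not taken from any real model:

```python
import torch

# Hypothetical weights for a base model and its fine-tuned counterpart
base = {"layer.weight": torch.tensor([1.0, 2.0, 3.0])}
tuned = {"layer.weight": torch.tensor([1.5, 2.0, 2.5])}

# Task vector: the per-parameter difference introduced by fine-tuning
tau = {k: tuned[k] - base[k] for k in base}

# Steer the base model toward the fine-tuned behavior at strength lam
lam = 1.0
steered = {k: base[k] + lam * tau[k] for k in base}
# With lam = 1.0, steered exactly recovers the fine-tuned weights
```

Varying the scaling coefficient between 0 and 1 interpolates smoothly between the base and fine-tuned behaviors.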
While transfer learning adapts a model's existing knowledge by sequentially training it on a new dataset, task vectors operate directly on the model's weights after training is complete. Instead of running further gradient updates to learn a new domain, weight-space interpolation with task vectors lets practitioners linearly combine the weight differences from multiple independently trained models. This enables zero-shot model merging, allowing a single model to inherit multiple capabilities simultaneously without additional training overhead.
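The merging idea can be sketched as a linear combination of two task vectors. The tensors and mixing coefficients below are hypothetical, but the pattern is identical for real state dictionaries:

```python
import torch

# Hypothetical base weights and two task vectors from separate fine-tunes
base = {"w": torch.tensor([0.0, 0.0])}
tau_a = {"w": torch.tensor([1.0, 0.0])}  # direction learned for task A
tau_b = {"w": torch.tensor([0.0, 2.0])}  # direction learned for task B

# Linearly combine both task vectors into a single merged model
merged = {k: base[k] + 0.6 * tau_a[k] + 0.4 * tau_b[k] for k in base}
# merged["w"] carries a contribution from both tasks at once
```

Because the combination is purely arithmetic, no forward passes, gradients, or training data are needed to produce the merged weights.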
The ability to manipulate deep learning models algebraically has enabled several impactful applications across modern AI pipelines, including multi-task model merging, targeted forgetting via task negation, and lightweight domain adaptation.
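One such application, task negation, subtracts a task vector rather than adding it, steering the model away from a learned behavior. A minimal sketch with hypothetical tensors:

```python
import torch

# Hypothetical base and fine-tuned weights for a single parameter
base = {"w": torch.tensor([1.0, 1.0])}
tuned = {"w": torch.tensor([2.0, 1.0])}

# Task vector for the behavior to suppress
tau = {k: tuned[k] - base[k] for k in base}

# Negation: move opposite to the fine-tuning direction
edited = {k: base[k] - 1.0 * tau[k] for k in base}
```

The negation coefficient is typically tuned on a validation set so that the target behavior degrades while unrelated capabilities are preserved.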
Creating and applying a task vector requires accessing and manipulating the PyTorch state dictionary. The following example demonstrates how to extract a task vector from a fine-tuned YOLO26 model and apply it back to the base model with a specific scaling factor.
from ultralytics import YOLO

# Load the base model and the state dictionaries for both checkpoints
base_model = YOLO("yolo26n.pt")
base_weights = base_model.model.state_dict()
tuned_weights = YOLO("yolo26n-custom.pt").model.state_dict()

# Calculate the task vector (tuned weights minus base weights)
task_vector = {k: tuned_weights[k] - base_weights[k] for k in base_weights}

# Apply the task vector at a 0.5 scaling factor, skipping integer
# buffers such as BatchNorm's num_batches_tracked
for k, v in base_weights.items():
    if v.is_floating_point():
        base_weights[k] = v + 0.5 * task_vector[k]

# Load the merged weights back into the base model
base_model.model.load_state_dict(base_weights)
As architectures like large language models and massive vision transformers grow in parameter count, retraining them for every minor adjustment becomes economically infeasible. Task vectors provide a mathematically elegant alternative for post-training model optimization. By sharing task vectors instead of entire multi-gigabyte models, the AI community can accelerate open-source collaboration. Once your custom task vectors are refined, the Ultralytics Platform simplifies the subsequent model deployment and monitoring processes, ensuring your optimized weights translate directly into production-ready endpoints.