SwiGLU

Explore SwiGLU, the advanced activation function used in LLMs and Ultralytics YOLO26. Learn how its gated mechanism improves neural network training and efficiency.

SwiGLU (Swish Gated Linear Unit) is an activation function and neural network building block that replaces the traditional Feed-Forward Network (FFN) layer used in deep learning models. It combines the smooth, non-monotonic Swish activation function with a Gated Linear Unit (GLU) mechanism to provide data-dependent feature routing. The input passes through two separate linear projections; one branch is activated with Swish and then multiplied element-wise with the other, purely linear branch. Because the gate is computed from the input itself, the block can scale each hidden feature up or down per example, which lets modern AI architectures model non-linear dependencies that a fixed element-wise activation captures less readily.
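
The definition can be written compactly as a formula. The symbols are notational assumptions rather than names from the text: $W_g$ and $W_v$ stand for the two input projections, $W_o$ for the output projection, $\sigma$ for the logistic sigmoid, and $\odot$ for element-wise multiplication. Bias terms are omitted for brevity, and Swish is taken with $\beta = 1$, where it coincides with SiLU.

```latex
\mathrm{SiLU}(z) = z \cdot \sigma(z) = \frac{z}{1 + e^{-z}},
\qquad
\mathrm{SwiGLU}(x) = \bigl( \mathrm{SiLU}(x W_g) \odot (x W_v) \bigr)\, W_o
```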

How SwiGLU Works

Unlike traditional feed-forward networks that simply map an input to a higher dimension, apply a fixed non-linearity, and project it back down, SwiGLU introduces a multiplicative gating mechanism. The same input is fed through two separately parameterized projections: a "gate" and a "value." The gate branch is activated using the SiLU / Swish function, which is smooth everywhere and keeps small, non-zero outputs and gradients for negative inputs. This activated gate is then multiplied element-wise with the value branch. The resulting input-dependent filtering lets the network control how much of each feature flows onward, which mitigates the "dead neuron" problem associated with ReLU and helps keep the gradient signal stable during model training. This kind of input-dependent weighting is also found in attention mechanisms.
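
The gate-and-value step can be traced by hand on a tiny example. The following is a minimal sketch in plain Python (no PyTorch), assuming hand-picked illustrative weights and no bias terms; the names `swiglu_hidden`, `W_GATE`, and `W_VALUE` are hypothetical and exist only for this illustration.

```python
import math


def silu(z: float) -> float:
    """SiLU / Swish with beta = 1: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def swiglu_hidden(x, w_gate, w_value):
    """Return the gated hidden vector: SiLU(gate projection) * value projection.

    x is a list of floats. w_gate and w_value each hold one weight
    vector per hidden unit (biases are omitted in this sketch).
    """
    hidden = []
    for gate_w, value_w in zip(w_gate, w_value):
        gate = sum(xi * wi for xi, wi in zip(x, gate_w))
        value = sum(xi * wi for xi, wi in zip(x, value_w))
        hidden.append(silu(gate) * value)
    return hidden


# Hypothetical weights: two inputs, two hidden units.
W_GATE = [[1.0, 0.0], [0.0, 1.0]]   # hidden unit i gates on input i
W_VALUE = [[1.0, 0.5], [1.0, 0.5]]  # both units compute the same value

# Both hidden units see the same value (4.0 * 1.0 + -4.0 * 0.5 = 2.0),
# but their gate pre-activations are +4.0 and -4.0 respectively.
print(swiglu_hidden([4.0, -4.0], W_GATE, W_VALUE))
```

The first hidden unit passes its value through strongly because its gate pre-activation is large and positive. The second unit is shrunk toward zero but not clamped to exactly zero, so it still carries a small signal and gradient.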

Differentiating SwiGLU from Other Activation Functions

While standard activation functions like ReLU use a fixed threshold to clip negative values to zero, SwiGLU adjusts activations based on the input data itself. GELU weights each input by the standard Gaussian cumulative distribution function evaluated at that input, which is still a fixed element-wise rule; SwiGLU instead uses parameterized linear layers to learn how to gate information. In essence, SwiGLU is not just an element-wise mathematical calculation; it is a structural component that often replaces the entire hidden layer of the feed-forward sub-block inside a Transformer block. For an extensive comparison of mathematical properties, researchers often refer to activation function guides.
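
To make the element-wise differences concrete, the sketch below evaluates ReLU, GELU, and SiLU at a few sample points using only Python's `math` module. The function names are chosen for this illustration, and GELU is written in its exact form based on the error function rather than the tanh approximation some libraries offer.

```python
import math


def relu(z: float) -> float:
    """ReLU: clip negative inputs to zero."""
    return max(0.0, z)


def gelu(z: float) -> float:
    """Exact GELU: z times the standard Gaussian CDF at z."""
    return 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0)))


def silu(z: float) -> float:
    """SiLU / Swish with beta = 1: z times the logistic sigmoid of z."""
    return z / (1.0 + math.exp(-z))


# Print a small comparison table.
print(f"{'z':>6} {'relu':>8} {'gelu':>8} {'silu':>8}")
for z in (-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0):
    print(f"{z:6.1f} {relu(z):8.4f} {gelu(z):8.4f} {silu(z):8.4f}")
```

For negative inputs ReLU returns exactly zero, while GELU and SiLU return small negative values. None of the three has learnable parameters; in SwiGLU the learned part comes from the gate and value projections wrapped around SiLU.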

Real-World Applications

Because it tends to improve model quality at a comparable parameter budget, SwiGLU has become a common component of modern AI systems, most visibly in the feed-forward layers of large language models.

Implementing SwiGLU in PyTorch

For developers building custom networks or adapting vision models for edge devices using the Ultralytics Platform, implementing SwiGLU in PyTorch is straightforward, and the PyTorch documentation covers the building blocks involved. (Developers in other ecosystems might use TensorFlow implementations instead.) The following concise Python snippet demonstrates a basic SwiGLU module using PyTorch's built-in F.silu function:

import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLU(nn.Module):
    def __init__(self, in_features, hidden_features):
        super().__init__()
        # SwiGLU requires two projections: one for the gate, one for the value
        self.gate_proj = nn.Linear(in_features, hidden_features)
        self.value_proj = nn.Linear(in_features, hidden_features)
        self.out_proj = nn.Linear(hidden_features, in_features)

    def forward(self, x):
        # Element-wise multiplication of the SiLU-activated gate and the linear value
        hidden = F.silu(self.gate_proj(x)) * self.value_proj(x)
        return self.out_proj(hidden)


# Example usage with a dummy input tensor
module = SwiGLU(in_features=512, hidden_features=1365)
output = module(torch.randn(1, 512))
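
The hidden size of 1365 in the usage example is consistent with a common sizing convention, although the text does not state the reasoning. A SwiGLU block has three weight matrices instead of two, so its hidden width is often set to about two-thirds of a conventional FFN's width to keep the parameter count comparable. The sketch below works through that arithmetic, assuming a 4x expansion for the baseline FFN and counting weights only; the helper names are hypothetical, and the module above additionally carries bias terms because nn.Linear enables them by default.

```python
def ffn_weights(d_model: int, d_hidden: int) -> int:
    """Weights in a two-matrix feed-forward block (biases ignored)."""
    return 2 * d_model * d_hidden


def swiglu_weights(d_model: int, d_hidden: int) -> int:
    """Weights in a three-matrix SwiGLU block: gate, value, output."""
    return 3 * d_model * d_hidden


d_model = 512
ffn_hidden = 4 * d_model               # 2048, assumed 4x baseline expansion
swiglu_hidden = (2 * ffn_hidden) // 3  # 1365, two-thirds of 2048 rounded down

print(ffn_weights(d_model, ffn_hidden))        # 2097152
print(swiglu_weights(d_model, swiglu_hidden))  # 2096640
```

Under these assumptions the two blocks differ by only a few hundred weights. In practice the hidden size is often rounded up to a hardware-friendly multiple instead of being left at an odd value like 1365.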

This structural approach to activation blocks helps modern neural architectures extract richer representations from complex training data, whether applied to Natural Language Processing (NLP) or real-time spatial analysis. For a deeper understanding of building and accelerating efficient models, developers often refer to the original research on GLU variants published on arXiv, Meta's open-source repositories, and PyTorch's optimization documentation on maximizing hardware throughput.
