
SiLU (Sigmoid Linear Unit)

Discover how the SiLU (Swish) activation function improves deep learning performance in AI tasks such as object detection and natural language processing (NLP).

The Sigmoid Linear Unit, commonly referred to as SiLU, is a highly effective activation function used in modern deep learning architectures to introduce non-linearity into neural networks. By determining how neurons process and pass information through the layers of a model, SiLU enables systems to learn complex patterns in data, functioning as a smoother and more sophisticated alternative to traditional step functions. Often associated with the term "Swish" from initial research on automated activation search, SiLU has become a standard in high-performance computer vision models, including the state-of-the-art YOLO26 architecture.

How the SiLU Function Works

At its core, the SiLU function operates by multiplying an input value by its own Sigmoid transformation. Unlike simple threshold functions that abruptly switch a neuron between "on" and "off," SiLU provides a smooth curve that allows for more nuanced signal processing. This mathematical structure creates distinct characteristics that benefit the model training process:

  • Smoothness: The curve is continuous and differentiable at every point. This property aids optimization algorithms such as gradient descent by providing a consistent gradient for adjusting model weights, which often speeds up convergence during training.
  • Non-Monotonicity: Unlike monotonic activations such as ReLU, SiLU's output can decrease even as the input increases in certain negative ranges. This allows the network to capture complex features and retain small negative values that would otherwise be discarded, helping to mitigate the vanishing gradient problem in deep networks.
  • Self-Gating: SiLU acts as its own gate, modulating how much of the input passes through based on the input's own magnitude. This mimics the gating mechanisms found in Long Short-Term Memory (LSTM) networks, but in a computationally efficient form well suited to convolutional neural networks (CNNs). The short sketch after this list writes this gating out in code.
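
For reference, SiLU is defined as SiLU(x) = x * sigmoid(x): the input gates itself through its own Sigmoid. The snippet below is a minimal sketch, assuming a PyTorch environment like the one used in the implementation example further down; it writes the definition out by hand, checks it against the built-in torch.nn.functional.silu, and prints the non-monotonic dip in the negative range.

import torch
import torch.nn.functional as F

def silu_manual(x: torch.Tensor) -> torch.Tensor:
    """SiLU written out directly: the input is gated by its own Sigmoid."""
    return x * torch.sigmoid(x)

# Inputs spanning the negative range where the non-monotonic dip occurs
x = torch.tensor([-4.0, -2.0, -1.0, 0.0, 1.0, 4.0])

manual = silu_manual(x)
builtin = F.silu(x)

# The hand-written definition matches PyTorch's built-in implementation
print(torch.allclose(manual, builtin))  # True

# Non-monotonic: the output falls from -0.0719 to -0.2384 as the input rises from -4 to -2
print(manual)  # tensor([-0.0719, -0.2384, -0.2689,  0.0000,  0.7311,  3.9281])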

Real-World Applications

SiLU is integral to many cutting-edge AI solutions where accuracy and efficiency are paramount.

  • Autonomous Vehicle Perception: In the safety-critical domain of autonomous vehicles, perception systems must identify pedestrians, traffic signs, and obstacles instantly. Models utilizing SiLU in their backbones can maintain high inference speeds while accurately performing object detection in varying lighting conditions, ensuring the vehicle reacts safely to its environment.
  • Medical Imaging Diagnostics: In medical image analysis, neural networks need to discern subtle texture differences in MRI or CT scans. The gradient-preserving nature of SiLU helps these networks learn the fine-grained details necessary for early tumor detection, significantly improving the reliability of automated diagnostic tools used by radiologists.

Comparison with Related Concepts

To fully understand SiLU, it is helpful to distinguish it from the other activation functions covered by Ultralytics.

  • SiLU vs. ReLU (Rectified Linear Unit): ReLU is famous for its speed and simplicity, outputting zero for all negative inputs. While efficient, this can lead to "dead neurons" that stop learning. SiLU avoids this by allowing a small, non-linear gradient to flow through negative values, which often results in better accuracy for deep architectures trained on the Ultralytics Platform.
  • SiLU vs. GELU (Gaussian Error Linear Unit): These two functions are visually and functionally similar. GELU is the standard for Transformer models like BERT and GPT, while SiLU is frequently preferred for computer vision (CV) tasks and CNN-based object detectors.
  • SiLU vs. Sigmoid: Although SiLU uses the Sigmoid function internally, the two serve different roles. Sigmoid typically appears in the final output layer to represent probabilities in binary classification, whereas SiLU is used in hidden layers to aid feature extraction. A quick numerical comparison follows this list.
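
To put rough numbers on these comparisons, the short sketch below (again assuming PyTorch) evaluates SiLU, GELU, and Sigmoid on the same inputs: SiLU and GELU trace similar smooth curves, while Sigmoid squashes everything into the (0, 1) range expected of a probability output.

import torch
import torch.nn.functional as F

x = torch.tensor([-3.0, -1.0, 0.0, 1.0, 3.0])

# SiLU and GELU behave similarly: smooth, near-linear for large positive inputs
print(F.silu(x))  # tensor([-0.1423, -0.2689,  0.0000,  0.7311,  2.8577])
print(F.gelu(x))  # tensor([-0.0040, -0.1587,  0.0000,  0.8413,  2.9960])

# Sigmoid maps every input into (0, 1), which is why it suits probability
# outputs in a final layer rather than feature extraction in hidden layers
print(torch.sigmoid(x))  # tensor([0.0474, 0.2689, 0.5000, 0.7311, 0.9526])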

Implementation Example

You can visualize how different activation functions transform data using the PyTorch library. The following code snippet demonstrates the difference between ReLU (which zeroes out negatives) and SiLU (which allows smooth negative flow).

import torch
import torch.nn as nn

# Input data: negative, zero, and positive values
data = torch.tensor([-2.0, 0.0, 2.0])

# Apply ReLU: Negatives become 0, positives stay unchanged
relu_out = nn.ReLU()(data)
print(f"ReLU: {relu_out}")
# Output: tensor([0., 0., 2.])

# Apply SiLU: Smooth curve, small negative value retained
silu_out = nn.SiLU()(data)
print(f"SiLU: {silu_out}")
# Output: tensor([-0.2384,  0.0000,  1.7616])

By retaining information in negative values and providing a smooth gradient, SiLU plays a pivotal role in the success of modern neural networks. Its adoption in architectures like YOLO26 underscores its importance in achieving state-of-the-art performance across diverse computer vision tasks.
