Discover the power of Recurrent Neural Networks (RNNs) for sequential data, from NLP to time series analysis. Learn the key concepts and applications today.
A Recurrent Neural Network (RNN) is a type of artificial neural network specifically designed to recognize patterns in sequences of data, such as text, genomes, handwriting, or the spoken word. Unlike traditional feedforward networks that assume all inputs (and outputs) are independent of each other, RNNs retain a form of memory. This internal memory allows them to process inputs with an understanding of previous information, making them uniquely suited for tasks where context and temporal order are critical. This architecture mimics how humans process information—reading a sentence, for instance, requires remembering previous words to understand the current one.
The core innovation of an RNN is its loop structure. In a standard feedforward network, information flows in only one direction: from input to output. In contrast, an RNN has a feedback loop that allows information to persist. As the network processes a sequence, it maintains a "hidden state"—a vector that acts as the network's short-term memory. At each time step, the RNN takes the current input and the previous hidden state to produce an output and update the hidden state for the next step.
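To make this update concrete, here is a minimal sketch of a single recurrent step written directly in PyTorch. The weight names (W_xh, W_hh, b_h) are illustrative, but the update rule, h_t = tanh(x_t · W_xh + h_{t-1} · W_hh + b_h), is the standard formulation that nn.RNN implements with its default tanh activation.
import torch
# Illustrative dimensions: 10 input features, 20 hidden units
W_xh = torch.randn(10, 20)  # input-to-hidden weights
W_hh = torch.randn(20, 20)  # hidden-to-hidden (recurrent) weights
b_h = torch.zeros(20)       # hidden bias
def rnn_step(x_t, h_prev):
    # Combine the current input with the previous hidden state
    return torch.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)
# Process a 5-step sequence, carrying the hidden state forward
h = torch.zeros(20)
for x_t in torch.randn(5, 10):
    h = rnn_step(x_t, h)  # h now summarizes everything seen so far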
This sequential processing capability is essential for Natural Language Processing (NLP) and time series analysis. However, standard RNNs often struggle with long sequences due to the vanishing gradient problem: gradients shrink as they are propagated back through many time steps, so the network effectively forgets earlier inputs as the sequence grows. This limitation led to the development of more advanced variants like Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs), which introduce gating mechanisms to better regulate information flow over longer periods.
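Both variants are available in PyTorch as near drop-in replacements for a plain RNN layer. The minimal sketch below shows only how their interfaces differ; the gating logic is handled internally.
import torch
import torch.nn as nn
lstm = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)
gru = nn.GRU(input_size=10, hidden_size=20, batch_first=True)
x = torch.randn(1, 5, 10)  # (batch, seq, feature)
lstm_out, (h_n, c_n) = lstm(x)  # an LSTM also carries a separate cell state c_n
gru_out, h_n = gru(x)           # a GRU keeps a single hidden state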
Recurrent Neural Networks have transformed many industries by enabling machines to understand sequential data. Two prominent examples are machine translation, where producing a fluent sentence requires tracking word order and context, and time series forecasting, where past measurements inform predictions about future values.
It is helpful to distinguish RNNs from other major architectures. A Convolutional Neural Network (CNN) is primarily designed for spatial data, such as images, processing pixel grids to identify shapes and objects. For instance, Ultralytics YOLO26 uses a powerful CNN backbone for real-time object detection. While a CNN excels at "what is in this image?", an RNN excels at "what happens next in this video?".
More recently, the Transformer architecture has largely superseded RNNs for complex NLP tasks. Transformers use an attention mechanism to process entire sequences in parallel rather than sequentially. However, RNNs remain efficient for low-latency, resource-constrained streaming applications and are often simpler to deploy on edge devices for lightweight time-series forecasting.
While modern computer vision tasks often rely on CNNs, hybrid models may use RNNs to analyze temporal features extracted from video frames. Below is a simple, runnable example using PyTorch to create a basic RNN layer that processes a sequence of data.
import torch
import torch.nn as nn
# Define a basic RNN layer
# input_size: number of features in the input (e.g., 10 features per time step)
# hidden_size: number of features in the hidden state (memory)
# batch_first: input shape will be (batch, seq, feature)
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=1, batch_first=True)
# Create a dummy input: Batch size 1, Sequence length 5, Features 10
input_seq = torch.randn(1, 5, 10)
# Forward pass through the RNN
# output contains the hidden state for every time step
# hn contains the final hidden state; its shape is (num_layers, batch, hidden_size) even when batch_first=True
output, hn = rnn(input_seq)
print(f"Output shape: {output.shape}") # Expected: torch.Size([1, 5, 20])
print(f"Final hidden state shape: {hn.shape}") # Expected: torch.Size([1, 1, 20])
Despite their utility, RNNs face computational hurdles. Sequential processing inhibits parallelization, making training slower compared to Transformers on GPUs. Furthermore, managing the exploding gradient problem requires careful hyperparameter tuning and techniques like gradient clipping.
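The sketch below illustrates gradient clipping in PyTorch; the tiny model, random data, and loss function are placeholders for a real training loop.
import torch
import torch.nn as nn
rnn = nn.RNN(input_size=10, hidden_size=20, batch_first=True)
optimizer = torch.optim.SGD(rnn.parameters(), lr=0.01)
x = torch.randn(1, 5, 10)       # dummy input sequence
target = torch.randn(1, 5, 20)  # dummy target for each time step
output, _ = rnn(x)
loss = nn.functional.mse_loss(output, target)
optimizer.zero_grad()
loss.backward()
# Rescale gradients so their global norm does not exceed 1.0,
# preventing a single update from exploding
torch.nn.utils.clip_grad_norm_(rnn.parameters(), max_norm=1.0)
optimizer.step()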
Nevertheless, RNNs remain a fundamental concept in Deep Learning (DL). They are integral to understanding the evolution of Artificial Intelligence (AI) and are still widely used in simple anomaly detection systems for IoT sensors. For developers building complex pipelines—such as combining vision models with sequence predictors—managing datasets and training workflows is crucial. The Ultralytics Platform simplifies this process, offering tools to manage data and deploy models efficiently across various environments.