CutMix

Learn how the CutMix data augmentation technique prevents overfitting, and how to apply it easily when training robust Ultralytics YOLO26 models.

CutMix is an advanced data augmentation technique used to train robust computer vision models by cutting a rectangular patch from one image and pasting it onto a target image. Unlike simpler augmentations that adjust brightness or rotation, CutMix alters the fundamental composition of a training sample. When the pixels are swapped, the corresponding ground-truth labels are also mixed proportionally to the area of the patch. This helps artificial neural networks learn to identify objects from partial views, forcing the model to rely on multiple features rather than focusing solely on the most discriminative parts of an object. First introduced in a 2019 academic paper, it has become a standard operation in deep learning frameworks to prevent overfitting and improve generalization across large datasets.
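
The proportional label mixing described above can be written compactly. The notation is introduced here for illustration only and follows the usual formulation of the technique: x_A and y_A are the target image and its label, x_B and y_B are the image and label the patch comes from, M is a binary H-by-W mask that is 0 inside the pasted rectangle and 1 elsewhere, and lambda is the fraction of the target image that survives.

```latex
\tilde{x} = M \odot x_A + (1 - M) \odot x_B, \qquad
\tilde{y} = \lambda\, y_A + (1 - \lambda)\, y_B, \qquad
\lambda = \frac{1}{HW} \sum_{i,j} M_{ij}
```

A patch covering 30% of the image therefore gives lambda = 0.7, so the mixed label keeps 70% of the weight on the target class.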

How It Works

During model training, the algorithm randomly selects a center coordinate and a box size to extract a region from a secondary image. This patch is then overlaid directly onto a primary image within the active batch. If the primary image contained a dog and the secondary contained a cat, the final image would feature a cat patch replacing a portion of the dog. The classification labels are updated using linear interpolation based on the exact patch area. For example, a cat patch covering 30% of the image yields a label of 0.7 dog and 0.3 cat. In object detection tasks, bounding boxes that retain at least a certain percentage (often 10%) of their original area within the pasted region are preserved. This technique is natively supported as a cutmix training hyperparameter in Ultralytics YOLO, allowing practitioners to easily define the probability of this transformation.
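
The classification case described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Ultralytics implementation: the helper names `rand_bbox` and `cutmix` are hypothetical, and sampling the box size from a Beta distribution is an assumption borrowed from the original CutMix formulation. The label weight is recomputed from the clipped box so that it matches the exact pasted area.

```python
import numpy as np


def rand_bbox(height, width, lam, rng):
    """Sample a box whose area is roughly (1 - lam) of the image (hypothetical helper)."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    # Random center; the box is clipped to the image borders.
    cy, cx = rng.integers(height), rng.integers(width)
    y1, y2 = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, height)
    x1, x2 = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, width)
    return int(y1), int(y2), int(x1), int(x2)


def cutmix(primary, secondary, label_a, label_b, rng, alpha=1.0):
    """Paste a patch of `secondary` onto `primary` and mix the one-hot labels."""
    height, width = primary.shape[:2]
    lam = rng.beta(alpha, alpha)  # assumption: Beta(alpha, alpha) sampling
    y1, y2, x1, x2 = rand_bbox(height, width, lam, rng)
    mixed = primary.copy()
    mixed[y1:y2, x1:x2] = secondary[y1:y2, x1:x2]
    # Recompute the weight from the clipped box so labels match the pasted area.
    patch_frac = (y2 - y1) * (x2 - x1) / (height * width)
    mixed_label = (1.0 - patch_frac) * label_a + patch_frac * label_b
    return mixed, mixed_label, (y1, y2, x1, x2)


rng = np.random.default_rng(0)
dog = np.zeros((64, 64, 3), dtype=np.uint8)      # stand-in "dog" image
cat = np.full((64, 64, 3), 255, dtype=np.uint8)  # stand-in "cat" image
dog_label = np.array([1.0, 0.0])
cat_label = np.array([0.0, 1.0])

mixed, label, box = cutmix(dog, cat, dog_label, cat_label, rng)
print(label)  # [dog weight, cat weight]; the cat weight is the pasted area fraction
```

Because the stand-in images are constant, the fraction of white pixels in `mixed` equals the cat weight in `label`, which makes the area-to-label relationship easy to check by hand.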

Differences from MixUp and Cutout

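CutMix is closely related to two other major data augmentation techniques, but it addresses specific limitations of each.

MixUp Augmentation: MixUp blends two images globally by computing a weighted average of their pixel values. Although effective, it often produces unnatural, semi-transparent, ghost-like images that can confuse the model by disrupting local spatial correlations. In contrast, CutMix preserves the original pixel intensities inside the cut region, an idea that researchers have refined further in approaches such as Attentive CutMix.
Cutout Augmentation: Cutout drops information by masking a random rectangular region with black pixels or the dataset mean. This encourages the model to look at the whole object, but it wastes part of every training input. CutMix fills that lost space with informative patches from other images, improving overall training efficiency.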
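
The contrast between the three augmentations is easiest to see side by side. The sketch below applies each one to the same pair of random stand-in images; the fixed box coordinates and the MixUp weight of 0.7 are arbitrary values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
image_a = rng.random((32, 32, 3))  # stand-in for the primary image
image_b = rng.random((32, 32, 3))  # stand-in for the secondary image
y1, y2, x1, x2 = 8, 24, 8, 24      # fixed box, chosen only for illustration
lam = 0.7                          # MixUp blending weight (assumed value)

# MixUp: every pixel becomes a weighted average of both images.
mixup = lam * image_a + (1.0 - lam) * image_b

# Cutout: the box is filled with a constant, so its information is discarded.
cutout = image_a.copy()
cutout[y1:y2, x1:x2] = 0.0

# CutMix: the box is filled with the other image's original pixels.
cutmix = image_a.copy()
cutmix[y1:y2, x1:x2] = image_b[y1:y2, x1:x2]

# Inside the box CutMix keeps image_b's intensities; outside it keeps image_a's.
print(np.array_equal(cutmix[y1:y2, x1:x2], image_b[y1:y2, x1:x2]))
print(np.array_equal(cutmix[:y1], image_a[:y1]))
```

Only MixUp changes pixel intensities everywhere. Cutout and CutMix both leave the area outside the box untouched, and differ only in what they put inside it.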

Real-World Applications

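By training models to recognize heavily occluded objects, CutMix can substantially improve machine learning performance across a range of industries.

Automotive AI and Autonomous Driving: In self-driving vehicles, the technique trains systems to identify pedestrians and vehicles even when they are partially hidden by road signs or other objects, improving safety in crowded environments.
Medical Diagnosis and Organ Segmentation: In healthcare, the method is widely used for organ and tumor segmentation, helping models recognize complex tissue boundaries even where anatomical structures overlap.
Remote Sensing with Satellite Imagery: The strategy helps models handle dense, overlapping classes such as buildings and vegetation in aerial photographs. Advanced variants are being actively researched to improve long-tail recognition on extremely imbalanced data.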

Implementation in Practice

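Integrating this augmentation into an AI pipeline is straightforward. Most high-level libraries support it natively, including PyTorch Transforms and Keras Preprocessing Layers.

When training a model such as YOLO26, enabling this augmentation takes a single parameter change. Both patch creation and the more involved bounding-box clipping logic are then handled automatically.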

from ultralytics import YOLO

# Initialize the recommended Ultralytics YOLO26 model
model = YOLO("yolo26n.pt")

# Train the model with CutMix enabled at a 50% probability
results = model.train(data="coco8.yaml", epochs=50, imgsz=640, cutmix=0.5)
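
The bounding-box handling that the library performs internally can be pictured with a small pure-Python sketch. This is an illustration under stated assumptions, not Ultralytics' code: the function `clip_boxes_to_patch` is hypothetical, boxes are assumed to be (x1, y1, x2, y2) pixel tuples from the secondary image, and the 10% retention threshold is the commonly used value mentioned earlier.

```python
def clip_boxes_to_patch(boxes, patch, min_area_ratio=0.1):
    """Clip (x1, y1, x2, y2) boxes to the patch and drop boxes that keep too little area."""
    px1, py1, px2, py2 = patch
    kept = []
    for x1, y1, x2, y2 in boxes:
        original_area = (x2 - x1) * (y2 - y1)
        # Intersection of the box with the pasted region.
        cx1, cy1 = max(x1, px1), max(y1, py1)
        cx2, cy2 = min(x2, px2), min(y2, py2)
        clipped_area = max(0, cx2 - cx1) * max(0, cy2 - cy1)
        if original_area > 0 and clipped_area / original_area >= min_area_ratio:
            kept.append((cx1, cy1, cx2, cy2))
    return kept


patch = (100, 100, 300, 300)  # pasted region in pixel coordinates
secondary_boxes = [
    (120, 120, 200, 200),  # fully inside the patch: kept unchanged
    (250, 250, 400, 400),  # 50x50 of 150x150 remains (about 11%): kept, clipped
    (290, 290, 500, 500),  # 10x10 of 210x210 remains (under 10%): dropped
    (0, 0, 50, 50),        # no overlap with the patch: dropped
]
print(clip_boxes_to_patch(secondary_boxes, patch))
```

Filtering on the retained area ratio avoids training on labels whose object is almost entirely outside the pasted patch.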

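For teams managing large-scale vision workflows, Ultralytics Platform simplifies this by letting you adjust data augmentation best practices directly from a cloud interface, streamlining the process from annotation to model deployment.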
