CutMix

Learn how the CutMix data augmentation technique prevents overfitting, and see how to apply it easily when training Ultralytics YOLO26 models.

CutMix is an advanced data augmentation technique used to train robust computer vision models by cutting a rectangular patch from one image and pasting it onto a target image. Unlike simpler augmentations that adjust brightness or rotation, CutMix alters the fundamental composition of a training sample. When the pixels are swapped, the corresponding ground-truth labels are also mixed proportionally to the area of the patch. This helps artificial neural networks learn to identify objects from partial views, forcing the model to rely on multiple features rather than focusing solely on the most discriminative parts of an object. First introduced in a 2019 academic paper, it has become a standard operation in deep learning frameworks to prevent overfitting and improve generalization across large datasets.

How the Technique Works

During model training, the algorithm randomly selects a center coordinate and a box size to extract a region from a secondary image. This patch is then overlaid directly onto a primary image within the active batch. If the primary image contained a dog and the secondary contained a cat, the final image would show a cat patch replacing a portion of the dog. The classification labels are updated by linear interpolation based on the exact patch area. For example, a patch covering 30% of the image yields a label of 0.7 dog and 0.3 cat. In object detection tasks, bounding boxes that retain at least a certain percentage (often 10%) of their original area within the pasted region are preserved. The technique is natively supported as the cutmix training hyperparameter in Ultralytics YOLO, which sets the probability of applying the transformation.
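The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Ultralytics implementation. It assumes the convention from the original CutMix paper (the mixing ratio is drawn from a Beta distribution and the patch is pasted at the same location it was cut from), and the function names rand_box and cutmix are hypothetical.

```python
import numpy as np


def rand_box(height, width, lam, rng):
    """Sample a box covering roughly (1 - lam) of the image area."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    # Random center; the box is clipped at the image borders.
    cy, cx = int(rng.integers(height)), int(rng.integers(width))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return y1, y2, x1, x2


def cutmix(primary, secondary, label_a, label_b, alpha=1.0, rng=None):
    """Paste a patch of `secondary` onto `primary` and mix the labels.

    Assumes both images share the same shape and the labels are
    one-hot (or soft) vectors of equal length.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = primary.shape[:2]
    lam = rng.beta(alpha, alpha)  # assumed sampling scheme
    y1, y2, x1, x2 = rand_box(height, width, lam, rng)

    mixed = primary.copy()
    mixed[y1:y2, x1:x2] = secondary[y1:y2, x1:x2]

    # Use the area actually pasted, since clipping can shrink the box.
    patch_frac = (y2 - y1) * (x2 - x1) / (height * width)
    mixed_label = (1.0 - patch_frac) * label_a + patch_frac * label_b
    return mixed, mixed_label, patch_frac
```

If the pasted patch ends up covering 30% of the image, the returned label is 0.7 for the primary class and 0.3 for the secondary class, which matches the dog and cat example.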

Differences from MixUp and Cutout

CutMix is closely related to two other major data augmentation techniques, and it addresses the specific limitations of each:

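MixUp Augmentation: MixUp blends two images globally by computing a weighted average of their pixel values. Although effective, it often produces unnatural, translucent, ghost-like images that can confuse the model by disrupting local spatial correlation. CutMix, by contrast, preserves the original pixel intensities inside the pasted region, a property that researchers have optimized further in approaches such as Attentive CutMix.
Cutout Augmentation: Cutout removes information by masking a random rectangular region with black pixels or the dataset mean. This encourages the model to look at the whole object, but it wastes valuable pixels in the training tensors. CutMix fills the missing region with an informative patch from another image instead, which improves overall training efficiency.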
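For contrast, the two related techniques can be sketched as follows. This is an illustrative NumPy sketch under simple assumptions (float images of identical shape, a caller-supplied mixing weight and box); the function names mixup and cutout are hypothetical and do not refer to any library API.

```python
import numpy as np


def mixup(image_a, image_b, lam):
    """Blend every pixel of two images with weight `lam` (global mix)."""
    return lam * image_a + (1.0 - lam) * image_b


def cutout(image, box, fill=0.0):
    """Mask a rectangular region with a constant value (information removed)."""
    y1, y2, x1, x2 = box
    masked = image.copy()
    masked[y1:y2, x1:x2] = fill
    return masked
```

MixUp changes every pixel, which is where the ghost-like overlays come from, while Cutout leaves the rest of the image intact but fills the box with a constant. CutMix keeps the locality of Cutout and fills the box with real pixels from a second image.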

Real-World Applications

By training models to recognize heavily occluded objects, CutMix significantly improves machine learning performance across a range of industries.

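Automotive AI and Autonomous Driving: In self-driving vehicles, this technique trains the system to identify pedestrians or vehicles even when they are partially hidden by traffic signs or similar obstructions, improving safety in crowded environments.
Medical Diagnostics and Organ Segmentation: In healthcare, the method is widely used for organ and tumor segmentation, enabling models to recognize complex tissue boundaries even when anatomical structures overlap.
Remote Sensing for Satellite Imagery: The strategy preserves dense, overlapping classes such as buildings and vegetation in aerial views. Advanced variants are being actively researched to improve long-tailed recognition on highly imbalanced data.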

Practical Implementation

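Integrating this augmentation into an AI pipeline is straightforward. Most high-level libraries support it natively, for example PyTorch Transforms and Keras Preprocessing Layers.

When training a model such as YOLO26, configuring this augmentation requires adjusting only a single parameter. The framework then handles the image patching and the complex bounding-box clipping logic automatically.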

from ultralytics import YOLO

# Initialize the recommended Ultralytics YOLO26 model
model = YOLO("yolo26n.pt")

# Train the model with CutMix enabled at a 50% probability
results = model.train(data="coco8.yaml", epochs=50, imgsz=640, cutmix=0.5)
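To give a sense of the bounding-box clipping that the framework handles automatically, here is a rough pure-Python sketch of the retention rule described earlier. It is not the Ultralytics implementation. It assumes boxes in (x1, y1, x2, y2) pixel coordinates and a 10% area threshold, and the function name clip_boxes_to_patch is hypothetical.

```python
def clip_boxes_to_patch(boxes, patch, min_area_ratio=0.1):
    """Clip boxes to the pasted patch and drop those that keep too little area.

    boxes: iterable of (x1, y1, x2, y2) from the secondary image.
    patch: (x1, y1, x2, y2) of the pasted region.
    min_area_ratio: assumed threshold (10% of the original box area).
    """
    px1, py1, px2, py2 = patch
    kept = []
    for x1, y1, x2, y2 in boxes:
        # Intersection of the box with the patch.
        cx1, cy1 = max(x1, px1), max(y1, py1)
        cx2, cy2 = min(x2, px2), min(y2, py2)
        clipped_area = max(cx2 - cx1, 0) * max(cy2 - cy1, 0)
        original_area = (x2 - x1) * (y2 - y1)
        if original_area > 0 and clipped_area / original_area >= min_area_ratio:
            kept.append((cx1, cy1, cx2, cy2))
    return kept
```

A box that lies mostly outside the patch is discarded, because the visible sliver would make a misleading label, while a box that keeps a quarter of its area is clipped to the patch border and retained.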

For teams managing large-scale vision workflows, the Ultralytics Platform lets you tune data augmentation best practices directly from a cloud interface, streamlining the path from annotation to model deployment.