QUICK REVIEW

[Paper Review] CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

Sangdoo Yun, Dongyoon Han|arXiv (Cornell University)|May 13, 2019

Advanced Neural Network Applications | References: 51 | Cited by: 613

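One-Sentence Summary

CutMix replaces a patch of one training image with a patch from another image and mixes the labels in proportion to the patch area, improving classification and localization performance at almost no extra cost.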

ABSTRACT

Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend on less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it leads to information loss and inefficiency during training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms the state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix improves the model robustness against input corruptions and its out-of-distribution detection performances. Source code and pretrained models are available at https://github.com/clovaai/CutMix-PyTorch .
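
The cut-and-paste operation and the area-proportional label mixing described in the abstract can be written compactly. The symbols used here do not appear in this summary and are introduced as assumed notation: x_A, x_B and y_A, y_B for the two images and their labels, M for the binary mask, W and H for the image width and height, and r_w, r_h for the box width and height.

```latex
\begin{aligned}
\tilde{x} &= \mathbf{M} \odot x_A + (\mathbf{1} - \mathbf{M}) \odot x_B, \\
\tilde{y} &= \lambda\, y_A + (1 - \lambda)\, y_B, \\
r_w &= W\sqrt{1 - \lambda}, \quad r_h = H\sqrt{1 - \lambda}
\;\Longrightarrow\; \frac{r_w\, r_h}{W H} = 1 - \lambda .
\end{aligned}
```

Here M is 0 inside the box and 1 elsewhere, so the patch from x_B fills the box and its label receives weight 1 - lambda, the fraction of the image area it occupies.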

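Motivation and Objectives

Improve the generalization and localization ability of convolutional neural networks through regularization based on regional dropout.
Develop a data augmentation method that preserves informative pixels while still training the model on partial views of objects.
Demonstrate CutMix's effectiveness on image classification, weakly supervised localization, and transfer learning tasks.
Show CutMix's robustness and uncertainty benefits relative to other augmentation methods.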

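Proposed Method

Generate a new training sample by combining two training images with a binary mask, and mix their labels accordingly.
Sample the mixing ratio lambda from a Beta(alpha, alpha) distribution (alpha = 1 in the experiments, which makes lambda uniform on (0, 1)).
Crop a rectangular bounding-box region from one image and paste it into the other; the region's area is proportional to 1 - lambda.
Train on the CutMixed images and mixed labels with the original loss function.
Optionally apply CutMix at the input image level or at higher feature levels (ablation study).
Keep training cost minimal: negligible computational overhead beyond that of standard data augmentation.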
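
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `rand_bbox` and `cutmix_batch`, the (N, H, W, C) batch layout, and the pairing of images through a random permutation of the batch are all choices made here. The box is clipped at the image border, and lambda is then recomputed from the area actually pasted so that the label weights match the pixels.

```python
import numpy as np


def rand_bbox(height, width, lam, rng):
    """Sample a box covering roughly (1 - lam) of the image (hypothetical helper)."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h = int(height * cut_ratio)
    cut_w = int(width * cut_ratio)
    # The box centre is uniform over the image; the box is clipped at the borders.
    cy = int(rng.integers(0, height))
    cx = int(rng.integers(0, width))
    y1 = max(cy - cut_h // 2, 0)
    y2 = min(cy + cut_h // 2, height)
    x1 = max(cx - cut_w // 2, 0)
    x2 = min(cx + cut_w // 2, width)
    return y1, y2, x1, x2


def cutmix_batch(images, labels, num_classes, alpha=1.0, rng=None):
    """Apply CutMix to images of shape (N, H, W, C) with integer labels of shape (N,)."""
    rng = np.random.default_rng() if rng is None else rng
    n, height, width = images.shape[:3]

    lam = rng.beta(alpha, alpha)   # mixing ratio lambda
    perm = rng.permutation(n)      # assumption: partner images come from the same batch
    y1, y2, x1, x2 = rand_bbox(height, width, lam, rng)

    # Paste the partner image's patch into the same location of each image.
    mixed = images.copy()
    mixed[:, y1:y2, x1:x2] = images[perm, y1:y2, x1:x2]

    # Clipping can shrink the box, so recompute lambda from the pasted area.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (height * width)

    # Mix one-hot labels with the same area-based weights.
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]
    mixed_labels = lam * one_hot + (1.0 - lam) * one_hot[perm]
    return mixed, mixed_labels, lam
```

With a cross-entropy loss, training on `mixed_labels` is equivalent to weighting the loss against the original label by `lam` and the loss against the partner's label by `1 - lam`, which is one way to read "the original loss function" with mixed labels.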

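Experimental Results

Research Questions

RQ1: Does CutMix improve classification accuracy over Mixup and Cutout on large-scale datasets such as ImageNet?
RQ2: Does CutMix strengthen weakly supervised localization by pushing the model to attend to a wider extent of the object?
RQ3: Do models pretrained with CutMix transfer better to downstream tasks such as object detection and image captioning?
RQ4: Does CutMix improve robustness and calibration/uncertainty handling under adversarial attacks or out-of-distribution conditions?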

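Key Findings

ImageNet: CutMix raises top-1 accuracy over the baseline by +2.28 percentage points (ResNet-50) and +1.70 percentage points (ResNet-101).
CIFAR-100: with PyramidNet-200, CutMix reaches a top-1 error rate of 14.47% (baseline: 16.45%), a notable improvement over Mixup and Cutout.
Weakly supervised localization: CutMix improves WSOL accuracy by +5.4 percentage points on CUB200-2011 and by +0.9 on the ImageNet localization task.
Transfer learning: pretraining with CutMix improves downstream tasks; CutMix-pretrained backbones give appreciable gains on Pascal VOC object detection (SSD/Faster R-CNN) and MS-COCO image captioning.
Robustness/uncertainty: CutMix markedly improves robustness to adversarial attacks (higher accuracy under attack) and improves out-of-distribution detection metrics relative to Mixup and Cutout; it also reduces the tendency toward overconfidence.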


This review was generated by AI and reviewed by human editors.