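[Paper Explained] CutPaste: Self-Supervised Learning for Anomaly Detection and Localization
CutPaste is a self-supervised method that learns representations from normal data alone in order to detect and localize unknown image defects. Without training on any anomalous data, it achieves state-of-the-art results on MVTec AD, and it also supports patch-level localization.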
We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. We learn representations by classifying normal data from the CutPaste, a simple data augmentation strategy that cuts an image patch and pastes at a random location of a large image. Our empirical study on MVTec anomaly detection dataset demonstrates the proposed algorithm is general to be able to detect various types of real-world defects. We bring the improvement upon previous arts by 3.1 AUCs when learning representations from scratch. By transfer learning on pretrained representations on ImageNet, we achieve a new state-of-the-art 96.6 AUC. Lastly, we extend the framework to learn and extract representations from patches to allow localizing defective areas without annotations during training.
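Motivation and Goals
- Learn a defect detection method from normal data only.
- Introduce a self-supervised proxy task (CutPaste) that learns representations sensitive to local irregularities.
- Show that the learned representations enable effective one-class anomaly detection.
- Extend to patch-based representations so that defects can be localized without anomalous training data.
- Evaluate robustness across defect types and compare against existing methods.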
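Proposed Method
- Train a CNN-based encoder with a binary or 3-way classifier to distinguish normal images from CutPaste-augmented normal images.
- CutPaste augmentation: cut a patch from the image, optionally rotate or jitter it, and paste it back at a random location.
- Optionally use CutPaste-Scar, a long and thin patch variant, and train a 3-way classifier (Normal, CutPaste, CutPaste-Scar).
- Build a generative one-class detector on the learned representations, using Gaussian density estimation (GDE) over the top-layer features.
- Optionally extend to patch-level representations: apply CutPaste to cropped patches, then produce a pixel-level anomaly heatmap through dense patch scoring and receptive-field upsampling.
- Evaluate on MVTec AD with image-level detection AUC, and assess pixel-level localization with GradCAM and patch-based scores.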
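The CutPaste augmentation described in the list above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `cutpaste` and `make_training_pair`, the patch area and aspect-ratio ranges, and the random array standing in for an image are all assumptions made for the example, and the optional rotation and jitter of the patch are left out.

```python
import numpy as np

def cutpaste(image, rng, area_ratio=(0.02, 0.15), aspect_ratio=(0.3, 3.3)):
    """Cut one rectangular patch from `image` (H, W, C) and paste it elsewhere.

    The default ranges are illustrative assumptions, not values from the text.
    """
    h, w = image.shape[:2]
    # Sample the patch area as a fraction of the image area, and a
    # width/height ratio log-uniformly between the two bounds.
    area = rng.uniform(*area_ratio) * h * w
    ratio = np.exp(rng.uniform(np.log(aspect_ratio[0]), np.log(aspect_ratio[1])))
    ph = int(np.clip(round(np.sqrt(area / ratio)), 1, h))
    pw = int(np.clip(round(np.sqrt(area * ratio)), 1, w))

    # Pick where the patch is cut from (the upper bound of integers is exclusive).
    sy = rng.integers(0, h - ph + 1)
    sx = rng.integers(0, w - pw + 1)
    patch = image[sy:sy + ph, sx:sx + pw].copy()

    # Pick an independent location to paste it; the input is left untouched.
    dy = rng.integers(0, h - ph + 1)
    dx = rng.integers(0, w - pw + 1)
    out = image.copy()
    out[dy:dy + ph, dx:dx + pw] = patch
    return out

def make_training_pair(image, rng):
    """Return (images, labels) with label 0 = normal, 1 = CutPaste-augmented."""
    return np.stack([image, cutpaste(image, rng)]), np.array([0, 1])

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # stand-in for a normal training image
images, labels = make_training_pair(image, rng)
print(images.shape, labels)
```

A 3-way setup would add a third label for CutPaste-Scar samples, which use a long and thin patch instead of the rectangle sampled here.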

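Experimental Results
Research Questions
- RQ1: Can CutPaste-based self-supervised learning produce representations that generalize to unknown real-world defects without any anomalous training data?
- RQ2: How do CutPaste and its variants compare with other augmentations and self-supervised tasks for defect detection?
- RQ3: Can patch-level representations localize defects precisely without anomaly annotations during training?
- RQ4: Does transfer learning from ImageNet-pretrained features further improve defect detection performance?
- RQ5: How robust is the method across the texture and object categories of MVTec AD?
Key Findings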
| Group | Category | DOCC | U-Student | P-SVDD | Rotation | Cutout | Scar | CutPaste | CutPaste-Scar | CutPaste (3-way) | Ensemble |
|---|---|---|---|---|---|---|---|---|---|---|---|
| texture | carpet | 90.6 | 95.3 | 92.9 | 29.7 ± 1.4 | 35.3 ± 2.3 | 92.7 ± 0.4 | 67.9 ± 1.8 | 94.6 ± 0.6 | 93.1 ± 1.1 | 93.9 |
| texture | grid | 52.4 | 98.7 | 94.6 | 60.5 ± 7.0 | 57.5 ± 3.0 | 74.4 ± 2.5 | 99.9 ± 0.1 | 95.5 ± 0.3 | 99.9 ± 0.1 | 100.0 |
| texture | leather | 78.3 | 93.4 | 90.9 | 55.2 ± 1.4 | 67.7 ± 1.5 | 99.9 ± 0.1 | 99.7 ± 0.1 | 100.0 ± 0.0 | 100.0 ± 0.0 | 100.0 ± 0.0 |
| texture | tile | 96.5 | 95.8 | 97.8 | 70.1 ± 1.9 | 71.8 ± 4.0 | 96.7 ± 0.9 | 95.9 ± 1.0 | 89.4 ± 2.8 | 93.4 ± 1.0 | 94.6 |
| texture | wood | 91.6 | 95.5 | 96.5 | 95.8 ± 1.1 | 92.0 ± 0.8 | 98.9 ± 0.2 | 94.9 ± 0.5 | 98.7 ± 0.3 | 98.6 ± 0.5 | 99.1 |
| texture | average | 81.9 | 95.7 | 94.5 | 62.3 ± 2.6 | 64.9 ± 2.3 | 92.5 ± 0.8 | 91.7 ± 0.7 | 95.7 ± 0.8 | 97.0 ± 0.5 | 97.5 |
| object | bottle | 99.6 | 96.7 | 98.6 | 95.0 ± 0.7 | 88.7 ± 0.8 | 98.5 ± 0.2 | 99.2 ± 0.2 | 98.0 ± 0.5 | 98.3 ± 0.5 | |
| object | cable | 90.9 | 82.3 | 90.3 | 85.3 ± 0.8 | 80.2 ± 1.4 | 78.3 ± 1.7 | 87.1 ± 0.8 | 78.8 ± 2.9 | 80.6 ± 0.5 | 81.2 |
| object | capsule | 91.0 | 92.8 | 76.7 | 71.8 ± 1.4 | 69.5 ± 1.1 | 82.9 ± 0.7 | 87.9 ± 0.7 | 95.3 ± 0.8 | 96.2 ± 0.5 | 98.2 |
| object | hazelnut | 95.0 | 91.4 | 92.0 | 83.6 ± 0.8 | 69.7 ± 1.3 | 98.9 ± 0.2 | 91.3 ± 0.6 | 96.7 ± 0.4 | 97.3 ± 0.3 | 98.3 |
| object | metal nut | 85.2 | 94.0 | 94.0 | 72.7 ± 0.5 | 84.6 ± 0.7 | 86.9 ± 1.5 | 96.8 ± 0.5 | 97.9 ± 0.2 | 99.3 ± 0.2 | 99.9 |
| object | pill | 80.4 | 86.7 | 86.1 | 79.2 ± 1.4 | 78.7 ± 0.7 | 82.2 ± 1.4 | 93.4 ± 0.9 | 85.8 ± 1.3 | 92.4 ± 1.3 | 94.9 |
| object | screw | 86.9 | 87.4 | 81.3 | 35.8 ± 2.9 | 17.6 ± 4.4 | 11.3 ± 2.2 | 54.4 ± 1.7 | 83.7 ± 0.7 | 86.3 ± 1.0 | 88.7 |
| object | toothbrush | 96.4 | 98.6 | 100.0 | 99.1 ± 0.2 | 98.1 ± 0.6 | 94.8 ± 1.0 | 99.2 ± 0.2 | 96.7 ± 0.4 | 98.3 ± 0.9 | 99.4 |
| object | transistor | 90.8 | 83.6 | 91.5 | 88.9 ± 0.4 | 82.5 ± 1.2 | 92.0 ± 0.7 | 96.4 ± 0.7 | 91.1 ± 0.6 | 95.5 ± 0.5 | 96.1 |
| object | zipper | 92.4 | 95.8 | 97.9 | 74.3 ± 1.6 | 75.7 ± 1.0 | 86.8 ± 0.9 | 99.4 ± 0.1 | 99.5 ± 0.1 | 99.4 ± 0.2 | 99.9 |
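The row labeled average sits directly after the five texture rows, and a quick calculation suggests it is the mean over those rows only, not over all 15 categories. The check below copies three baseline columns from the table above; the variable and function names are chosen for this example.

```python
# Values copied from the first five (texture) rows of the table above.
texture_rows = {
    "DOCC":      [90.6, 52.4, 78.3, 96.5, 91.6],
    "U-Student": [95.3, 98.7, 93.4, 95.8, 95.5],
    "P-SVDD":    [92.9, 94.6, 90.9, 97.8, 96.5],
}
# Values copied from the row labeled "average".
reported_average = {"DOCC": 81.9, "U-Student": 95.7, "P-SVDD": 94.5}

def texture_mean(values):
    """Mean of one column over the texture rows, rounded to one decimal."""
    return round(sum(values) / len(values), 1)

for name, values in texture_rows.items():
    print(f"{name}: computed {texture_mean(values)}, reported {reported_average[name]}")
```

The computed and reported values agree for all three columns, which is consistent with reading that row as a texture-only average.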
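- Trained from scratch, CutPaste reaches an image-level AUC of 95.2 on MVTec AD, at least 3.1 AUC above prior work.
- With an ImageNet-pretrained backbone, CutPaste reaches 96.6 AUC, a new state of the art.
- Patch-based representations reach a pixel-level localization AUC of 96.0, surpassing earlier methods.
- An ensemble of five CutPaste (3-way) models raises the image-level AUC to 96.1.
- The CutPaste variants (CutPaste and CutPaste-Scar) outperform the Rotation, Cutout, and Scar baselines at defect detection.
- Transfer learning with CutPaste further improves ImageNet-pretrained EfficientNet features, reaching 96.6 AUC after fine-tuning.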
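The image-level AUC figures above come from scoring each test image with a one-class detector fitted on embeddings of normal images. A minimal sketch of that scoring step follows. It relies on the fact that the negative log-density of a Gaussian is, up to a constant, half the squared Mahalanobis distance, and it uses synthetic vectors in place of real encoder features. The function names, the `eps` regularizer, and the toy data are assumptions for illustration. AUC here is on a 0 to 1 scale, whereas the figures above are percentages.

```python
import numpy as np

def fit_gaussian(feats, eps=1e-3):
    """Fit a mean and precision matrix to normal embeddings of shape (N, D)."""
    mu = feats.mean(axis=0)
    # A small diagonal term keeps the covariance invertible (assumed value).
    cov = np.cov(feats, rowvar=False) + eps * np.eye(feats.shape[1])
    return mu, np.linalg.inv(cov)

def anomaly_score(feats, mu, prec):
    """Half the squared Mahalanobis distance; higher means more anomalous."""
    d = feats - mu
    return 0.5 * np.einsum("ij,jk,ik->i", d, prec, d)

def auroc(scores, labels):
    """Probability that a random anomaly (label 1) outscores a random normal."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy data: 8-dimensional embeddings, with anomalies shifted away from the mean.
rng = np.random.default_rng(0)
train_normal = rng.normal(size=(200, 8))
test_feats = np.concatenate([rng.normal(size=(50, 8)),
                             rng.normal(loc=3.0, size=(50, 8))])
test_labels = np.concatenate([np.zeros(50, dtype=int), np.ones(50, dtype=int)])

mu, prec = fit_gaussian(train_normal)
print(auroc(anomaly_score(test_feats, mu, prec), test_labels))
```

Only normal embeddings are used to fit the detector; the anomalous samples appear at test time alone, which mirrors the training setup described in this summary.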
![Figure 2 : Visualization of (a, green) normal, (b, red) anomaly, and (c–h, blue) augmented normal samples from bottle, toothbrush, screw, grid, and wood classes of MVTec anomaly detection dataset [ 5 ] . Augmented normal samples are generated by baseline augmentations including (c) Cutout and (d) Sc](https://ar5iv.labs.arxiv.org/html/2104.04015/assets/x2.png)
This explainer was generated by AI and reviewed by human editors.