QUICK REVIEW

[Paper Review] Exploring the Limits of Out-of-Distribution Detection

Stanislav Fort, Jie Ren|arXiv (Cornell University)|Jun 6, 2021

Anomaly Detection Techniques and Applications | References: 53 | Cited by: 107

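One-Sentence Summary

This paper shows that large-scale pre-trained transformers, in particular Vision Transformers (ViT), substantially improve near-OOD detection in both vision and genomics, and that few-shot outlier exposure and zero-shot multi-modal cues improve performance further.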

ABSTRACT

Near out-of-distribution detection (OOD) is a major challenge for deep neural networks. We demonstrate that large-scale pre-trained transformers can significantly improve the state-of-the-art (SOTA) on a range of near OOD tasks across different data modalities. For instance, on CIFAR-100 vs CIFAR-10 OOD detection, we improve the AUROC from 85% (current SOTA) to more than 96% using Vision Transformers pre-trained on ImageNet-21k. On a challenging genomics OOD detection benchmark, we improve the AUROC from 66% to 77% using transformers and unsupervised pre-training. To further improve performance, we explore the few-shot outlier exposure setting where a few examples from outlier classes may be available; we show that pre-trained transformers are particularly well-suited for outlier exposure, and that the AUROC of OOD detection on CIFAR-100 vs CIFAR-10 can be improved to 98.7% with just 1 image per OOD class, and 99.46% with 10 images per OOD class. For multi-modal image-text pre-trained transformers such as CLIP, we explore a new way of using just the names of outlier classes as a sole source of information without any accompanying images, and show that this outperforms previous SOTA on standard vision OOD benchmark tasks.

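Motivation and Objectives

Show that large-scale pre-trained transformers improve near-OOD detection across data modalities.
Quantify the OOD detection gains on CIFAR-100 vs CIFAR-10 and on a genomics benchmark.
Assess how fine-tuning, architecture choice, and self-supervised pre-training affect OOD performance.
Explore few-shot outlier exposure as a practical way to improve OOD detection.
Study zero-shot OOD detection with multi-modal models such as CLIP, using only the names of outlier classes.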

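Proposed Method

Fine-tune Vision Transformers (ViT) pre-trained on ImageNet-21k on CIFAR-10/CIFAR-100, and evaluate maximum softmax probability (MSP) and Mahalanobis distance as OOD scores.
Compare ViT against BiT (ResNet-based) and MLP-Mixer architectures to assess the effect of architecture.
For outlier exposure, train a simple classifier (a linear classifier for supervised pre-trained models, a shallow MLP for unsupervised pre-trained models) on in-distribution data plus a few-shot set of outlier examples.
Vary the number of outlier examples per class (1 to 10 and beyond) to study how few-shot exposure improves OOD detection.
Apply CLIP-style zero-shot OOD detection: supply the names of outlier classes as candidate text labels and derive the OOD score from image-text alignment.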
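
The two scores named in the first step can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names (`msp_score`, `fit_gaussians`, `mahalanobis_score`) and the toy 2-D features are assumptions made for the example. In the setting described above, the features would instead be embeddings from the fine-tuned network. Both functions return a score where a higher value means "more likely OOD".

```python
import numpy as np

def msp_score(logits):
    """OOD score from maximum softmax probability: 1 - max_c p(c|x)."""
    z = logits - logits.max(axis=1, keepdims=True)  # stabilise the softmax
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return 1.0 - probs.max(axis=1)

def fit_gaussians(features, labels, num_classes):
    """Fit per-class means and a single covariance shared by all classes."""
    means = np.stack([features[labels == c].mean(axis=0)
                      for c in range(num_classes)])
    centered = features - means[labels]          # subtract each sample's class mean
    cov = centered.T @ centered / len(features)
    return means, np.linalg.pinv(cov)            # pseudo-inverse as the precision

def mahalanobis_score(features, means, precision):
    """OOD score: squared Mahalanobis distance to the closest class mean."""
    diffs = features[:, None, :] - means[None, :, :]            # (n, classes, dim)
    d2 = np.einsum("ncd,de,nce->nc", diffs, precision, diffs)   # (n, classes)
    return d2.min(axis=1)

# Toy in-distribution data: two well-separated classes in 2-D (hypothetical).
rng = np.random.default_rng(0)
train_x = np.concatenate([rng.normal([0, 0], 0.5, (200, 2)),
                          rng.normal([4, 0], 0.5, (200, 2))])
train_y = np.repeat([0, 1], 200)
means, precision = fit_gaussians(train_x, train_y, num_classes=2)

in_dist = np.array([[0.1, -0.2]])   # close to class 0
far_away = np.array([[2.0, 6.0]])   # far from both class means
print(mahalanobis_score(in_dist, means, precision),
      mahalanobis_score(far_away, means, precision))
print(msp_score(np.array([[5.0, 0.0], [0.1, 0.0]])))  # confident vs uncertain logits
```

The point far from both class means receives the larger Mahalanobis score, and the near-uniform logits receive the larger MSP score.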

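Experimental Results

Research Questions

RQ1: How much do large-scale pre-trained transformers improve near-OOD detection over the current state-of-the-art baselines?
RQ2: How does fine-tuning on in-distribution data compare with using pre-trained features alone for OOD detection?
RQ3: How does few-shot outlier exposure affect AUROC on the CIFAR-100 vs CIFAR-10 and CIFAR-10 vs CIFAR-100 tasks?
RQ4: Can multi-modal zero-shot signals such as CLIP improve OOD detection without any labeled OOD images?
RQ5: How does unsupervised pre-training (e.g. DINO) compare with supervised pre-training for OOD detection?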

Key Findings

Model	In-distribution	Fine-tuned test accuracy	Out-of-distribution	Mahalanobis AUROC	MSP AUROC
BiT-M R50x1	CIFAR-100	87.01%	CIFAR-10	81.71%	81.15%
BiT-M R101x3	CIFAR-100	91.55%	CIFAR-10	90.10%	83.69%
ViT-B_16	CIFAR-100	90.95%	CIFAR-10	95.53%	91.89%
R50+ViT-B_16	CIFAR-100	91.71%	CIFAR-10	96.23%	92.08%
MLP-Mixer-B_16	CIFAR-100	90.40%	CIFAR-10	95.31%	90.22%
BiT-M R50x1	CIFAR-10	97.47%	CIFAR-100	95.52%	85.87%
BiT-M R101x3	CIFAR-10	97.36%	CIFAR-100	94.55%	85.34%
ViT-B_16	CIFAR-10	98.10%	CIFAR-100	98.42%	97.68%
R50+ViT-B_16	CIFAR-10	98.70%	CIFAR-100	98.52%	97.75%
MLP-Mixer-B_16	CIFAR-10	97.58%	CIFAR-100	97.85%	96.28%
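
The AUROC columns in the table measure how well an OOD score separates the two test sets: AUROC is the probability that a randomly chosen OOD sample receives a higher score than a randomly chosen in-distribution sample. The sketch below computes it directly from that definition. The helper name `auroc` and the score values are made up for illustration and are not taken from the paper.

```python
def auroc(in_scores, out_scores):
    """Fraction of (OOD, in-distribution) pairs ranked correctly; ties count half."""
    wins = 0.0
    for o in out_scores:
        for i in in_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(in_scores) * len(out_scores))

# Hypothetical OOD scores (higher = more likely OOD).
in_scores = [0.1, 0.2, 0.3, 0.6]
out_scores = [0.5, 0.7, 0.8, 0.9]
print(auroc(in_scores, out_scores))  # 15 of 16 pairs are ordered correctly -> 0.9375
```

An AUROC of 50% corresponds to a score that ranks the two sets at random, and 100% to perfect separation.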

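A ViT fine-tuned on CIFAR-100 and scored with Mahalanobis distance reaches an AUROC of about 96% on CIFAR-100 vs CIFAR-10 (95.53% for ViT-B_16, 96.23% for R50+ViT-B_16), up from the previous SOTA of 85%.
ViT pre-trained on ImageNet-21k outperforms the BiT and MLP-Mixer baselines on near-OOD tasks.
Few-shot outlier exposure with 1 to 10 labeled OOD examples per class, using a simple classifier on ViT features, raises the CIFAR-100 vs CIFAR-10 AUROC to about 99% (98.7% with 1 image per OOD class, 99.46% with 10).
On the genomics OOD benchmark, a pre-trained and fine-tuned transformer (BERT) scored with MSP and Mahalanobis distance raises AUROC from 66% to 77%.
CLIP zero-shot OOD detection using only the names of outlier classes reaches an AUROC of 94.8% on CIFAR-100 vs CIFAR-10, and is near perfect on some far-OOD tasks (e.g. 99.6%/99.9%).
In detail, the pre-trained and fine-tuned genomics transformer reaches an AUROC of 77.49% with Mahalanobis distance and 73.53% with MSP, along with better in-distribution accuracy (89.84%).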
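
The few-shot outlier exposure and CLIP zero-shot findings both rely on a classifier whose label set contains in-distribution labels together with outlier labels. One plausible way to turn such a classifier into an OOD score, sketched here as an assumption rather than as the authors' exact implementation, is to sum the softmax probability assigned to the outlier labels. The function name `outlier_mass_score` and the logit values are hypothetical. For a CLIP-style model the logits would be image-text similarities against the class names.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def outlier_mass_score(logits, num_in_classes):
    """OOD score: total probability on outlier labels.

    `logits` lists the in-distribution labels first, then the outlier labels.
    """
    probs = softmax(logits)
    return sum(probs[num_in_classes:])

# Hypothetical logits over 2 in-distribution labels followed by 2 outlier labels.
logits_in_image = [9.0, 3.0, 1.0, 1.0]    # strongly matches an in-distribution label
logits_out_image = [1.0, 1.0, 8.0, 2.0]   # strongly matches an outlier label

print(outlier_mass_score(logits_in_image, num_in_classes=2))
print(outlier_mass_score(logits_out_image, num_in_classes=2))
```

The first image places almost no probability on the outlier labels, while the second places almost all of it there, so the second receives the higher OOD score.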


This review was generated by AI and reviewed by human editors.