QUICK REVIEW

[Paper Review] A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks

Kimin Lee, Kibok Lee|arXiv (Cornell University)|Jul 10, 2018

Adversarial Robustness in Machine Learning | Cited by 976

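One-Sentence Summary

The paper introduces a Mahalanobis distance-based confidence score in the DNN feature space that detects both out-of-distribution (OOD) and adversarial samples using only a pre-trained softmax classifier, strengthens it with input preprocessing and a feature ensemble, and demonstrates its robustness and its applicability to class-incremental learning.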

ABSTRACT

Detecting test samples drawn sufficiently far away from the training distribution statistically or adversarially is a fundamental requirement for deploying a good classifier in many real-world machine learning applications. However, deep neural networks with the softmax classifier are known to produce highly overconfident posterior distributions even for such abnormal samples. In this paper, we propose a simple yet effective method for detecting any abnormal samples, which is applicable to any pre-trained softmax neural classifier. We obtain the class conditional Gaussian distributions with respect to (low- and upper-level) features of the deep models under Gaussian discriminant analysis, which result in a confidence score based on the Mahalanobis distance. While most prior methods have been evaluated for detecting either out-of-distribution or adversarial samples, but not both, the proposed method achieves the state-of-the-art performances for both cases in our experiments. Moreover, we found that our proposed method is more robust in harsh cases, e.g., when the training dataset has noisy labels or small number of samples. Finally, we show that the proposed method enjoys broader usage by applying it to class-incremental learning: whenever out-of-distribution samples are detected, our classification rule can incorporate new classes well without further training deep models.

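Motivation and Objectives

Reliably detect abnormal test samples that lie far from the training distribution or have been adversarially perturbed.
Propose a simple generative classifier under Gaussian discriminant analysis (GDA) in the DNN feature space.
Perform detection without retraining the pre-trained softmax classifier.
Improve robustness to noisy labels and small training sets.
Demonstrate applicability to class-incremental learning by updating the class means and the shared covariance.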

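Proposed Method

Fit class-conditional Gaussian distributions with a tied (shared) covariance to the penultimate-layer features of the DNN, using the empirical class means and the pooled covariance.
Define the Mahalanobis distance-based confidence score M(x) = max_c -(f(x)-mu_c)^T Sigma^{-1} (f(x)-mu_c), where f(x) is the feature vector of input x, mu_c is the mean of class c, and Sigma is the shared covariance.
Show that the generative classifier under GDA is consistent with the softmax classifier, so classification accuracy is preserved.
Improve performance with input preprocessing that perturbs x along the gradient of M(x).
Strengthen robustness with a feature ensemble: compute M(x) at multiple network layers and combine the scores with weights learned by logistic regression.
Adapt to new classes in incremental learning with a simple rule that updates the class means and the shared covariance.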
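
The fitting, scoring, and class-update steps listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (fit_gda, mahalanobis_scores, add_class) are hypothetical, features are assumed to be already extracted from one network layer into an (n, d) array, labels are assumed to be integers 0..C-1 with at least one sample per class, and the count-weighted covariance update in add_class is one plausible rule chosen here because it reproduces the pooled covariance exactly; the paper's own update rule may be weighted differently.

```python
import numpy as np


def fit_gda(features, labels, num_classes):
    """Estimate per-class means and one tied (pooled) covariance."""
    dim = features.shape[1]
    means = np.zeros((num_classes, dim))
    scatter = np.zeros((dim, dim))
    for c in range(num_classes):
        class_feats = features[labels == c]
        means[c] = class_feats.mean(axis=0)
        centered = class_feats - means[c]
        scatter += centered.T @ centered
    # Pooled covariance: within-class scatter divided by the total sample count.
    cov = scatter / len(features)
    return means, cov


def mahalanobis_scores(features, means, cov):
    """Return M(x) = max_c of the negative squared Mahalanobis distance,
    together with the arg-max class, for each row of `features`."""
    # Pseudo-inverse is an assumption made here to tolerate a singular covariance.
    precision = np.linalg.pinv(cov)
    diffs = features[:, None, :] - means[None, :, :]  # shape (n, classes, dim)
    neg_dists = -np.einsum("ncd,de,nce->nc", diffs, precision, diffs)
    return neg_dists.max(axis=1), neg_dists.argmax(axis=1)


def add_class(means, cov, num_seen, new_feats):
    """Add one new class without touching the network (assumed update rule).

    `num_seen` is the number of samples behind the current `cov`.
    """
    new_mean = new_feats.mean(axis=0)
    centered = new_feats - new_mean
    total = num_seen + len(new_feats)
    # Count-weighted merge of the old pooled scatter and the new class scatter.
    new_cov = (num_seen * cov + centered.T @ centered) / total
    return np.vstack([means, new_mean]), new_cov, total
```

In this sketch a low M(x) flags a sample as a likely OOD or adversarial input, and the arg-max class serves as the generative classifier's prediction. Input preprocessing and the multi-layer ensemble are omitted because they require access to the network's gradients and intermediate layers.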

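Experimental Results

Research Questions

RQ1: Does the Mahalanobis distance-based score in the DNN feature space outperform softmax-based confidence for OOD and adversarial detection?
RQ2: Does combining multi-layer features with input preprocessing improve detection robustness under noisy labels and limited data?
RQ3: Can the same framework support class-incremental learning without retraining the whole model?
RQ4: Is the method robust when hyperparameters are tuned using only in-distribution data or FGSM adversarial data?
RQ5: How does the method perform across datasets and architectures (e.g., CIFAR-10/100, SVHN, ImageNet, LSUN)?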

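Key Findings

Across multiple datasets, the Mahalanobis distance-based score outperforms the softmax baseline and competing detectors (e.g., ODIN, LID) on both OOD and adversarial detection.
Input preprocessing and the feature ensemble substantially improve detection performance, yielding high TNR at 95% TPR and strong AUROC.
The detector remains robust with noisy labels and small training sets, and it can be tuned using only in-distribution data or FGSM adversarial data.
Updating the class means and the shared covariance adapts the classifier to new classes, supporting class-incremental learning without retraining the deep model.
The method achieves state-of-the-art results for OOD detection on several dataset pairs (e.g., CIFAR-10 vs. LSUN/TinyImageNet) and for detecting adversarial attacks (FGSM, BIM, DeepFool, CW).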
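
The two detection metrics named in the findings, TNR at 95% TPR and AUROC, can be computed from confidence scores roughly as shown below. This is a small sketch under assumptions: the function names are hypothetical, in-distribution samples are treated as the positive class, higher scores are taken to mean "more in-distribution", and the threshold is set from an empirical quantile of the in-distribution scores. The paper's exact evaluation code may differ in details such as tie handling.

```python
import numpy as np


def tnr_at_tpr(scores_in, scores_out, tpr=0.95):
    """Fraction of OOD samples rejected at the threshold that keeps
    roughly `tpr` of the in-distribution samples."""
    # Samples scoring at or above the threshold are accepted as in-distribution.
    threshold = np.quantile(scores_in, 1.0 - tpr)
    return float(np.mean(scores_out < threshold))


def auroc(scores_in, scores_out):
    """Probability that a random in-distribution score exceeds a random
    OOD score, counting ties as one half (pairwise formulation)."""
    diff = scores_in[:, None] - scores_out[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))
```

The pairwise AUROC uses memory proportional to the product of the two sample counts, which is acceptable for a sketch but would likely be replaced by a rank-based computation for large test sets.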

This review was generated by AI and reviewed by a human editor.