QUICK REVIEW

[Paper Review] Cautious Deep Learning

Yotam Hechtlinger, Barnabás Póczos|arXiv (Cornell University)|May 24, 2018

Adversarial Robustness in Machine Learning|References: 18|Cited by: 28

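One-Sentence Summary

This paper proposes a cautious deep learning framework built on $p(x|y)$ rather than $p(y|x)$ and combines it with conformal prediction to produce set-valued predictions with guaranteed coverage. When the input is out of distribution, the method outputs the empty set, which reduces overconfidence and improves robustness to distribution shift and adversarial attacks. The approach is validated with deep features on the ImageNet, CelebA, and IMDB-Wiki datasets.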

ABSTRACT

Most classifiers operate by selecting the maximum of an estimate of the conditional distribution $p(y|x)$ where $x$ stands for the features of the instance to be classified and $y$ denotes its label. This often results in a hubristic bias: overconfidence in the assignment of a definite label. Usually, the observations are concentrated on a small volume but the classifier provides definite predictions for the entire space. We propose constructing conformal prediction sets which contain a set of labels rather than a single label. These conformal prediction sets contain the true label with probability $1-\alpha$. Our construction is based on $p(x|y)$ rather than $p(y|x)$ which results in a classifier that is very cautious: it outputs the null set (meaning "I don't know") when the object does not resemble the training examples. An important property of our approach is that adversarial attacks are likely to be predicted as the null set or would also include the true label. We demonstrate the performance on the ImageNet ILSVRC dataset and the CelebA and IMDB-Wiki facial datasets using high-dimensional features obtained from state-of-the-art convolutional neural networks.

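Motivation and Objectives

Address the overconfidence of standard deep learning classifiers, which assign a definite label even to ambiguous or out-of-distribution inputs.
Develop a method that uses set-valued predictions to distinguish ambiguity (several plausible labels) from outlyingness (no label is consistent with the input).
Provide a distribution-free coverage guarantee $P(Y \in C(X)) \geq 1 - \alpha$ while keeping the flexibility to add or remove classes.
Detect out-of-distribution inputs through the empty set, improving robustness to distribution shift and adversarial attacks.
Make the method practical on large-scale, high-dimensional data by using features from state-of-the-art CNN models.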
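
The coverage guarantee named in the objectives can be made concrete with the standard split-conformal argument. The following is a sketch under stated assumptions, not necessarily the derivation used in the paper: the calibration points of class $y$ and the new point are exchangeable, the density estimate $\widehat{p}(\cdot|y)$ is fitted on separate data, ties among scores occur with probability zero, and the index $k$ and the scores $S_i$ are notation introduced here for illustration.

$$
\begin{aligned}
&S_i = \widehat{p}(X_i|y), \quad i = 1, \dots, n, \qquad S_{n+1} = \widehat{p}(X|y) \text{ for a new } X \text{ with } Y = y, \\
&k = \lfloor \alpha (n+1) \rfloor, \qquad \widehat{t}_y = S_{(k)} \ \text{(the $k$-th smallest calibration score; } \widehat{t}_y = -\infty \text{ if } k = 0\text{)}, \\
&P\big(S_{n+1} \leq S_{(k)}\big) = \frac{k}{n+1} \leq \alpha
\quad\Longrightarrow\quad
P\big(y \in C(X) \mid Y = y\big) = P\big(S_{n+1} > \widehat{t}_y\big) \geq 1 - \alpha, \\
&P\big(Y \in C(X)\big) = \sum_{y} P(Y = y)\, P\big(y \in C(X) \mid Y = y\big) \geq 1 - \alpha,
\qquad C(x) = \{\, y : \widehat{p}(x|y) > \widehat{t}_y \,\}.
\end{aligned}
$$

The first probability uses only the fact that the rank of $S_{n+1}$ among the $n+1$ scores is uniform under exchangeability, which is why no assumption on the shape of the data distribution is needed.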

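Proposed Method

Build the prediction set $C(x) = \{ y : \widehat{p}(x|y) > \widehat{t}_y \}$ from a per-class estimated likelihood $\widehat{p}(x|y)$.
Choose the thresholds $\widehat{t}_y$ by conformal prediction so that $P(Y \in C(X)) \geq 1 - \alpha$ holds for any distribution.
Estimate $\widehat{p}(x|y)$ independently for each class, so that classes can be added or removed modularly without retraining.
Use high-dimensional features from pretrained convolutional neural networks as the input to likelihood estimation.
Model $\widehat{p}(x|y)$ for each class with kernel density estimation or a similar nonparametric method.
Use a user-specified error level $\alpha$ (confidence level $1 - \alpha$) to tune the trade-off between prediction accuracy and the proportion of empty predictions.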
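
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `CautiousClassifier`, the helper `log_kde`, the fixed Gaussian bandwidth, the half/half split between fitting and calibration, and the 2-D synthetic blobs standing in for CNN features are all hypothetical choices made here.

```python
import numpy as np


def log_kde(train, query, bandwidth):
    """Log of a Gaussian kernel density estimate fitted on `train`, evaluated at `query`."""
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = train.shape
    # Squared Euclidean distance between every query point and every training point.
    sq_dist = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=-1)
    log_kernel = -0.5 * sq_dist / bandwidth**2
    # Numerically stable log-sum-exp over the training points.
    m = log_kernel.max(axis=1, keepdims=True)
    log_sum = m[:, 0] + np.log(np.exp(log_kernel - m).sum(axis=1))
    log_norm = np.log(n) + 0.5 * d * np.log(2.0 * np.pi * bandwidth**2)
    return log_sum - log_norm


class CautiousClassifier:
    """Per-class density estimates with conformal thresholds (illustrative sketch)."""

    def __init__(self, alpha=0.1, bandwidth=1.0):
        self.alpha = alpha
        self.bandwidth = bandwidth
        self.models = {}  # label -> (fitting features, log-density threshold)

    def add_class(self, label, features, rng):
        """Fit a single class in isolation; other classes are left untouched."""
        features = np.asarray(features, dtype=float)
        idx = rng.permutation(len(features))
        half = len(features) // 2
        fit, calib = features[idx[:half]], features[idx[half:]]
        # Calibration scores: estimated log-density of held-out points of this class.
        scores = np.sort(log_kde(fit, calib, self.bandwidth))
        # Threshold = k-th smallest calibration score, k = floor(alpha * (n + 1)).
        k = int(np.floor(self.alpha * (len(scores) + 1)))
        threshold = scores[k - 1] if k >= 1 else -np.inf
        self.models[label] = (fit, threshold)

    def remove_class(self, label):
        """Drop a class without refitting the remaining ones."""
        del self.models[label]

    def predict_set(self, x):
        """Return every label whose estimated density at x exceeds its threshold."""
        x = np.asarray(x, dtype=float)[None, :]
        return {
            label
            for label, (fit, threshold) in self.models.items()
            if log_kde(fit, x, self.bandwidth)[0] > threshold
        }


# Synthetic stand-in for CNN features: two well-separated 2-D Gaussian blobs.
rng = np.random.default_rng(0)
clf = CautiousClassifier(alpha=0.1, bandwidth=0.5)
clf.add_class("a", rng.normal(loc=[0.0, 0.0], scale=1.0, size=(400, 2)), rng)
clf.add_class("b", rng.normal(loc=[6.0, 0.0], scale=1.0, size=(400, 2)), rng)

near_a = clf.predict_set([0.0, 0.0])      # a point at the centre of class "a"
far_away = clf.predict_set([50.0, 50.0])  # a point unlike any training example
print(near_a, far_away)
```

In this toy setup the point at the centre of class "a" receives only that label, while the distant point receives the empty set because its estimated density falls below every class threshold. Since each class stores only its own fitting data and threshold, `add_class` and `remove_class` never touch the other classes, which mirrors the modularity described above.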

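Experimental Results

Research Questions

RQ1: Can a deep learning classifier based on $p(x|y)$ provide better uncertainty quantification than the standard $p(y|x)$ approach?
RQ2: How does the method perform under distribution shift, for example from the CelebA to the IMDB-Wiki face dataset?
RQ3: To what extent can the method detect adversarial examples or out-of-distribution inputs by assigning them the empty set?
RQ4: Does the method retain a valid coverage guarantee on large-scale datasets such as ImageNet?
RQ5: How do accuracy and the rate of empty predictions trade off at different levels of $\alpha$?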

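Key Findings

On CelebA, the method's accuracy is close to $1 - \alpha$ and the proportion of empty predictions is approximately $\alpha$, indicating valid coverage.
On IMDB-Wiki, the standard classifier performs poorly (accuracy 0.577), whereas the conformal method significantly outperforms random guessing at low $\alpha$.
The method detects distribution shift: when trained on CelebA and tested on IMDB-Wiki, it produces a high proportion of empty predictions, signaling that the inputs are out of distribution.
When $1 - \alpha$ is set low, the proportion of false positives is minimized, so users can control the type of error according to the needs of the application.
The method remains robust even when the training and test distributions differ markedly, as with the differences in facial pose and image quality between CelebA and IMDB-Wiki.
The method supports flexible class management: classes can be added or removed without retraining the full model, and the coverage guarantee is preserved.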
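
The quantities reported in these findings (coverage, the share of empty predictions, and the share of multi-label predictions) can be tabulated from any batch of set-valued predictions. The helper below is a hypothetical sketch; the function name `summarize_prediction_sets` and the toy inputs are made up for illustration and do not reproduce any result from the paper.

```python
def summarize_prediction_sets(pred_sets, labels):
    """Return coverage, empty-set rate, and multi-label rate for set-valued predictions."""
    if len(pred_sets) != len(labels):
        raise ValueError("pred_sets and labels must have the same length")
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one example")
    # Coverage: fraction of examples whose true label lies in the predicted set.
    covered = sum(1 for s, y in zip(pred_sets, labels) if y in s)
    # Empty rate: fraction of "I don't know" answers.
    empty = sum(1 for s in pred_sets if len(s) == 0)
    # Multi rate: fraction of ambiguous answers with more than one label.
    multi = sum(1 for s in pred_sets if len(s) > 1)
    return {"coverage": covered / n, "empty_rate": empty / n, "multi_rate": multi / n}


# Toy inputs (made up): four predictions over two hypothetical classes.
toy_sets = [{"a"}, set(), {"a", "b"}, {"b"}]
toy_labels = ["a", "b", "b", "a"]
summary = summarize_prediction_sets(toy_sets, toy_labels)
print(summary)
```

Running this over a range of $\alpha$ values on held-out data is one simple way to examine the trade-off between accuracy and empty predictions raised in RQ5.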

This review was generated by AI and checked by human editors.