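[Paper Explained] A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness
This paper proposes SNGP, a single-model method that improves uncertainty estimation in two ways: it makes the output layer distance-aware by replacing it with a Gaussian process, and it applies spectral normalization so that the hidden representation preserves distances between inputs.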
Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles. However, their practicality in real-time, industrial-scale applications is limited due to their high memory and computational cost. Furthermore, ensembles and BNNs do not necessarily fix all the issues with the underlying member networks. In this work, we study principled approaches to improve the uncertainty properties of a single network, based on a single, deterministic representation. By formalizing the uncertainty quantification as a minimax learning problem, we first identify distance awareness, i.e., the model's ability to quantify the distance of a testing example from the training data, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs with two simple changes: (1) applying spectral normalization to hidden weights to enforce bi-Lipschitz smoothness in representations and (2) replacing the last output layer with a Gaussian process layer. On a suite of vision and language understanding benchmarks, SNGP outperforms other single-model approaches in prediction, calibration and out-of-domain detection. Furthermore, SNGP provides complementary benefits to popular techniques such as deep ensembles and data augmentation, making it a simple and scalable building block for probabilistic deep learning. Code is open-sourced at https://github.com/google/uncertainty-baselines
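The two conditions named in the abstract, bi-Lipschitz smoothness and distance awareness, can be written out roughly as follows. This is a sketch in notation chosen here for illustration: the symbols h (the hidden mapping), u(x) (a scalar summary of predictive uncertainty), v (a monotonic function), and the constants L_1, L_2 are assumptions of this note and should be checked against the paper itself.

```latex
% Bi-Lipschitz (distance-preserving) hidden mapping h, for constants 0 < L_1 \le L_2:
\[
L_1 \,\lVert x_1 - x_2 \rVert_X
\;\le\; \lVert h(x_1) - h(x_2) \rVert_H
\;\le\; L_2 \,\lVert x_1 - x_2 \rVert_X
\]

% Distance awareness: the uncertainty summary u(x) is a monotonic function v
% of the distance between x and the in-domain training data X_IND:
\[
u(x) = v\bigl(d(x, \mathcal{X}_{\mathrm{IND}})\bigr),
\qquad
d(x, \mathcal{X}_{\mathrm{IND}})
= \mathbb{E}_{x' \sim \mathcal{X}_{\mathrm{IND}}} \lVert x - x' \rVert_X^2
\]
```

The lower bound in the first inequality is what keeps distinct inputs from collapsing onto the same hidden representation; the upper bound keeps the mapping from exaggerating small input changes.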
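Research Motivation and Objectives
- Motivate the need for reliable uncertainty estimates in deep learning for safety-critical applications.
- Formalize uncertainty estimation as a minimax learning problem and identify distance awareness as a key necessary condition.
- Introduce a simple, scalable method (SNGP) that adds distance awareness to deterministic deep neural networks.
- Demonstrate that SNGP improves calibration and out-of-distribution detection on vision, language, and genomics tasks.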
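Proposed Method
- Define distance awareness and show that it is necessary for minimax-optimal uncertainty estimation.
- Replace the dense output layer with a Gaussian process layer, approximated with random Fourier features and a Laplace approximation.
- Apply spectral normalization to the hidden layers so that the representation satisfies a bi-Lipschitz condition and preserves distances.
- Use the Laplace approximation in the random feature space to obtain a closed-form, scalable GP posterior.
- Demonstrate compatibility with, and complementary benefits to, existing uncertainty methods and data augmentation.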
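To make the GP output layer more concrete, here is a minimal NumPy sketch of a random Fourier feature (RFF) layer with a closed-form posterior over the output weights. It is an illustration under stated assumptions, not the authors' implementation: it uses a Gaussian likelihood, for which the Laplace approximation coincides with the exact posterior, whereas a classification model would weight each feature outer product by p(1 - p). The function names, `length_scale`, `noise_var`, and the toy data are all hypothetical.

```python
import numpy as np

def make_rff(input_dim, num_features, length_scale, rng):
    """Draw fixed (untrained) random Fourier parameters for an RBF kernel."""
    W = rng.normal(0.0, 1.0 / length_scale, size=(input_dim, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return W, b

def rff_features(h, W, b):
    """Map hidden representations h of shape (n, d) to features of shape (n, D)."""
    num_features = W.shape[1]
    return np.sqrt(2.0 / num_features) * np.cos(h @ W + b)

def fit_posterior(phi, y, noise_var=1.0):
    """Posterior over output weights with prior N(0, I) and Gaussian noise.

    Assumption: Gaussian likelihood, so precision = I + Phi^T Phi / noise_var.
    """
    num_features = phi.shape[1]
    precision = np.eye(num_features) + phi.T @ phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ phi.T @ y / noise_var
    return mean, cov

def predict(phi_new, mean, cov):
    """Return the predictive mean and variance phi^T cov phi for each row."""
    pred = phi_new @ mean
    var = np.einsum("nd,de,ne->n", phi_new, cov, phi_new)
    return pred, var

rng = np.random.default_rng(0)
# Toy "hidden representations": 200 training points clustered near the origin.
h_train = rng.normal(0.0, 0.5, size=(200, 2))
y_train = np.sin(h_train[:, 0]) + 0.1 * rng.normal(size=200)

W, b = make_rff(input_dim=2, num_features=256, length_scale=1.0, rng=rng)
mean, cov = fit_posterior(rff_features(h_train, W, b), y_train)

# One query inside the training cluster, one far away from it.
h_test = np.array([[0.0, 0.0], [6.0, 6.0]])
_, var = predict(rff_features(h_test, W, b), mean, cov)
var_near, var_far = float(var[0]), float(var[1])
print("variance near training data:", var_near)
print("variance far from training data:", var_far)
```

Because the posterior lives in the D-dimensional feature space, the matrix that is inverted has size D x D regardless of the number of training points, which is where the scalability of this formulation comes from.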
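Experimental Results
Research Questions
- RQ1: How can distance awareness be formalized and shown to be necessary for high-quality uncertainty estimation?
- RQ2: Can a single deterministic model achieve competitive uncertainty estimates by enforcing a distance-aware representation and using a GP output layer?
- RQ3: Does SNGP improve calibration and out-of-distribution detection on vision and language tasks?
- RQ4: How does SNGP interact with, or complement, ensembles and data augmentation?
- RQ5: Does the method scale to large architectures and datasets?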
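Key Findings
- SNGP consistently improves calibration and out-of-domain detection over other single-model approaches across multiple benchmarks.
- Spectral normalization improves distance preservation in the hidden representations, which supports distance awareness.
- Replacing the output layer with a distance-aware Gaussian process yields uncertainty that grows with distance from the training data.
- The Laplace-approximated GP with random Fourier features allows scalable, end-to-end training of a deterministic DNN.
- SNGP complements ensembles and data augmentation, making it a scalable building block for probabilistic deep learning.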
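The link between spectral normalization and distance preservation can be illustrated with a small NumPy sketch. It estimates the largest singular value of a weight matrix by power iteration, rescales the matrix when that value exceeds a bound c < 1, and then checks empirically that a residual layer x + tanh(Wx) keeps pairwise distances within [1 - c, 1 + c] times the input distance. The residual form, the bound `norm_bound=0.95`, the tanh activation, and all function names are assumptions made for this example rather than details taken from the summary above.

```python
import numpy as np

def spectral_norm_estimate(W, num_iters=200, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=W.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(num_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    return float(u @ W @ v)

def spectral_normalize(W, norm_bound=0.95, num_iters=200):
    """Rescale W so its spectral norm is at most norm_bound (up to estimation error)."""
    sigma = spectral_norm_estimate(W, num_iters)
    if sigma > norm_bound:
        return W * (norm_bound / sigma)
    return W

def residual_layer(x, W):
    """Residual mapping h(x) = x + tanh(x W^T); tanh is 1-Lipschitz."""
    return x + np.tanh(x @ W.T)

rng = np.random.default_rng(1)
W = rng.normal(size=(16, 16))          # unnormalized weights, spectral norm well above 1
W_sn = spectral_normalize(W, norm_bound=0.95)

# Compare input distances with hidden distances over random pairs of points.
x1 = rng.normal(size=(1000, 16))
x2 = rng.normal(size=(1000, 16))
d_in = np.linalg.norm(x1 - x2, axis=1)
d_out = np.linalg.norm(residual_layer(x1, W_sn) - residual_layer(x2, W_sn), axis=1)
ratios = d_out / d_in

print("spectral norm after normalization:", np.linalg.norm(W_sn, 2))
print("min / max distance ratio:", ratios.min(), ratios.max())
```

With the residual branch constrained to be a contraction, the layer can neither collapse two distinct inputs onto the same point nor stretch their distance by more than a factor of 1 + c, which is the bi-Lipschitz behavior the second finding refers to.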
This explainer was generated by AI and reviewed by a human editor.