QUICK REVIEW

[Paper Review] Representing inferential uncertainty in deep neural networks through sampling

Patrick McClure, Nikolaus Kriegeskorte|arXiv (Cornell University)|Apr 24, 2017
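Adversarial Robustness in Machine Learning | Cited by 12
One-sentence summary

This paper proposes Bayesian deep neural networks that combine Bernoulli dropout with Gaussian dropconnect in order to represent inferential uncertainty more accurately than standard DNNs. By sampling weights under a spike-and-slab-like variational distribution, the approach achieves high classification accuracy on MNIST and CIFAR-10 while robustly modelling uncertainty, combining the advantages of methods that sample only units or only weights and representing uncertainty better than standard DNNs.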

ABSTRACT

As deep neural networks (DNNs) are applied to increasingly challenging problems, they will need to be able to represent their own uncertainty. Modelling uncertainty is one of the key features of Bayesian methods. Bayesian DNNs that use dropout-based variational distributions and scale to complex tasks have recently been proposed. We evaluate Bayesian DNNs trained with Bernoulli or Gaussian multiplicative masking of either the units (dropout) or the weights (dropconnect). We compare these Bayesian DNNs' ability to represent their uncertainty about their outputs through sampling during inference. We tested the calibration of these Bayesian fully connected and convolutional DNNs on two visual inference tasks (MNIST and CIFAR-10). By adding different levels of Gaussian noise to the test images, we assessed how these DNNs represented their uncertainty about regions of input space not covered by the training set. These Bayesian DNNs represented their own uncertainty more accurately than traditional DNNs with a softmax output. We find that sampling of weights, whether Gaussian or Bernoulli, led to more accurate representation of uncertainty compared to sampling of units. However, sampling units using either Gaussian or Bernoulli dropout led to increased convolutional neural network (CNN) classification accuracy. Based on these findings we use both Bernoulli dropout and Gaussian dropconnect concurrently, which approximates the use of a spike-and-slab variational distribution. We find that networks with spike-and-slab sampling combine the advantages of the other methods: they classify with high accuracy and robustly represent the uncertainty of their classifications for all tested architectures.
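
As a reading aid, the combination described at the end of the abstract can be written out for a single weight. This is a sketch under assumed notation that does not come from the paper: $v_{ij}$ is a learned weight, $p$ a drop probability, $\sigma^2$ a noise variance, $b_j$ a Bernoulli mask on unit $j$, $m_{ij}$ a Gaussian multiplier on the weight, and $T$ the number of samples drawn at inference. Marginally, each sampled weight is exactly zero with probability $p$ (the spike) and Gaussian around $v_{ij}$ otherwise (the slab); weights that share the same unit mask are not independent of each other.

```latex
\begin{aligned}
b_j &\sim \mathrm{Bernoulli}(1 - p), \qquad m_{ij} \sim \mathcal{N}(1, \sigma^2), \\
\hat{w}_{ij} &= b_j \, m_{ij} \, v_{ij}, \\
q(\hat{w}_{ij}) &= p \, \delta(\hat{w}_{ij}) + (1 - p) \, \mathcal{N}\!\left(\hat{w}_{ij};\, v_{ij},\, \sigma^2 v_{ij}^2\right), \\
p(y \mid x) &\approx \frac{1}{T} \sum_{t=1}^{T} p\!\left(y \mid x, \widehat{W}^{(t)}\right).
\end{aligned}
```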

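Motivation and objectives

  • Improve how deep neural networks represent uncertainty so that they suit real-world deployments where reliable confidence estimates are essential.
  • Evaluate how different sampling strategies (dropout on units versus dropconnect on weights) affect uncertainty calibration in DNNs.
  • Investigate whether combining Bernoulli and Gaussian sampling mechanisms can improve both accuracy and uncertainty estimation.
  • Assess how robust the uncertainty representation is under distribution shift by adding noise to the test inputs.
  • Develop a practical Bayesian deep learning method that scales to complex visual tasks while remaining aware of its own uncertainty.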

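Proposed method

  • Use variational inference with Bernoulli dropout to sample network units during inference.
  • Apply Gaussian dropconnect to sample the weights during inference, so that each forward pass uses a different stochastic draw of the weights.
  • Combine Bernoulli dropout with Gaussian dropconnect to approximate a spike-and-slab variational distribution.
  • Train Bayesian fully connected and convolutional networks on MNIST and CIFAR-10 with stochastic gradient descent, with the sampling performed in the forward pass.
  • Estimate uncertainty by Monte Carlo sampling of the network outputs for test inputs at different noise levels.
  • Assess calibration by comparing the networks' predictive uncertainty with their actual prediction error on noise-corrupted test images.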
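
The sampling steps listed above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names `sample_forward` and `mc_predict`, the layer sizes, the drop probability `p_drop`, the noise scale `sigma`, and the untrained random weights are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_forward(x, weights, rng, p_drop=0.5, sigma=1.0):
    """One stochastic forward pass through a small fully connected net.

    Hypothetical helper (assumption, not from the paper's code):
    Gaussian dropconnect multiplies every weight by N(1, sigma^2) noise,
    and Bernoulli dropout zeroes hidden units with probability p_drop.
    """
    h = x
    for i, w in enumerate(weights):
        # Gaussian dropconnect: one noisy weight matrix per pass,
        # shared by all inputs in the batch.
        noisy_w = w * rng.normal(1.0, sigma, size=w.shape)
        h = h @ noisy_w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)               # ReLU on hidden layers
            keep = rng.random(h.shape) > p_drop  # Bernoulli dropout mask
            h = h * keep / (1.0 - p_drop)        # rescale surviving units
    return softmax(h)

def mc_predict(x, weights, rng, n_samples=50, **kwargs):
    """Average the softmax outputs of n_samples stochastic passes."""
    probs = np.stack(
        [sample_forward(x, weights, rng, **kwargs) for _ in range(n_samples)]
    )
    mean = probs.mean(axis=0)
    # Entropy of the averaged prediction as a simple uncertainty score.
    entropy = -(mean * np.log(mean + 1e-12)).sum(axis=-1)
    return mean, entropy

rng = np.random.default_rng(0)
# Untrained random weights for a 784-64-10 network (illustrative only).
weights = [rng.normal(0, 0.1, (784, 64)), rng.normal(0, 0.1, (64, 10))]
x = rng.random((5, 784))
mean_probs, entropy = mc_predict(x, weights, rng)
```

The mask and noise stay switched on at test time, which is what distinguishes this from the usual practice of disabling dropout for prediction. Averaging over many passes gives the predictive distribution, and its entropy (or the spread across samples) serves as the uncertainty estimate.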

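Experimental results

Research questions

  • RQ1: In DNNs, which represents uncertainty better: sampling weights (dropconnect) or sampling units (dropout)?
  • RQ2: Can combining Bernoulli and Gaussian sampling mechanisms improve both classification accuracy and uncertainty estimation?
  • RQ3: How well do Bayesian DNNs represent uncertainty on out-of-distribution inputs when Gaussian noise is added to the test set?
  • RQ4: Does a spike-and-slab-like sampling strategy give better calibration than any single sampling method?
  • RQ5: How do different architectures (fully connected and convolutional) perform under the proposed uncertainty representation framework?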

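Main findings

  • Sampling weights (dropconnect), whether with Gaussian or Bernoulli noise, represents uncertainty more accurately than sampling units (dropout).
  • Sampling units with either Bernoulli or Gaussian dropout increases the classification accuracy of convolutional networks.
  • Using Bernoulli dropout and Gaussian dropconnect together yields high classification accuracy, combining the advantages of the individual methods.
  • The spike-and-slab-like sampling strategy robustly represents uncertainty across all tested architectures and datasets.
  • Under distribution shift, Bayesian DNNs with combined sampling are better calibrated than standard DNNs with a softmax output.
  • After Gaussian noise is added to the test inputs, the proposed method maintains well-calibrated uncertainty estimates, whereas standard DNNs do not.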
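
To make the calibration findings above concrete, here is one common way to corrupt test images and score calibration: bin the predicted probabilities and compare each bin's mean prediction with the observed frequency of the class. This is a sketch, not necessarily the exact metric used in the paper; `add_gaussian_noise`, `reliability`, `calibration_mse`, the bin count, and the toy inputs are hypothetical names and values chosen for this example.

```python
import numpy as np

def add_gaussian_noise(images, noise_std, rng):
    """Return a noisy copy of `images`; no clipping is applied (assumption)."""
    return images + rng.normal(0.0, noise_std, size=images.shape)

def reliability(probs, labels, n_bins=10):
    """Compare predicted class probabilities with observed frequencies.

    probs: (N, C) predictive probabilities; labels: (N,) integer classes.
    All (example, class) pairs are pooled and binned by predicted
    probability. Returns mean prediction, observed frequency, and count
    for each non-empty bin.
    """
    onehot = np.eye(probs.shape[1])[labels]
    p = probs.ravel()
    y = onehot.ravel()
    # Bin index per prediction; a probability of exactly 1 goes in the last bin.
    bins = np.minimum((p * n_bins).astype(int), n_bins - 1)
    counts = np.bincount(bins, minlength=n_bins)
    sum_p = np.bincount(bins, weights=p, minlength=n_bins)
    sum_y = np.bincount(bins, weights=y, minlength=n_bins)
    nonempty = counts > 0
    mean_p = sum_p[nonempty] / counts[nonempty]
    freq = sum_y[nonempty] / counts[nonempty]
    return mean_p, freq, counts[nonempty]

def calibration_mse(probs, labels, n_bins=10):
    """Count-weighted squared gap between prediction and frequency."""
    mean_p, freq, counts = reliability(probs, labels, n_bins)
    return float(np.average((mean_p - freq) ** 2, weights=counts))

rng = np.random.default_rng(0)
images = rng.random((4, 28, 28))
noisy = add_gaussian_noise(images, noise_std=0.5, rng=rng)

labels = np.array([0, 1, 0, 1])
# Always predicts class 0 with full confidence, but is right half the time.
overconfident = np.tile([1.0, 0.0], (4, 1))
# Predicts 50/50, which matches the observed class frequencies.
hedged = np.full((4, 2), 0.5)
mse_overconfident = calibration_mse(overconfident, labels)
mse_hedged = calibration_mse(hedged, labels)
```

In this toy case the overconfident predictor gets a larger calibration error than the hedged one. Repeating the measurement at increasing `noise_std` is how one would check whether a model's confidence drops as its inputs move away from the training data.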


This review was generated by AI and reviewed by human editors.