QUICK REVIEW

[Paper Review] Uncertainty Estimations by Softplus normalization in Bayesian Convolutional Neural Networks with Variational Inference

Kumar Shridhar, Felix Laumann | arXiv (Cornell University) | Jun 15, 2018
Adversarial Robustness in Machine Learning | 36 references | 60 citations
TL;DR

This paper proposes Softplus normalization for estimating aleatoric and epistemic uncertainty in Bayesian CNNs trained with variational inference, and shows uncertainty estimates on MNIST, CIFAR-10, and CIFAR-100 across multiple architectures.

ABSTRACT

We introduce a novel uncertainty estimation for classification tasks for Bayesian convolutional neural networks with variational inference. By normalizing the output of a Softplus function in the final layer, we estimate aleatoric and epistemic uncertainty in a coherent manner. The intractable posterior probability distributions over weights are inferred by Bayes by Backprop. Firstly, we demonstrate how this reliable variational inference method can serve as a fundamental construct for various network architectures. On multiple datasets in supervised learning settings (MNIST, CIFAR-10, CIFAR-100), this variational inference method achieves performances equivalent to frequentist inference in identical architectures, while the two desiderata, a measure for uncertainty and regularization are incorporated naturally. Secondly, we examine how our proposed measure for aleatoric and epistemic uncertainties is derived and validate it on the aforementioned datasets.

Motivation & Objective

  • Motivate the need for uncertainty quantification in CNNs to express model confidence and regularize training.
  • Develop a Bayesian CNN framework using Bayes by Backprop with two convolutional operations to learn mean and variance of weights.
  • Introduce Softplus normalization to estimate aleatoric and epistemic uncertainty without Softmax inconsistencies.
  • Demonstrate that variational Bayesian CNNs achieve competitive accuracy with regularization benefits on standard datasets.
  • Provide an empirical analysis of uncertainty estimates across architectures and datasets.

Proposed method

  • Apply Bayes by Backprop to CNNs by approximating the weight posterior with Gaussian variational distributions.
  • Use two sequential convolutional operations to learn mean (μ) and variance (αμ^2) per filter.
  • Employ local reparameterization to sample activations rather than weights for efficiency.
  • Replace Softmax-based uncertainty estimation with Softplus normalization to compute predictive variance.
  • Decompose predictive variance into aleatoric and epistemic components using Monte Carlo sampling over q_θ(w|D).
  • Evaluate on LeNet-5, AlexNet, and VGG architectures trained on MNIST, CIFAR-10, and CIFAR-100.
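
The two-convolution step with local reparameterization can be illustrated with a small NumPy sketch. It assumes a single-channel input and a single filter, and the names conv2d_valid and bayes_conv2d are hypothetical helpers introduced here, not code from the paper. The first convolution produces the activation mean from the filter means μ; the second convolves the squared input with the filter variances αμ^2 to produce the activation variance, and noise is then sampled per activation instead of per weight.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_valid(x, kernel):
    """Single-channel 'valid' cross-correlation (the usual CNN convolution)."""
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw)
    windows = sliding_window_view(x, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def bayes_conv2d(x, mu, alpha, rng):
    """Sample one set of activations via local reparameterization.

    mu    : filter of weight means
    alpha : non-negative scale, so each weight has variance alpha * mu**2
    """
    act_mean = conv2d_valid(x, mu)                   # first convolution: mean
    act_var = conv2d_valid(x ** 2, alpha * mu ** 2)  # second convolution: variance
    eps = rng.standard_normal(act_mean.shape)        # noise drawn per activation
    return act_mean + np.sqrt(act_var) * eps


# Illustrative values only; shapes and alpha are assumptions for this demo.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
mu = rng.standard_normal((3, 3))
alpha = np.full((3, 3), 0.1)

samples = np.stack([bayes_conv2d(x, mu, alpha, rng) for _ in range(1000)])
print(samples.shape)
```

Sampling activations this way needs one noise draw per output position rather than one per weight, which is the efficiency argument usually given for local reparameterization. A real layer would add input and output channels, a bias, and a KL term in the loss, all omitted here.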

Experimental results

Research questions

  • RQ1: Can Bayes by Backprop-based variational CNNs provide competitive classification accuracy compared with frequentist CNNs across standard benchmarks?
  • RQ2: How can aleatoric and epistemic uncertainties be estimated coherently in CNNs without Softmax activations in the output layer?
  • RQ3: Does Softplus normalization yield robust, well-calibrated predictive uncertainty on image classification benchmarks?
  • RQ4: What is the relationship between model accuracy and epistemic uncertainty across different architectures and datasets?
  • RQ5: How do dataset characteristics (e.g., MNIST vs CIFAR) influence the estimated aleatoric uncertainty?

Key findings

  • Bayesian CNNs with variational inference achieve validation accuracies comparable to their frequentist counterparts across MNIST, CIFAR-10, and CIFAR-100.
  • Softplus normalization enables uncertainty estimation without introducing Softmax inconsistencies, yielding estimates of aleatoric and epistemic uncertainty.
  • Aleatoric uncertainty remains dataset-dependent and largely constant across models for a given dataset; epistemic uncertainty tends to decrease as validation accuracy improves.
  • This accuracy-uncertainty relationship holds across architectures: models with higher validation accuracy show lower epistemic uncertainty, as expected for uncertainty that stems from the model rather than from the data.
  • Softplus normalization produces stable uncertainty estimates under added Gaussian pixel noise, indicating robustness of aleatoric uncertainty to input perturbations.
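
The aleatoric and epistemic quantities compared in these findings come from splitting the predictive variance over Monte Carlo samples of the weights. The sketch below shows one common formulation, assuming T sampled output vectors for a single input: the aleatoric part averages diag(p_t) - p_t p_t^T and the epistemic part is the covariance of the p_t around their mean. The function names and the simulated logits are hypothetical, and the paper's exact estimator should be checked against the original.

```python
import numpy as np


def softplus_normalize(logits):
    """Softplus followed by division by the row sum, so each row sums to 1."""
    sp = np.logaddexp(0.0, logits)  # numerically stable log(1 + exp(z))
    return sp / sp.sum(axis=-1, keepdims=True)


def decompose_uncertainty(probs):
    """Split predictive variance given T sampled probability vectors.

    probs: array of shape (T, K), one normalized output per weight sample.
    Returns (aleatoric, epistemic), each a K x K matrix.
    """
    n_samples = len(probs)
    p_bar = probs.mean(axis=0)
    # aleatoric: mean over samples of diag(p_t) - p_t p_t^T
    aleatoric = np.diag(p_bar) - probs.T @ probs / n_samples
    # epistemic: mean over samples of (p_t - p_bar)(p_t - p_bar)^T
    centered = probs - p_bar
    epistemic = centered.T @ centered / n_samples
    return aleatoric, epistemic


# Simulated logits stand in for T = 50 stochastic forward passes (assumption).
rng = np.random.default_rng(0)
logits = rng.normal(loc=[2.0, 0.5, -1.0], scale=0.3, size=(50, 3))
probs = softplus_normalize(logits)

aleatoric, epistemic = decompose_uncertainty(probs)
print("aleatoric (per class):", np.diag(aleatoric))
print("epistemic (per class):", np.diag(epistemic))
```

The two matrices sum to diag(p̄) - p̄ p̄^T, the variance of a categorical distribution with the mean probabilities p̄, so the split accounts for all of the predictive variance. Reporting the diagonals is an illustrative choice; a scalar summary such as the trace would work equally well for comparing models.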

This review was created by AI and reviewed by human editors.