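[Paper Summary] Evidential Deep Learning to Quantify Classification Uncertainty
This paper proposes evidential deep learning (EDL), which models predictive uncertainty by placing a Dirichlet distribution over the class probabilities. This yields explicit uncertainty estimates and strong results on out-of-distribution detection and robustness to adversarial attacks. Accuracy stays competitive, and the calibrated uncertainty estimates compare favorably with several Bayesian and ensemble baselines.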
Deterministic neural nets have been shown to learn effective predictors on a wide range of machine learning problems. However, as the standard approach is to train the network to minimize a prediction loss, the resultant model remains ignorant to its prediction confidence. Orthogonally to Bayesian neural nets that indirectly infer prediction uncertainty through weight uncertainties, we propose explicit modeling of the same using the theory of subjective logic. By placing a Dirichlet distribution on the class probabilities, we treat predictions of a neural net as subjective opinions and learn the function that collects the evidence leading to these opinions by a deterministic neural net from data. The resultant predictor for a multi-class classification problem is another Dirichlet distribution whose parameters are set by the continuous output of a neural net. We provide a preliminary analysis on how the peculiarities of our new loss function drive improved uncertainty estimation. We observe that our method achieves unprecedented success on detection of out-of-distribution queries and endurance against adversarial perturbations.
Motivation and Objectives
- Motivate robust uncertainty estimation for classifiers beyond standard softmax probabilities.
- Introduce a Dirichlet-based evidential framework that represents each prediction as a distribution over class probabilities rather than a single softmax point estimate.
- Develop a learning loss that jointly fits data and controls uncertainty through evidential parameters.
- Regularize predictions to avoid overconfident wrong predictions via KL-divergence to an 'I do not know' state.
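Proposed Method
- Replace the softmax output layer with a non-negative activation (e.g. ReLU) whose output is an evidence vector that parametrizes a Dirichlet distribution.
- Define the Dirichlet parameters as alpha_i = f(x_i|Theta) + 1, where f(x_i|Theta) is the evidence vector that the network with parameters Theta outputs for sample x_i.
- Use a sum-of-squares (L2-type) loss: L_i(Theta) = sum_j [ (y_ij - E[p_ij])^2 + Var(p_ij) ], where E[p_ij] = alpha_ij / S_i, Var(p_ij) = E[p_ij] (1 - E[p_ij]) / (S_i + 1), and S_i = sum_j alpha_ij.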
- Incorporate a KL divergence term to push predictions toward the uniform Dirichlet when there is insufficient evidence (I do not know), with annealing over training epochs.
- Train with standard backpropagation using a LeNet-like architecture and Adam optimizer.
- Compare against L2, Dropout, Deep Ensemble, and variational Bayesian nets on MNIST and CIFAR-10 variants.
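The loss described in the list above can be sketched for a single sample in plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the digamma function is approximated by hand because the standard library does not provide one, and two details that the summary does not spell out are assumptions here: the KL term is applied to the Dirichlet parameters after removing the evidence of the true class (`alpha_tilde`), and the annealing coefficient is taken as `min(1, epoch / 10)`.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0 (recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward until the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_params(evidence):
    """alpha = evidence + 1; evidence is assumed non-negative (e.g. ReLU output)."""
    return [e + 1.0 for e in evidence]


def sum_of_squares_loss(y, alpha):
    """sum_j [(y_j - E[p_j])^2 + Var(p_j)] under Dir(alpha)."""
    S = sum(alpha)
    loss = 0.0
    for y_j, a_j in zip(y, alpha):
        p_j = a_j / S                          # expected class probability
        var_j = p_j * (1.0 - p_j) / (S + 1.0)  # Dirichlet marginal variance
        loss += (y_j - p_j) ** 2 + var_j
    return loss


def kl_to_uniform(alpha):
    """KL( Dir(alpha) || Dir(1, ..., 1) ), the 'I do not know' state."""
    K = len(alpha)
    S = sum(alpha)
    log_norm = math.lgamma(S) - math.lgamma(K) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * (digamma(a) - digamma(S)) for a in alpha)


def edl_loss(y, evidence, epoch, anneal_epochs=10):
    """Per-sample loss: data fit plus annealed KL regularizer.

    Assumptions (not stated in the summary): the KL term only sees
    evidence for wrong classes, and the weight ramps linearly to 1.
    """
    alpha = dirichlet_params(evidence)
    # Remove the true-class evidence so only misleading evidence is penalized.
    alpha_tilde = [y_j + (1.0 - y_j) * a_j for y_j, a_j in zip(y, alpha)]
    lam = min(1.0, epoch / anneal_epochs)
    return sum_of_squares_loss(y, alpha) + lam * kl_to_uniform(alpha_tilde)


if __name__ == "__main__":
    y = [1.0, 0.0, 0.0]  # one-hot label for a 3-class problem
    print(edl_loss(y, [10.0, 0.0, 0.0], epoch=10))  # evidence for the true class
    print(edl_loss(y, [0.0, 10.0, 0.0], epoch=10))  # evidence for a wrong class
```

Evidence for the true class lowers both terms, while evidence for a wrong class raises the squared error and is additionally penalized by the KL term. In an actual training setup the same expressions would be written in an autodiff framework so that gradients flow back to Theta.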
Experimental Results
Research Questions
- RQ1: Can Dirichlet-distributed predictions provide reliable epistemic uncertainty for neural classifiers?
- RQ2: How does evidential learning perform in terms of out-of-distribution detection and adversarial robustness compared to Bayesian and ensemble methods?
- RQ3: Does the proposed loss effectively balance data fit with uncertainty calibration?
Key Findings
| Method | MNIST test accuracy (%) | CIFAR-5 test accuracy (%) |
|---|---|---|
| L2 | 99.4 | 76 |
| Dropout | 99.5 | 84 |
| Deep Ensemble | 99.3 | 79 |
| FFGU | 99.1 | 78 |
| FFLU | 99.1 | 77 |
| MNFG | 99.3 | 84 |
| EDL | 99.3 | 83 |
- EDL reaches accuracy competitive with the baselines on MNIST and CIFAR-5 (the first five CIFAR-10 classes) while also providing explicit uncertainty estimates.
- On notMNIST inputs (out-of-distribution for an MNIST-trained model), EDL assigns higher uncertainty and lower confidence than the baselines, which makes such inputs easier to detect.
- Under adversarial perturbations, EDL maintains higher uncertainty for incorrect predictions and achieves a favorable accuracy-uncertainty trade-off versus alternatives.
- The Dirichlet-based approach produces predictive distributions that reflect input distribution shifts more accurately than standard softmax-based models.
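The uncertainty referred to in these findings can be read directly off the evidence vector. In the subjective-logic view used by the paper, each class k gets a belief mass b_k = e_k / S and the remaining mass u = K / S is the uncertainty, with S = sum_k (e_k + 1). The sketch below illustrates this; the function names and the 0.5 threshold are hypothetical choices for illustration, not values from the paper.

```python
def subjective_opinion(evidence):
    """Return (belief masses, uncertainty mass) for a non-negative evidence vector."""
    K = len(evidence)
    S = sum(evidence) + K              # Dirichlet strength, since alpha_k = e_k + 1
    belief = [e / S for e in evidence]
    uncertainty = K / S                # equals 1.0 when there is no evidence at all
    return belief, uncertainty


def looks_out_of_distribution(evidence, threshold=0.5):
    """Flag an input whose uncertainty mass exceeds a (hypothetical) threshold."""
    _, u = subjective_opinion(evidence)
    return u > threshold


if __name__ == "__main__":
    confident = [90.0] + [0.0] * 9     # strong evidence for one of 10 classes
    unsure = [0.2] * 10                # almost no evidence for any class
    print(subjective_opinion(confident)[1], looks_out_of_distribution(confident))
    print(subjective_opinion(unsure)[1], looks_out_of_distribution(unsure))
```

The belief masses and the uncertainty mass always sum to one, so an input that collects little evidence for any class automatically ends up with uncertainty close to one.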
This summary was generated by AI and reviewed by human editors.