QUICK REVIEW

[Paper Review] Adaptive, Distribution-Free Prediction Intervals for Deep Networks

Danijel Kivaranovic, Kory D. Johnson|arXiv (Cornell University)|May 25, 2019

Machine Learning and Data Classification | References: 28 | Cited by: 25

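One-Sentence Summary

This paper proposes a neural network framework that outputs three values, a point estimate plus the two endpoints of a prediction interval, trained with a modified quantile regression loss. It presents two methods with finite-sample coverage guarantees: one uses conformal inference to guarantee average coverage, and the other introduces a new probably approximately valid (PAV) guarantee that holds conditional on the observed data. Neither compromises predictive accuracy.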
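The two kinds of guarantee can be written out to make the difference concrete. The block below is a sketch in assumed notation, not the paper's exact statement: \widehat{C}_n is the interval built from n observations, D_n is the observed sample, and the tolerance \varepsilon and confidence level \delta are hypothetical symbols introduced here for illustration.

```latex
% Average (marginal) coverage: the probability is taken over the
% observed sample and the new test point together.
\mathbb{P}\bigl( Y_{n+1} \in \widehat{C}_n(X_{n+1}) \bigr) \;\ge\; 1 - \alpha

% Data-conditional, PAC-style coverage: with probability at least
% 1 - \delta over the observed sample D_n, the coverage achieved by
% the fitted interval is at least 1 - \alpha - \varepsilon.
\mathbb{P}\Bigl( \mathbb{P}\bigl( Y_{n+1} \in \widehat{C}_n(X_{n+1}) \,\big|\, D_n \bigr) \ge 1 - \alpha - \varepsilon \Bigr) \;\ge\; 1 - \delta
```

The first statement allows an unlucky sample to yield an interval that undercovers, as long as coverage is right on average. The second bounds how often that can happen.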

ABSTRACT

The machine learning literature contains several constructions for prediction intervals that are intuitively reasonable but ultimately ad-hoc in that they do not come with provable performance guarantees. We present methods from the statistics literature that can be used efficiently with neural networks under minimal assumptions with guaranteed performance. We propose a neural network that outputs three values instead of a single point estimate and optimizes a loss function motivated by the standard quantile regression loss. We provide two prediction interval methods with finite sample coverage guarantees solely under the assumption that the observations are independent and identically distributed. The first method leverages the conformal inference framework and provides average coverage. The second method provides a new, stronger guarantee by conditioning on the observed data. Lastly, our loss function does not compromise the predictive accuracy of the network like other prediction interval methods. We demonstrate the ease of use of our procedures as well as its improvements over other methods on both simulated and real data. As most deep networks can easily be modified by our method to output predictions with valid prediction intervals, its use should become standard practice, much like reporting standard errors along with mean estimates.

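Research Motivation and Objectives

Address the lack of uncertainty quantification in deep neural networks, in particular the lack of prediction intervals.
Overcome the limitations of existing methods, which lack provable coverage, rely on strong distributional assumptions, or reduce predictive accuracy.
Develop a neural network architecture that outputs both a point prediction and a valid prediction interval under minimal assumptions.
Provide finite-sample coverage guarantees under i.i.d. sampling, without parametric assumptions or complex optimization.
Show that the proposed method improves interval calibration and interval length over state-of-the-art methods while preserving predictive accuracy.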

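Proposed Method

Train a deep neural network to output three values, a point estimate and two quantiles (for example, 0.1 and 0.9), using a modified quantile regression loss.
Apply sample splitting: train the network on one part of the data and calibrate the prediction intervals on the other.
Use conformity scores tailored to neural networks within the conformal inference framework to obtain an average coverage guarantee.
Propose a new probably approximately valid (PAV) coverage criterion that conditions on the observed data and gives a stronger finite-sample validity guarantee.
Design the loss function so that the network produces valid intervals while retaining high predictive accuracy.
Use the same network architecture across several datasets (including tabular, image, and time-series data) to demonstrate generality.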
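The training loss and the calibration step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not give the exact form of the modified loss, so `three_output_loss` simply adds an absolute-error term to two standard pinball losses, and the conformity score in `conformal_margin` is one common choice for quantile-based intervals. All function names are hypothetical.

```python
import math


def pinball_loss(y, q_pred, tau):
    """Standard quantile (pinball) loss for one observation at level tau."""
    diff = y - q_pred
    return max(tau * diff, (tau - 1.0) * diff)


def three_output_loss(y, lower, point, upper, alpha=0.2):
    """Assumed combined loss: absolute error for the point estimate plus
    pinball losses for the alpha/2 and 1 - alpha/2 quantile outputs."""
    return (abs(y - point)
            + pinball_loss(y, lower, alpha / 2)
            + pinball_loss(y, upper, 1 - alpha / 2))


def conformal_margin(cal_y, cal_lower, cal_upper, alpha=0.2):
    """Split-conformal margin computed on held-out calibration data.

    The score measures how far y falls outside [lower, upper] and is
    negative when y lies inside. The margin is the
    ceil((n + 1) * (1 - alpha))-th smallest score.
    """
    scores = sorted(max(lo - y, y - hi)
                    for y, lo, hi in zip(cal_y, cal_lower, cal_upper))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def calibrated_interval(lower, upper, margin):
    """Widen (or shrink, if margin < 0) a raw interval by the margin."""
    return lower - margin, upper + margin
```

The network is trained on one split with the loss, its raw quantile outputs are scored on the other split, and every future interval is adjusted by the same margin. Because the margin comes from data the network never saw during training, the average coverage guarantee needs only the i.i.d. assumption.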

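Experimental Results

Research Questions

RQ1: Can a deep neural network be modified to output prediction intervals with provable finite-sample coverage guarantees under minimal assumptions?
RQ2: How can the network's predictive accuracy be preserved while it produces valid prediction intervals?
RQ3: Can conditioning on the observed data yield a coverage guarantee stronger than average coverage?
RQ4: How does the proposed method compare with existing approaches, such as Bayesian neural networks or quantile regression baselines, in interval length and coverage?
RQ5: How well does the method generalize across data types, including tabular, image, and time-series data?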
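For RQ3, one classical route to a data-conditional guarantee is the distribution-free tolerance-interval argument sketched below. It is swapped in here for illustration and is not necessarily the paper's PAV construction. It assumes continuous, i.i.d. calibration scores, and `data_conditional_rank` and `delta` are hypothetical names.

```python
import math


def data_conditional_rank(n, alpha, delta):
    """Smallest rank k such that thresholding at the k-th smallest of n
    calibration scores covers at least 1 - alpha of future scores with
    probability at least 1 - delta over the calibration sample.

    For continuous i.i.d. scores, the coverage of the k-th order
    statistic is Beta(k, n + 1 - k) distributed, so
    P(coverage >= 1 - alpha) = P(Binomial(n, 1 - alpha) <= k - 1).
    Returns None when no rank in 1..n satisfies the requirement.
    """
    p = 1 - alpha
    cdf = 0.0
    for k in range(1, n + 1):
        # Add P(Binomial(n, p) = k - 1) to the running CDF.
        cdf += math.comb(n, k - 1) * p ** (k - 1) * (1 - p) ** (n - k + 1)
        if cdf >= 1 - delta:
            return k
    return None


def average_coverage_rank(n, alpha):
    """Rank used by split conformal for average coverage, for comparison."""
    return math.ceil((n + 1) * (1 - alpha))
```

With n = 100, alpha = 0.1 and delta = 0.05, the data-conditional rank is 96 while the average-coverage rank is 91. The higher rank means a larger threshold and a wider interval, which is the price of a guarantee that holds for most calibration samples rather than only on average.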

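Key Findings

The proposed conf-nn and pav methods achieve near-exact average coverage at the nominal level 1−α, as guaranteed by theory, across all datasets and repetitions.
The pav method gives a stronger coverage guarantee by conditioning on the observed data, at the cost of slightly more conservative intervals.
The uncalibrated quantile regression baseline (qreg-un) fails to reach adequate coverage, confirming that calibration via sample splitting is necessary.
On the Bike Share and traffic datasets, the Bayesian method (bayes) produces intervals substantially longer than those of conf-nn and pav, in some cases two to three times as long.
conf-nn and pav achieve the shortest average interval length among all valid methods, outperforming overly conservative methods such as high-q and neg-ll.
The proposed loss function does not harm predictive accuracy: the mean absolute error (MAE) of conf-nn and pav is comparable to that of conf-fw, which minimizes MAE directly.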
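The findings above rest on three quantities: empirical coverage, average interval length, and MAE. A minimal sketch of how they could be computed on a test set follows. The function names are hypothetical and the paper's evaluation code is not reproduced here.

```python
def empirical_coverage(y, lower, upper):
    """Fraction of test targets that fall inside their interval."""
    hits = sum(lo <= yi <= hi for yi, lo, hi in zip(y, lower, upper))
    return hits / len(y)


def mean_interval_length(lower, upper):
    """Average width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


def mean_absolute_error(y, point):
    """Average absolute error of the point estimates."""
    return sum(abs(yi - pi) for yi, pi in zip(y, point)) / len(y)
```

A method counts as valid when its empirical coverage is at or near 1−α. Among valid methods, shorter intervals are better, and MAE checks that adding intervals has not degraded the point prediction.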

This review was generated by AI and reviewed by a human editor.