QUICK REVIEW

[Paper Review] Unifying the Dropout Family Through Structured Shrinkage Priors

Eric Nalisnick, Padhraic Smyth|arXiv (Cornell University)|Jan 1, 2018

Adversarial Robustness in Machine Learning | Cited by 1

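One-Sentence Summary

Using an exact reparametrization, the paper unifies dropout and other multiplicative-noise methods under structured shrinkage priors and shows that dropout's Monte Carlo training objective approximates marginal MAP estimation. It also proposes an "automatic depth determination" prior for ResNets and studies two inference strategies that improve on the MAP approximation in regression benchmarks.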

ABSTRACT

Dropout regularization of deep neural networks has been a mysterious yet effective tool to prevent overfitting. Explanations for its success range from the prevention of co-adapted weights to it being a form of cheap Bayesian inference. We propose a novel framework for understanding multiplicative noise in neural networks, considering continuous distributions as well as Bernoulli noise (i.e. dropout). We show that multiplicative noise induces structured shrinkage priors on a network's weights. We derive the equivalence through reparametrization properties of scale mixtures and without invoking any approximations. Given the equivalence, we then show that dropout's Monte Carlo training objective approximates marginal MAP estimation. We leverage these insights to propose a novel shrinkage framework for resnets, terming the prior 'automatic depth determination' as it is the natural analog of automatic relevance determination for network depth. Lastly, we investigate two inference strategies that improve upon the aforementioned MAP approximation in regression benchmarks.

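Motivation and Objectives

Provide a principled, exact theoretical framework relating multiplicative noise in neural networks (including dropout) to structured shrinkage priors.
Show, building on that approximation-free equivalence, that dropout's Monte Carlo training objective approximates marginal MAP estimation.
Design a new prior for ResNets that performs automatic depth determination, the analog of automatic relevance determination for network depth.
Investigate inference strategies that improve on the standard MAP approximation in regression tasks.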

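Proposed Method

Show, through reparametrization properties of scale mixtures, that multiplicative noise induces structured shrinkage priors on a network's weights.
Cover both continuous noise distributions and Bernoulli noise (dropout), and use the resulting equivalence to show that dropout's training objective approximates marginal MAP estimation.
Propose a structured prior for residual networks that encourages pruning of entire residual blocks, yielding automatic depth determination.
Introduce two inference strategies that improve on the standard MAP approximation in regression tasks.
Derive the noise-to-prior equivalence by exact reparametrization, without invoking any approximations.
Evaluate the framework on regression benchmarks, comparing the two inference strategies against the MAP approximation that dropout training corresponds to.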
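
The link between multiplicative noise and a scaled prior on the weights can be illustrated with a small numerical check. The sketch below is not the paper's code: the helper names (`noisy_forward`, `scaled_weight_forward`, `sample_bernoulli_noise`), the layer sizes, and the keep probability are all assumptions chosen for illustration. It shows that multiplying hidden unit i by a noise variable gives the same layer output as folding that variable into row i of the weight matrix, which is why a Gaussian prior on the weights becomes a row-structured scale mixture once the noise is absorbed.

```python
import random

def matvec_rows(h, W):
    """Compute h @ W for a length-n vector h and an n x m matrix W (list of rows)."""
    n_out = len(W[0])
    return [sum(h[i] * W[i][j] for i in range(len(h))) for j in range(n_out)]

def noisy_forward(h, W, lam):
    """Multiplicative-noise view: scale hidden unit h[i] by lam[i], then apply W."""
    return matvec_rows([h_i * l_i for h_i, l_i in zip(h, lam)], W)

def scaled_weight_forward(h, W, lam):
    """Scale-mixture view: fold lam[i] into row i of W, then apply it to the clean h."""
    W_scaled = [[l_i * w for w in row] for l_i, row in zip(lam, W)]
    return matvec_rows(h, W_scaled)

def sample_bernoulli_noise(n, keep_prob, rng):
    """Dropout-style noise: each unit is kept (1.0) with probability keep_prob."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(n)]

# Hypothetical toy layer: 4 hidden units feeding 3 outputs.
rng = random.Random(0)
n_in, n_out = 4, 3
h = [rng.gauss(0, 1) for _ in range(n_in)]
W = [[rng.gauss(0, 1) for _ in range(n_out)] for _ in range(n_in)]
lam = sample_bernoulli_noise(n_in, 0.5, rng)

out_noise = noisy_forward(h, W, lam)
out_prior = scaled_weight_forward(h, W, lam)
max_gap = max(abs(a - b) for a, b in zip(out_noise, out_prior))
print("largest difference between the two views:", max_gap)
```

The same check passes for continuous noise (for example, log-normal draws), up to floating-point rounding, since the identity does not depend on the noise distribution.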

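Experimental Results

Research Questions

RQ1: How can multiplicative noise in neural networks be formally linked to structured shrinkage priors through exact reparametrization?
RQ2: Within a Bayesian framework, how closely does dropout's training objective approximate marginal MAP estimation?
RQ3: Can a structured prior be designed to achieve automatic depth determination in residual networks?
RQ4: Do inference strategies that go beyond MAP estimation improve performance on regression benchmarks?
RQ5: What are the theoretical and empirical implications of replacing standard dropout with structured shrinkage priors?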
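
As a reading aid for the question about dropout's objective and marginal MAP estimation, one standard way to relate a noise-averaged training loss to a marginal objective is Jensen's inequality. The notation here is assumed for illustration and is not taken from the paper: $W$ denotes the weights, $\lambda$ the noise variables, $\mathcal{D}$ the data, and $S$ the number of noise samples.

```latex
\begin{align}
\log p(\mathcal{D} \mid W)
  &= \log \mathbb{E}_{p(\lambda)}\big[ p(\mathcal{D} \mid W, \lambda) \big] \\
  &\ge \mathbb{E}_{p(\lambda)}\big[ \log p(\mathcal{D} \mid W, \lambda) \big]
   \approx \frac{1}{S} \sum_{s=1}^{S} \log p(\mathcal{D} \mid W, \lambda_s),
  \qquad \lambda_s \sim p(\lambda).
\end{align}
```

Adding a log-prior term $\log p(W)$ to both sides turns the left-hand side into a marginal MAP objective, with the sampled-noise average on the right as a Monte Carlo estimate of a lower bound on it. Under this reading, the gap in the inequality is what an improved inference strategy would aim to reduce.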

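Key Findings

Through exact reparametrization of scale mixtures, multiplicative noise (including Bernoulli dropout) induces structured shrinkage priors on a network's weights.
Under the derived priors, dropout's Monte Carlo training objective approximates marginal MAP estimation; the noise-to-prior equivalence is exact, but the link to marginal MAP is an approximation.
The proposed structured prior achieves automatic depth determination in ResNets by encouraging entire residual blocks to be pruned.
The two new inference strategies improve on the standard MAP approximation in the regression benchmarks.
The framework unifies dropout and other multiplicative-noise methods under shrinkage priors in a theoretically rigorous way.
Because standard dropout training corresponds to that MAP approximation, the regression results also serve as a comparison against standard dropout.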
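
The idea of pruning whole residual blocks can be sketched with a toy residual network in which each block has one scalar scale. This is a simplified illustration under stated assumptions, not the paper's model: the summary describes the scale as part of a prior on a block's weights, whereas here it multiplies the branch output directly, and `block_fn`, `resnet_forward`, `effective_depth`, and all sizes are hypothetical names and values. The point is only that a block whose scale has shrunk to zero reduces to the identity, so the network behaves like a shallower one.

```python
import math
import random

def block_fn(h, W):
    """Hypothetical residual branch: tanh(h @ W) for a square weight matrix W."""
    n = len(h)
    return [math.tanh(sum(h[i] * W[i][j] for i in range(n))) for j in range(n)]

def resnet_forward(h, weights, scales):
    """Apply h <- h + scale * branch(h) for each block; a zero scale skips the block."""
    for W, s in zip(weights, scales):
        branch = block_fn(h, W)
        h = [h_i + s * b_i for h_i, b_i in zip(h, branch)]
    return h

def effective_depth(scales, tol=1e-3):
    """Count the blocks whose scale has not shrunk to (near) zero."""
    return sum(1 for s in scales if abs(s) > tol)

# Hypothetical toy network: 4 residual blocks on a 3-dimensional state.
rng = random.Random(1)
dim, depth = 3, 4
x = [rng.gauss(0, 1) for _ in range(dim)]
weights = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
           for _ in range(depth)]

# Assumed scales: blocks 2 and 4 have been shrunk to zero.
scales = [1.0, 0.0, 0.7, 0.0]
full = resnet_forward(x, weights, scales)

# Physically removing the zero-scale blocks gives the same output.
kept = [(W, s) for W, s in zip(weights, scales) if s != 0.0]
pruned = resnet_forward(x, [W for W, _ in kept], [s for _, s in kept])

print("effective depth:", effective_depth(scales))
print("max difference:", max(abs(a - b) for a, b in zip(full, pruned)))
```

In this reading, a shrinkage prior over the per-block scales plays the role that automatic relevance determination plays for individual inputs, applied at the level of depth.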

This review was generated by AI and reviewed by a human editor.