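[Paper Explained] Tighter risk certificates for neural networks
This paper empirically studies the training of probabilistic neural networks with objectives derived from tight PAC-Bayes risk bounds. It proposes two new objectives and compares them with a classical bound, obtaining non-vacuous, tight risk certificates and competitive test errors.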
This paper presents an empirical study regarding training probabilistic neural networks using training objectives derived from PAC-Bayes bounds. In the context of probabilistic neural networks, the output of training is a probability distribution over network weights. We present two training objectives, used here for the first time in connection with training neural networks. These two training objectives are derived from tight PAC-Bayes bounds. We also re-implement a previously used training objective based on a classical PAC-Bayes bound, to compare the properties of the predictors learned using the different training objectives. We compute risk certificates for the learnt predictors, based on part of the data used to learn the predictors. We further experiment with different types of priors on the weights (both data-free and data-dependent priors) and neural network architectures. Our experiments on MNIST and CIFAR-10 show that our training methods produce competitive test set errors and non-vacuous risk bounds with much tighter values than previous results in the literature, showing promise not only to guide the learning algorithm through bounding the risk but also for model selection. These observations suggest that the methods studied here might be good candidates for self-certified learning, in the sense of using the whole data set for learning a predictor and certifying its risk on any unseen data (from the same distribution as the training data) potentially without the need for holding out test data.
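Research motivation and objectives
- Study the training of probabilistic neural networks using PAC-Bayes bounds as training objectives.
- Introduce two new PAC-Bayes training objectives derived from tight bounds.
- Compare the new objectives with a classical PAC-Bayes objective to assess predictor quality and risk certificates.
- Show that tight, non-vacuous risk certificates can be computed on MNIST and CIFAR-10.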
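Proposed method
- Define the neural network as a probability distribution over weights and train it with stochastic gradient descent.
- Develop two new objectives: f_quad, derived from the PAC-Bayes-quadratic bound, and f_lambda, derived from the PAC-Bayes-λ bound.
- Re-implement the classical PAC-Bayes objective (f_classic) for comparison.
- Compute risk certificates for the learned predictors using part of the data that was used to learn them.
- Experiment with data-free and data-dependent priors and with several architectures.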
- Relate PAC-Bayes with Backprop (PBB) to Bayes-by-Backprop, and compare the training strategies.
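The three objectives named in the list above can be written down compactly. The following is a minimal sketch, assuming the standard forms of the classical, PAC-Bayes-quadratic, and PAC-Bayes-λ bounds with the complexity term KL + log(2√n/δ). The function names, argument names, and plain-float interface are illustrative assumptions, not the paper's code, and the exact constants should be checked against the paper.

```python
import math


def complexity(kl, n, delta):
    """Shared complexity term (KL + log(2*sqrt(n)/delta)) / n.

    kl    : KL divergence between posterior Q and prior Q0 (assumed given)
    n     : number of training examples
    delta : confidence parameter in (0, 1)
    """
    return (kl + math.log(2.0 * math.sqrt(n) / delta)) / n


def f_classic(emp_loss, kl, n, delta):
    """Classical objective: empirical loss plus sqrt(complexity / 2)."""
    return emp_loss + math.sqrt(complexity(kl, n, delta) / 2.0)


def f_quad(emp_loss, kl, n, delta):
    """Objective from the PAC-Bayes-quadratic bound (assumed standard form)."""
    half = complexity(kl, n, delta) / 2.0
    return (math.sqrt(emp_loss + half) + math.sqrt(half)) ** 2


def f_lambda(emp_loss, kl, n, delta, lam):
    """Objective from the PAC-Bayes-lambda bound; lam must lie in (0, 2)."""
    if not 0.0 < lam < 2.0:
        raise ValueError("lam must be in the open interval (0, 2)")
    scale = 1.0 - lam / 2.0
    return emp_loss / scale + complexity(kl, n, delta) / (lam * scale)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    args = dict(emp_loss=0.05, kl=3000.0, n=60000, delta=0.025)
    print("f_classic:", f_classic(**args))
    print("f_quad   :", f_quad(**args))
    print("f_lambda :", f_lambda(lam=0.5, **args))
```

In actual training, `emp_loss` would be a differentiable surrogate loss and `kl` the divergence between the learned weight distribution and the prior, both evaluated inside an automatic-differentiation framework so that the objective can be minimized with stochastic gradient descent.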
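Experimental results
Research questions
- RQ1: Can PAC-Bayes training objectives achieve competitive test errors while providing non-vacuous, tight risk certificates?
- RQ2: Do the proposed f_quad and f_lambda objectives yield tighter certificates than the classical PAC-Bayes objective?
- RQ3: How do data-dependent and data-free priors affect risk certificates and predictive performance?
- RQ4: Do these methods enable self-certified learning, that is, using all the data to learn a predictor and certifying its risk on unseen data?
- RQ5: How does the network architecture (fully connected versus convolutional) affect certificate tightness and accuracy?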
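Key findings
- The proposed PBB training objectives are competitive with existing methods in terms of test set error.
- The new objectives yield tighter risk certificates than the classical bound.
- The method achieves non-vacuous risk bounds on MNIST and CIFAR-10.
- The re-implemented classical objective also brings improvements, which suggests that the training strategy contributes to the gains.
- Data-dependent priors and the choice of architecture affect both certificate tightness and predictive performance.
- The results support self-certified learning: learning a predictor and certifying its risk on unseen data, potentially without holding out test data.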
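To make the notion of a risk certificate concrete, the sketch below assumes the certificate is obtained by numerically inverting the binary KL divergence, as in a PAC-Bayes-kl style bound. The helper names, the bisection tolerance, and the example numbers are hypothetical, and any correction for Monte Carlo estimation of the empirical error is omitted here.

```python
import math


def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p), clamped for stability."""
    eps = 1e-12
    q = min(max(q, eps), 1.0 - eps)
    p = min(max(p, eps), 1.0 - eps)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


def kl_inverse_upper(q_hat, c, tol=1e-9):
    """Upper end of the set {p >= q_hat : binary_kl(q_hat, p) <= c}, by bisection."""
    lo, hi = q_hat, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_kl(q_hat, mid) > c:
            hi = mid  # mid violates the constraint, so the answer is below it
        else:
            lo = mid  # mid satisfies the constraint, so the answer is at or above it
    return hi  # return the conservative end of the final bracket


def risk_certificate(emp_error, kl, n, delta):
    """Upper bound on the true error under the assumed PAC-Bayes-kl form."""
    c = (kl + math.log(2.0 * math.sqrt(n) / delta)) / n
    return kl_inverse_upper(emp_error, c)


if __name__ == "__main__":
    # Illustrative inputs only; they are not results from the paper.
    print(risk_certificate(emp_error=0.02, kl=3000.0, n=60000, delta=0.025))
```

A certificate is non-vacuous when the returned value is below 1 for the 0-1 loss, and it is tighter the closer it sits to the observed test error.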
This explanation was generated by AI and reviewed by a human editor.