QUICK REVIEW

[Paper Review] Training verified learners with learned verifiers

Krishnamurthy Dvijotham, Sven Gowal | arXiv (Cornell University) | May 25, 2018

Adversarial Robustness in Machine Learning | References: 22 | Cited by: 98

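One-Sentence Summary

This paper proposes predictor-verifier training (PVT), which jointly trains a predictor network and a verifier network to bound the worst-case violation of a specification, achieving state-of-the-art verified robustness on MNIST and SVHN and non-trivial bounds on CIFAR-10, with faster training than prior verified-training methods.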

ABSTRACT

This paper proposes a new algorithmic framework, predictor-verifier training, to train neural networks that are verifiable, i.e., networks that provably satisfy some desired input-output properties. The key idea is to simultaneously train two networks: a predictor network that performs the task at hand, e.g., predicting labels given inputs, and a verifier network that computes a bound on how well the predictor satisfies the properties being verified. Both networks can be trained simultaneously to optimize a weighted combination of the standard data-fitting loss and a term that bounds the maximum violation of the property. Experiments show that not only is the predictor-verifier architecture able to train networks to achieve state of the art verified robustness to adversarial examples with much shorter training times (outperforming previous algorithms on small datasets like MNIST and SVHN), but it can also be scaled to produce the first known (to the best of our knowledge) verifiably robust networks for CIFAR-10.

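Research Motivation and Objectives

Move beyond purely empirical defenses toward verifiable robustness guarantees for neural networks.
Propose a scalable framework that jointly trains a predictor and a verifier to certify specifications.
Use duality-based verification to bound the worst-case violation during training without per-example optimization.
Amortize the cost of verification across training examples by learning the dual variables.
Demonstrate scalability to larger datasets together with state-of-the-art verified robustness results.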

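Proposed Method

Define a predictor network that performs the task (e.g., classification).
Define a verifier network that outputs dual variables bounding the worst-case violation of the specification.
Train the two networks jointly with a loss that combines a data-fitting term and the dual bound term (Equation 8).
Use a dual relaxation of the verification problem to obtain an upper bound that is differentiable with respect to both the predictor and the verifier parameters.
Experiment with different verifier architectures (Constant, Direct, Backward-Forward) to study their effect on bound tightness and accuracy.
Amortize the cost of verification by replacing per-example optimization with the learned verifier.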
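
The steps above can be illustrated with a minimal sketch of a dual bound for a toy one-hidden-layer ReLU predictor. It is not the paper's Equation 8 or its architecture: the network shape, the function names (`dual_bound`, `toy_verifier`, `combined_loss`), the weight `kappa`, and the affine "verifier" are all assumptions made for illustration. What the sketch does show is the property the method relies on: by weak duality, any multipliers the verifier outputs give a valid upper bound on the worst-case objective, so a poorly trained verifier loosens the bound but never makes it unsound.

```python
def relu(t):
    return t if t > 0.0 else 0.0

def objective(W, b, v, d, x):
    """Specification value v . relu(W x + b) + d for a toy one-hidden-layer predictor."""
    z = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    return sum(vi * relu(zi) for vi, zi in zip(v, z)) + d

def preactivation_bounds(W, b, x0, eps):
    """Interval bounds on z = W x + b when every x_j lies in [x0_j - eps, x0_j + eps]."""
    lower, upper = [], []
    for row, bi in zip(W, b):
        center = sum(w * xj for w, xj in zip(row, x0)) + bi
        radius = eps * sum(abs(w) for w in row)
        lower.append(center - radius)
        upper.append(center + radius)
    return lower, upper

def dual_bound(W, b, v, d, x0, eps, mu, lam):
    """Upper bound on the maximum of `objective` over the L_infinity ball around x0.

    Lagrangian: v.a + d + mu.(W x + b - z) + lam.(relu(z) - a), maximized
    separately over x, a and z. The result is valid for ANY mu and lam.
    """
    lower, upper = preactivation_bounds(W, b, x0, eps)
    total = d + sum(m * bi for m, bi in zip(mu, b))
    # x term: the maximum of a linear function over a box.
    for j in range(len(x0)):
        coeff = sum(mu[i] * W[i][j] for i in range(len(W)))
        total += coeff * x0[j] + eps * abs(coeff)
    for i in range(len(W)):
        l, u = lower[i], upper[i]
        # a term: linear in a, so the maximum sits at an endpoint of [relu(l), relu(u)].
        total += max((v[i] - lam[i]) * relu(l), (v[i] - lam[i]) * relu(u))
        # z term: piecewise linear, so the maximum sits at an endpoint or at the kink 0.
        candidates = [l, u] + ([0.0] if l < 0.0 < u else [])
        total += max(lam[i] * relu(z) - mu[i] * z for z in candidates)
    return total

def toy_verifier(params, x0):
    """Hypothetical stand-in for a verifier network: one affine unit per multiplier."""
    s = sum(x0)
    mu = [w * s + c for w, c in params["mu"]]
    lam = [w * s + c for w, c in params["lam"]]
    return mu, lam

def combined_loss(data_loss, bound, kappa):
    """Assumed weighting of the data-fitting loss and the bound term."""
    return kappa * data_loss + (1.0 - kappa) * bound

# Usage with made-up numbers.
W, b, v, d = [[1.0, -2.0], [0.5, 1.5]], [0.1, -0.3], [1.0, -1.0], 0.2
x0, eps = [0.3, -0.2], 0.1
params = {"mu": [(0.5, 0.0), (-0.5, 0.1)], "lam": [(1.0, 0.5), (0.0, -1.0)]}
mu, lam = toy_verifier(params, x0)
bound = dual_bound(W, b, v, d, x0, eps, mu, lam)
print(bound, combined_loss(0.7, bound, kappa=0.5))
```

In an actual training loop the bound would be computed with an automatic-differentiation framework so that gradients flow to both the predictor weights and the verifier parameters; this sketch only evaluates the bound.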

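Experimental Results

Research Questions

RQ1: Can a neural verifier learn dual variables that tighten the verification bound during training?
RQ2: Can PVT produce scalable, verifiably robust models on datasets beyond MNIST/SVHN?
RQ3: How do different verifier architectures affect verified accuracy, nominal accuracy, and training efficiency?
RQ4: Can PVT produce non-trivial verified robustness bounds on CIFAR-10 and compare favorably with adversarial training?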

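Key Findings

PVT achieves state-of-the-art verified accuracy under L_infinity perturbations on MNIST and SVHN.
PVT scales to CIFAR-10 and reports the first non-trivial verified adversarial bounds for that dataset.
PVT trains much faster than previous verified-training methods (e.g., about 6 minutes to reach its MNIST performance, versus roughly 5 hours for the compared method).
The Direct and Backward-Forward verifier architectures yield competitive or better verified bounds across datasets; the Constant architecture performs worst.
PVT's verified robustness exceeds that of standard adversarial training, though it can trade off some nominal accuracy, which leaves room for improvement in clean accuracy.
Verification-time analysis shows that PVT models reach near-optimal bounds with a modest per-example verification time (e.g., a 15 ms budget).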
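
To make the distinction between nominal and verified accuracy in these findings concrete, here is a small sketch of how the two metrics could be computed from per-example results. The record layout, the function name `accuracy_metrics`, and the convention that a strictly negative upper bound on the worst-case violation counts as verified are assumptions for illustration; the numbers are invented and are not results from the paper.

```python
def accuracy_metrics(records):
    """Compute (nominal accuracy, verified accuracy) from per-example results.

    Each record is (nominal_correct, violation_upper_bound). An example is
    counted as verified when the upper bound on the worst-case violation is
    strictly negative (assumed convention), meaning no perturbation in the
    allowed set can violate the specification.
    """
    n = len(records)
    nominal = sum(1 for correct, _ in records if correct) / n
    verified = sum(1 for _, bound in records if bound < 0.0) / n
    return nominal, verified

# Hypothetical per-example results, for illustration only.
records = [(True, -0.5), (True, 0.3), (False, 1.2), (True, -0.1)]
nominal, verified = accuracy_metrics(records)
print(nominal, verified)
```

With sound bounds, verified accuracy can never exceed nominal accuracy, because the unperturbed input is itself inside the perturbation set; the gap between the two reflects both genuine non-robustness and looseness of the bound.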


This review was generated by AI and reviewed by human editors.