QUICK REVIEW

[Paper Review] Verifying Neural Networks with Mixed Integer Programming

Vincent Tjeng, Russ Tedrake|arXiv (Cornell University)|Nov 20, 2017

Adversarial Robustness in Machine Learning|References: 35|Cited by: 105

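One-Sentence Summary

This paper uses mixed integer programming (MIP) to verify the robustness of piecewise affine neural networks (for example, networks with ReLU and max-pooling layers), giving provable guarantees about adversarial examples. Experiments show that MIP-based verification is about an order of magnitude faster than the best prior approach, and that it can identify inputs robust to natural perturbations such as blur: some images are provably classified correctly under every blurred variant.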

ABSTRACT

Neural networks have demonstrated considerable success in a wide variety of real-world problems. However, the presence of adversarial examples - slightly perturbed inputs that are misclassified with high confidence - limits our ability to guarantee performance for these networks in safety-critical applications. We demonstrate that, for networks that are piecewise affine (for example, deep networks with ReLU and maxpool units), proving no adversarial example exists - or finding the closest example if one does exist - can be naturally formulated as solving a mixed integer program. Solves for a fully-connected MNIST classifier with three hidden layers can be completed an order of magnitude faster than those of the best existing approach. To address the concern that adversarial examples are irrelevant because pixel-wise attacks are unlikely to happen in natural images, we search for adversaries over a natural class of perturbations written as convolutions with an adversarial blurring kernel. When searching over blurred images, we find that as opposed to pixelwise attacks, some misclassifications are impossible. Even more interestingly, a small fraction of input images are provably robust to blurs: every blurred version of the input is classified with the same, correct label.

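Motivation and Objectives

Address the challenge of verifying the robustness of deep neural networks against adversarial examples in safety-critical applications.
Develop a method that can provably determine whether any adversarial example exists for a given network and input.
Extend verification beyond pixel-wise perturbations to more natural image distortions such as blur.
Identify inputs that are provably robust to an entire class of natural perturbations, not only to small pixel-wise changes.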

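Proposed Method

Formulate robustness verification for networks with ReLU and max-pooling units as a mixed integer program (MIP).
Use the MIP to prove that no adversarial example exists within a given perturbation radius, or to find the closest one if it does.
Apply the MIP formulation to a fully-connected MNIST classifier with three hidden layers, where solves complete substantially faster than with prior methods.
Extend the adversarial search from pixel-wise perturbations to convolutions with an adversarial blurring kernel.
Use the MIP to verify robustness over the whole space of blurred inputs, identifying inputs that stay correctly classified under every such perturbation.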
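A formulation of this kind rests on encoding each ReLU with one binary variable. The sketch below uses a standard big-M encoding, which is not necessarily the exact set of constraints in the paper. The names `relu_mip_feasible` and `relu_via_encoding`, and the pre-activation bounds `l` and `u`, are illustrative assumptions. For bounds l < 0 < u, the constraints y >= 0, y >= x, y <= u*a, y <= x - l*(1 - a) with a in {0, 1} are feasible exactly when y = max(x, 0).

```python
def relu_mip_feasible(x, y, a, l, u, tol=1e-9):
    """Check the big-M constraints that tie y to ReLU(x) through a binary a.

    l and u are assumed bounds on the pre-activation x, with l < 0 < u.
    """
    assert l < 0 < u and l <= x <= u and a in (0, 1)
    return (
        y >= -tol                        # y >= 0
        and y >= x - tol                 # y >= x
        and y <= u * a + tol             # a = 0 forces y <= 0
        and y <= x - l * (1 - a) + tol   # a = 1 forces y <= x
    )


def relu_via_encoding(x, l, u, candidates):
    """Return every candidate y that is feasible for some a in {0, 1}."""
    return sorted({
        y
        for y in candidates
        for a in (0, 1)
        if relu_mip_feasible(x, y, a, l, u)
    })
```

For a given x, only the candidate equal to max(x, 0) survives. A MIP solver exploits the same property: branching on `a` selects the active or inactive linear piece of the unit.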

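Experimental Results

Research Questions

RQ1: Can mixed integer programming be used effectively to verify the robustness of ReLU-based neural networks?
RQ2: How does MIP-based verification compare with existing verification methods in speed and scalability?
RQ3: Do adversarial examples remain a practical concern when the perturbations are natural ones, such as blur, rather than arbitrary pixel-wise changes?
RQ4: Can we identify inputs that are provably robust to every blurred version of themselves?
RQ5: What fraction of inputs in the dataset is robust to natural blur perturbations, and under what conditions?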

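Main Findings

For a fully-connected MNIST network with three hidden layers, MIP-based verification is an order of magnitude faster than the best existing approach.
When searching over blurred images, some misclassifications are proven impossible, owing to the combined effect of the network structure and the properties of the perturbation.
A small but nonzero fraction of MNIST inputs is provably robust to blur: every blurred version is classified with the same, correct label.
The method successfully identifies inputs that keep the correct label under every perturbation formed by convolution with an adversarial blurring kernel.
The results suggest that, compared with unconstrained pixel-wise attacks, adversarial examples under natural perturbations are rarer and more structured.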
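One plausible reason blur perturbations fit the same framework is that, for a fixed image, the blurred output is linear in the kernel entries, so the kernel weights can serve as continuous decision variables. The sketch below only illustrates that linearity in pure Python. The helper `blur`, the 3x3 kernel size, the zero padding, and the example kernels are assumptions for illustration and are not taken from the paper.

```python
def blur(image, kernel):
    """Correlate a 2-D image with a small odd-sized kernel, zero-padding the border.

    For symmetric kernels this equals convolution.
    """
    h, w = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        out[i][j] += kernel[di + k][dj + k] * image[ii][jj]
    return out


image = [[0.0, 1.0, 0.0],
         [1.0, 1.0, 1.0],
         [0.0, 1.0, 0.0]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # leaves the image unchanged
box = [[1 / 9] * 3 for _ in range(3)]          # uniform 3x3 blur

# Mixing two kernels mixes their outputs in the same proportion.
t = 0.25
mixed = [[(1 - t) * identity[r][c] + t * box[r][c] for c in range(3)]
         for r in range(3)]
lhs = blur(image, mixed)
rhs = [[(1 - t) * p + t * q for p, q in zip(row_p, row_q)]
       for row_p, row_q in zip(blur(image, identity), blur(image, box))]
```

Here `lhs` and `rhs` agree up to floating-point rounding, which is the property that lets a solver treat every blurred pixel as a linear expression in the kernel variables.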

This review was generated by AI and checked by a human editor.