QUICK REVIEW

[Paper Digest] Parseval Networks: Improving Robustness to Adversarial Examples

Moustapha Cissé, Piotr Bojanowski|arXiv (Cornell University)|Apr 28, 2017

Anomaly Detection Techniques and Applications | Cited by 179

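ONE-SENTENCE SUMMARY

Parseval networks bound the Lipschitz constant of each layer by keeping weight matrices (approximately) Parseval tight frames and by aggregating inputs with convex combinations, which improves robustness to adversarial perturbations while maintaining or improving accuracy and training speed.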

ABSTRACT

We introduce Parseval networks, a form of deep neural networks in which the Lipschitz constant of linear, convolutional and aggregation layers is constrained to be smaller than 1. Parseval networks are empirically and theoretically motivated by an analysis of the robustness of the predictions made by deep neural networks when their input is subject to an adversarial perturbation. The most important feature of Parseval networks is to maintain weight matrices of linear and convolutional layers to be (approximately) Parseval tight frames, which are extensions of orthogonal matrices to non-square matrices. We describe how these constraints can be maintained efficiently during SGD. We show that Parseval networks match the state-of-the-art in terms of accuracy on CIFAR-10/100 and Street View House Numbers (SVHN) while being more robust than their vanilla counterpart against adversarial examples. Incidentally, Parseval networks also tend to train faster and make a better usage of the full capacity of the networks.
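
To make the role of the layer-wise Lipschitz constraint concrete, the standard inequalities behind it can be written out as below. This is a sketch using notation introduced here for illustration (f_k for the k-th layer, \Lambda_k for its Lipschitz constant, K for the depth, \delta for the perturbation, \alpha_i for aggregation weights); it assumes a layer acts as x \mapsto Wx and that norms are Euclidean, and it is not a reproduction of the paper's own derivation.

```latex
% Composition: if each layer f_k is \Lambda_k-Lipschitz, the network
% f = f_K \circ \cdots \circ f_1 amplifies a perturbation \delta by at most
% the product of the layer constants.
\| f(x + \delta) - f(x) \|_2 \le \Big( \prod_{k=1}^{K} \Lambda_k \Big) \, \| \delta \|_2

% Linear layer: if W^\top W = I, then W preserves the Euclidean norm,
% so the layer x \mapsto W x is 1-Lipschitz.
\| W x \|_2^2 = x^\top W^\top W x = \| x \|_2^2

% Convex aggregation: with \alpha_i \ge 0 and \sum_i \alpha_i = 1,
% the combined output moves no more than the most-perturbed input.
\Big\| \sum_i \alpha_i (x_i - y_i) \Big\|_2
  \le \sum_i \alpha_i \, \| x_i - y_i \|_2
  \le \max_i \| x_i - y_i \|_2
```

If every \Lambda_k is at most 1, the product in the first line is also at most 1, which is why constraining each layer separately is enough to control the whole network.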

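MOTIVATION AND OBJECTIVES

Improve the robustness of deep networks to small input perturbations (adversarial examples).
Introduce a layer-wise regularization (Parseval regularization) that constrains the Lipschitz constant.
Develop an efficient training procedure compatible with SGD and common architectures (fully connected, convolutional, residual).
Show that Parseval networks retain competitive accuracy while improving adversarial robustness and training speed.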

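PROPOSED METHOD

Constrain the Lipschitz constant of every hidden layer to be <= 1 by keeping the weight matrices approximately Parseval tight frames.
For convolutional layers, constrain W to be a Parseval tight frame and rescale the output by (2k+1)^(-1/2), where 2k+1 is the kernel size.
Replace standard aggregation (summation) with a convex combination of the inputs, with coefficients alpha learned on the simplex, so that the Lipschitz bound is preserved.
Optimize the weight matrices on the Stiefel manifold using the practical regularizer R_beta(W) = (beta/2) ||W^T W - I||_2^2 together with an efficient projection step.
After each SGD update, apply a single retraction step (optionally on a sampled subset of rows) to keep the weights approximately orthogonal.
Project the aggregation coefficients onto the simplex to ensure Lambda_p <= 1 at every node.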
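
The retraction and simplex-projection steps listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the update W <- (1 + beta) W - beta W W^T W is the gradient step on the regularizer with constants dropped, the penalty uses the Frobenius norm, the simplex projection is a standard sort-based Euclidean projection, and the function names, matrix shape, and demo value beta = 0.5 are all hypothetical choices made so that convergence is visible in a few iterations.

```python
import numpy as np

def parseval_retraction(W, beta):
    """One retraction step: W <- (1 + beta) * W - beta * W @ W.T @ W."""
    return (1.0 + beta) * W - beta * (W @ W.T @ W)

def orthogonality_penalty(W, beta):
    """(beta / 2) * ||W^T W - I||^2, with the Frobenius norm (an assumption)."""
    gram = W.T @ W
    return 0.5 * beta * np.sum((gram - np.eye(gram.shape[0])) ** 2)

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1} (sort-based)."""
    u = np.sort(v)[::-1]                       # entries in descending order
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    # Largest index whose entry stays positive after the common shift.
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1)         # shift that makes the sum equal 1
    return np.maximum(v - tau, 0.0)

# Demo on a tall random matrix (hypothetical shape), scaled so that every
# singular value starts at or below 1.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))
W /= np.linalg.norm(W, 2)                      # divide by the spectral norm

penalty_before = orthogonality_penalty(W, beta=1.0)
for _ in range(50):                            # many steps, for illustration only
    W = parseval_retraction(W, beta=0.5)
penalty_after = orthogonality_penalty(W, beta=1.0)

# Aggregation weights pushed off the simplex by a gradient step, then projected back.
alpha = project_to_simplex(np.array([0.9, 0.6, -0.2]))
```

Each retraction step maps a singular value s to s * (1 + beta * (1 - s^2)), so values below 1 grow and values slightly above 1 shrink, with 1 as the fixed point. In training, the summary describes a single such step after each SGD update rather than the 50-step loop used here.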

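EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can constraining the per-layer Lipschitz constant through Parseval regularization improve robustness to adversarial perturbations without sacrificing accuracy?
RQ2: How can the Parseval constraint be enforced efficiently under SGD in fully connected, convolutional, and residual architectures?
RQ3: How does combining Parseval regularization with adversarial training affect robustness on standard image datasets?

KEY FINDINGS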

Classification accuracy (%) on clean inputs and under adversarial perturbation. Column values are signal-to-noise ratios, so the perturbation gets stronger from left to right. Rows marked (adv. training) are models trained with adversarial examples.

Model	Clean	ε≈50	ε≈45	ε≈40	ε≈33
CIFAR-10 Vanilla	95.63	90.16	85.97	76.62	67.21
CIFAR-10 Parseval(OC)	95.82	91.85	88.56	78.79	61.38
CIFAR-10 Parseval	96.28	93.03	90.40	81.76	69.10
CIFAR-10 Vanilla (adv. training)	95.49	91.17	88.90	86.75	84.87
CIFAR-10 Parseval(OC) (adv. training)	95.59	92.31	90.00	87.02	85.23
CIFAR-10 Parseval (adv. training)	96.08	92.51	90.05	86.89	84.53
CIFAR-100 Vanilla	79.70	65.76	57.27	44.62	34.49
CIFAR-100 Parseval(OC)	81.07	70.33	63.78	49.97	32.99
CIFAR-100 Parseval	80.72	72.43	66.41	55.41	41.19
CIFAR-100 Vanilla (adv. training)	79.23	67.06	62.53	56.71	51.78
CIFAR-100 Parseval(OC) (adv. training)	80.34	69.27	62.93	53.21	52.60
CIFAR-100 Parseval (adv. training)	80.19	73.41	67.16	58.86	39.56
SVHN Vanilla	98.38	97.04	95.18	92.71	88.11
SVHN Parseval(OC)	97.91	97.55	96.35	93.73	89.09
SVHN Parseval	98.13	97.86	96.19	93.55	88.47

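Parseval training concentrates the singular values of the weight matrices near 1, indicating approximate orthogonality.
Parseval networks retain clean accuracy competitive with their vanilla counterparts on CIFAR-10/100 and SVHN.
Parseval networks substantially improve robustness to adversarial examples, often outperforming vanilla models (for example, 81.76% versus 76.62% on CIFAR-10 at ε≈40), and in several settings they match or exceed adversarial training.
Combining Parseval regularization with adversarial training yields the most robust models, especially at higher noise levels.
Parseval networks tend to train faster than vanilla models and make fuller use of the network's capacity.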
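
The first finding above, singular values concentrated near 1, is something one could check on any weight matrix with a small diagnostic such as the sketch below. The helper name `singular_value_summary`, the matrix shapes, and the two synthetic matrices are assumptions made for illustration; they are not taken from the paper's experiments.

```python
import numpy as np

def singular_value_summary(W):
    """Summarize how far the singular values of W are from 1."""
    s = np.linalg.svd(W, compute_uv=False)     # singular values only
    return {
        "min": float(s.min()),
        "max": float(s.max()),
        "mean_abs_dev_from_1": float(np.mean(np.abs(s - 1.0))),
    }

rng = np.random.default_rng(0)

# Reference case: a matrix with orthonormal columns (all singular values are 1).
Q, _ = np.linalg.qr(rng.normal(size=(64, 32)))
orthonormal_stats = singular_value_summary(Q)

# Contrast case: an unconstrained random matrix with a spread-out spectrum.
random_W = rng.normal(size=(64, 32)) / np.sqrt(64)
random_stats = singular_value_summary(random_W)
```

A layer trained with the Parseval constraint would be expected to look closer to the first case than the second, with a small mean deviation from 1.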


This digest was generated by AI and reviewed by human editors.