QUICK REVIEW

[Paper Digest] Adversarially Robust Generalization Just Requires More Unlabeled Data

Runtian Zhai, Tianle Cai | arXiv (Cornell University) | Jun 3, 2019

Adversarial Robustness in Machine Learning | References: 48 | Cited by: 90

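One-Sentence Summary

The paper shows that additional unlabeled data can improve adversarially robust generalization, and proposes a semi-supervised adversarial training method that uses unlabeled data to improve robustness on MNIST and CIFAR-10.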

ABSTRACT

Neural network robustness has recently been highlighted by the existence of adversarial examples. Many previous works show that the learned networks do not perform well on perturbed test data, and significantly more labeled data is required to achieve adversarially robust generalization. In this paper, we theoretically and empirically show that with just more unlabeled data, we can learn a model with better adversarially robust generalization. The key insight of our results is based on a risk decomposition theorem, in which the expected robust risk is separated into two parts: the stability part which measures the prediction stability in the presence of perturbations, and the accuracy part which evaluates the standard classification accuracy. As the stability part does not depend on any label information, we can optimize this part using unlabeled data. We further prove that for a specific Gaussian mixture problem, adversarially robust generalization can be almost as easy as the standard generalization in supervised learning if a sufficiently large amount of unlabeled data is provided. Inspired by the theoretical findings, we further show that a practical adversarial training algorithm that leverages unlabeled data can improve adversarial robust generalization on MNIST and Cifar-10.

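Motivation and Objectives

Motivate the study of adversarial robustness and of the role unlabeled data plays in improving generalization.
Decompose the robust risk into a stability term (which needs no labels and can be optimized with unlabeled data) and an accuracy term (which requires labels).
Provide theoretical results, including a general risk bound and a Gaussian mixture example, showing that with enough unlabeled data adversarially robust generalization can be almost as easy as standard generalization.
Develop and validate a practical adversarial training algorithm based on semi-supervised learning (SSL) that uses both labeled and unlabeled data to improve robust generalization.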
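
The split into a stability term and an accuracy term can be motivated by a short pointwise argument. The derivation below is a sketch in notation assumed here (B(x) for the perturbation set, D for the data distribution, P_X for its marginal over inputs); it may not match the paper's exact statement. If f(x') is wrong but f(x) is right, then f(x') must differ from f(x), which gives the first line; taking the supremum over x' and then the expectation gives the second.

```latex
% Notation (assumed here): B(x) is the perturbation set around x,
% D the data distribution over (x, y), P_X its marginal over x.
\begin{align*}
\mathbb{1}[f(x') \ne y]
  &\le \mathbb{1}[f(x) \ne y] + \mathbb{1}[f(x') \ne f(x)]
  && \text{for every } x' \in B(x), \\
\underbrace{\mathbb{E}_{(x,y)\sim D} \sup_{x' \in B(x)} \mathbb{1}[f(x') \ne y]}_{\text{robust risk}}
  &\le \underbrace{\mathbb{E}_{(x,y)\sim D}\, \mathbb{1}[f(x) \ne y]}_{\text{accuracy term (needs labels)}}
   + \underbrace{\mathbb{E}_{x\sim P_X} \sup_{x' \in B(x)} \mathbb{1}[f(x') \ne f(x)]}_{\text{stability term (label-free)}}.
\end{align*}
```

Only the accuracy term contains y, which is why the stability term can be estimated and minimized from unlabeled inputs alone.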

Proposed Method

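States the risk decomposition: with probability at least $1 - \delta$ over the draw of the $n$ labeled samples $S$, $R^{\mathrm{robust}}(f) \le \mathbb{E}_{x} \sup_{x' \in B(x)} \mathbb{1}[f(x') \ne f(x)] + \hat{R}(f) + \mathrm{Rad}_S(\mathcal{F}) + 3\sqrt{\log(2/\delta)/(2n)}$, where $\hat{R}(f)$ is the empirical standard risk on $S$ and $\mathrm{Rad}_S(\mathcal{F})$ is the Rademacher complexity of the hypothesis class.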
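Shows that the first term involves the data only through the marginal distribution P_X and uses no labels, so it can be minimized using unlabeled data.
Proves that, for a Gaussian mixture model, with sufficient unlabeled data the sample complexity of robust generalization nearly matches that of standard generalization.
Proposes Algorithm 1 (generalized virtual adversarial training on labeled and unlabeled data), which adds an unlabeled consistency/robustness term to robust training on labeled data.
Defines L1 as the robust training loss on labeled data and L2 as the robustness objective on unlabeled data with pseudo-labels, combined as L_SSL = L1 + λ L2.
Gives a practical SSL objective that extends VAT, using multi-step PGD (k ≥ 7) for stronger perturbations to improve robust generalization.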
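
The combined objective L_SSL = L1 + λ L2 can be sketched on a toy model. The code below is a minimal illustration, not the paper's Algorithm 1: it assumes a linear logistic classifier, an l_inf perturbation ball, and pseudo-labels taken from the model's own clean prediction. All function names (`score`, `logistic_loss`, `pgd_linf`, `ssl_loss`) and parameter values are hypothetical. For a linear model the PGD direction is the same at every step, so the loop only shows the step-and-project structure a network would need.

```python
import math

def score(w, b, x):
    """Linear score w.x + b; the predicted label is its sign."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic_loss(w, b, x, y):
    """log(1 + exp(-y * score)) for y in {-1, +1}, computed stably."""
    m = y * score(w, b, x)
    return max(-m, 0.0) + math.log1p(math.exp(-abs(m)))

def pgd_linf(w, b, x, y, eps, step, k):
    """k-step l_inf PGD that increases the loss of label y around x."""
    # For a linear model the input gradient of the loss is a positive
    # multiple of -y * w, so its sign is the same at every step.
    direction = [(-y * wi > 0) - (-y * wi < 0) for wi in w]
    x_adv = list(x)
    for _ in range(k):
        x_adv = [xa + step * d for xa, d in zip(x_adv, direction)]
        # Project back onto the l_inf ball of radius eps around x.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

def ssl_loss(w, b, labeled, unlabeled, lam, eps, step, k):
    """Return (L_SSL, L1, L2) with L_SSL = L1 + lam * L2."""
    # L1: adversarial loss on labeled pairs (x, y).
    l1 = sum(logistic_loss(w, b, pgd_linf(w, b, x, y, eps, step, k), y)
             for x, y in labeled) / len(labeled)
    # L2: same adversarial loss on unlabeled x, using the model's own
    # clean prediction as a pseudo-label (an assumption of this sketch).
    l2 = 0.0
    for x in unlabeled:
        y_hat = 1 if score(w, b, x) >= 0 else -1
        l2 += logistic_loss(w, b, pgd_linf(w, b, x, y_hat, eps, step, k), y_hat)
    l2 /= len(unlabeled)
    return l1 + lam * l2, l1, l2

# Toy data and weights, chosen only for illustration.
w, b = [1.0, -2.0], 0.0
labeled = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
unlabeled = [[2.0, 0.5], [-1.0, 0.5], [0.3, 0.1]]
total, l1, l2 = ssl_loss(w, b, labeled, unlabeled, lam=0.5, eps=0.1, step=0.05, k=7)
print(total, l1, l2)
```

In this sketch λ (`lam`) controls how much the label-free term contributes. Minimizing `total` over `w` and `b` would be the training step, which is omitted here.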

Experimental Results

Research Questions

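RQ1: Can unlabeled data reduce the amount of labeled data needed for adversarially robust generalization?
RQ2: How should the robustness objective be formulated to make effective use of unlabeled data?
RQ3: With sufficient unlabeled data, does the Gaussian mixture model show the same robustness behavior as with labeled data?
RQ4: Can a practical algorithm that uses labeled and unlabeled data outperform standard adversarial training on real-world datasets?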

Key Findings

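A two-term bound shows that the robust risk is bounded by a stability term (which can be estimated from unlabeled data) plus a standard PAC-style term.
In the Gaussian mixture setting, sufficient unlabeled data makes robust generalization almost as easy as standard generalization in terms of sample complexity.
A practical SSL algorithm improves robust test accuracy on MNIST and CIFAR-10 over baselines that use only labeled data.
Increasing the number of PGD steps (k) in the attack can improve robust generalization when unlabeled data is used.
The method achieves a higher defense success rate, and its robust performance is competitive with baselines trained on larger labeled datasets.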
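
To make the label-free stability term concrete, the sketch below estimates it from unlabeled samples of a two-component Gaussian mixture. Everything specific here is an assumption for illustration: the linear classifier sign(w.x), the l_inf perturbation ball, the mixture parameters `mu` and `sigma`, and the helper names. For a linear classifier the supremum over the ball has a closed form, because the score w.x can move by at most eps times the l1 norm of w.

```python
import random

def is_unstable(w, x, eps):
    """True if some x' with ||x' - x||_inf <= eps changes sign(w.x).

    Uses the convention that a score of exactly 0 is labeled +1.
    """
    m = sum(wi * xi for wi, xi in zip(w, x))
    reach = eps * sum(abs(wi) for wi in w)  # largest change in w.x inside the ball
    return (m - reach < 0) if m >= 0 else (m + reach >= 0)

def stability_risk(w, xs, eps):
    """Fraction of unlabeled points whose prediction can be flipped."""
    return sum(is_unstable(w, x, eps) for x in xs) / len(xs)

def sample_unlabeled(n, mu, sigma, seed=0):
    """Draw x = y * mu + sigma * noise with y uniform in {-1, +1}, then drop y."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        xs.append([y * m + rng.gauss(0.0, sigma) for m in mu])
    return xs

# Hypothetical mixture parameters; the classifier direction is set to mu.
mu = [1.0, 1.0]
xs = sample_unlabeled(2000, mu, sigma=1.0)
for eps in (0.0, 0.1, 0.3):
    print(eps, stability_risk(mu, xs, eps))
```

No label is read anywhere in `stability_risk`, so drawing more unlabeled points tightens the estimate of this term without any extra labeling effort.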


This digest was generated by AI and reviewed by a human editor.