QUICK REVIEW

[Paper Digest] Robustness to Adversarial Perturbations in Learning from Incomplete Data

Amir Najafi, Shin-ichi Maeda | arXiv (Cornell University) | May 24, 2019

Adversarial Robustness in Machine Learning | Cited by 33

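One-sentence summary

A semi-supervised distributionally robust learning framework (SSDRL) that unifies SSL and DRL, comes with an SGD-based algorithm and generalization guarantees, and is evaluated on MNIST, SVHN, and CIFAR-10.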

ABSTRACT

What is the role of unlabeled data in an inference problem, when the presumed underlying distribution is adversarially perturbed? To provide a concrete answer to this question, this paper unifies two major learning frameworks: Semi-Supervised Learning (SSL) and Distributionally Robust Learning (DRL). We develop a generalization theory for our framework based on a number of novel complexity measures, such as an adversarial extension of Rademacher complexity and its semi-supervised analogue. Moreover, our analysis is able to quantify the role of unlabeled data in the generalization under a more general condition compared to the existing theoretical works in SSL. Based on our framework, we also present a hybrid of DRL and EM algorithms that has a guaranteed convergence rate. When implemented with deep neural networks, our method shows a comparable performance to those of the state-of-the-art on a number of real-world benchmark datasets.

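Motivation and objectives

Clarify how unlabeled data can help learning under adversarial distribution shift.
Develop a framework that combines semi-supervised learning with distributionally robust learning over Wasserstein ambiguity sets.
Provide theoretical guarantees, including a novel adversarial Rademacher complexity and a generalization bound.
Propose an optimization algorithm for SSDRL with convergence guarantees.
Demonstrate competitive empirical performance with deep networks on real-world datasets.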

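Proposed method

Extends DRL to partially labeled data by defining a set of distributions consistent with both the labeled and the unlabeled data.
Adopts a self-learning scheme in which unlabeled data receive soft labels rather than hard labels, to avoid overfitting; a parameterized loss term lets the learner be optimistic or pessimistic about the unknown labels.
Defines the SSDRL objective as an infimum over distributions S within a Wasserstein ball, plus an entropy regularization term for the unlabeled data (Eq. 5).
Shows that the inner optimization has a closed-form solution via a soft-minimum operator (Definition 3 and Eqs. 6–8).
Presents a stochastic gradient descent algorithm (Algorithm 1) with a convergence guarantee: it converges to a local minimum of the SSDRL objective (Theorem 2).
Derives a generalization bound using a novel semi-supervised Monge (SSM) Rademacher complexity, which handles adversarial perturbations with partial labels (Section 2.3).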
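
To make the soft-minimum idea more concrete, the following is a minimal sketch of one common soft-minimum over per-label losses, together with the soft labels it implies. This digest does not reproduce Definition 3, so the exact formula, the sign convention of `lam`, and the function names `soft_min` and `soft_labels` are assumptions for illustration and may differ from the paper.

```python
import math

def soft_min(losses, lam):
    """Soft-minimum of per-label losses with sharpness parameter lam.

    Assumed form (not taken from the paper):
        -(1/lam) * log(mean(exp(-lam * loss)))
    In this convention lam -> +inf approaches min(losses) (optimistic),
    lam -> -inf approaches max(losses) (pessimistic), and lam = 0 is
    treated as the plain average.
    """
    if lam == 0:
        return sum(losses) / len(losses)
    exps = [-lam * l for l in losses]
    m = max(exps)  # shift by the largest exponent for numerical stability
    log_mean = m + math.log(sum(math.exp(e - m) for e in exps) / len(losses))
    return -log_mean / lam

def soft_labels(losses, lam):
    """Softmax weights over labels; these are the gradient of soft_min
    with respect to each per-label loss, and they sum to 1."""
    exps = [-lam * l for l in losses]
    m = max(exps)
    w = [math.exp(e - m) for e in exps]
    total = sum(w)
    return [x / total for x in w]

# Hypothetical per-label losses for a single unlabeled sample.
per_label_losses = [0.2, 1.5, 3.0]
optimistic = soft_min(per_label_losses, 1000.0)    # close to the smallest loss
pessimistic = soft_min(per_label_losses, -1000.0)  # close to the largest loss
weights = soft_labels(per_label_losses, 2.0)       # most weight on label 0
```

With a moderate `lam`, the weights spread over several labels instead of committing to one, which is the sense in which soft labels are less prone to overfitting than hard pseudo-labels.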

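Experimental results

Research questions

RQ1: In semi-supervised learning, how can unlabeled data be used to improve robustness against adversarial distribution shifts?
RQ2: What theoretical generalization guarantees hold when SSL is combined with distributionally robust learning under Wasserstein perturbations?
RQ3: Can a practical optimization algorithm with convergence guarantees be developed for SSDRL with deep neural networks?
RQ4: How does the proposed SSAR-based objective relate to existing DRL and SSL methods, and what is the effect of optimism versus pessimism in label assignment?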

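Key findings

SSDRL integrates SSL with DRL, providing robustness to distributional perturbations in the semi-supervised setting.
The framework introduces a dual optimization that combines soft labels for unlabeled data with a Wasserstein-based adversarial risk objective.
Under the stated conditions, the proposed SGD algorithm has a guaranteed convergence rate of O(T^{-1/2}).
A novel SSM Rademacher complexity provides generalization guarantees for adversarial semi-supervised learning.
Empirical results on MNIST, SVHN, and CIFAR-10 show that SSDRL matches or outperforms pseudo-labeling and supervised DRL, and is competitive with VAT on some datasets.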

This digest was generated by AI and reviewed by human editors.