QUICK REVIEW

[Paper Review] Learning perturbation sets for robust machine learning

Eric Wong, J. Zico Kolter|arXiv (Cornell University)|Jul 16, 2020

Adversarial Robustness in Machine Learning | References: 60 | Cited by: 40

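One-sentence summary

The paper proposes learning perturbation sets from data with a conditional variational autoencoder (CVAE) to capture real-world perturbations, proves that the CVAE formulation satisfies two quality properties (necessary subset and sufficient likelihood), and uses the learned sets to train models that are empirically and certifiably robust to adversarial image corruptions and lighting variations.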

ABSTRACT

Although much progress has been made towards robust deep learning, a significant gap in robustness remains between real-world perturbations and more narrowly defined sets typically studied in adversarial defenses. In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation. Specifically, we use a conditional generator that defines the perturbation set over a constrained region of the latent space. We formulate desirable properties that measure the quality of a learned perturbation set, and theoretically prove that a conditional variational autoencoder naturally satisfies these criteria. Using this framework, our approach can generate a variety of perturbations at different complexities and scales, ranging from baseline spatial transformations, through common image corruptions, to lighting variations. We measure the quality of our learned perturbation sets both quantitatively and qualitatively, finding that our models are capable of producing a diverse set of meaningful perturbations beyond the limited data seen during training. Finally, we leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations, while improving generalization on non-adversarial data. All code and configuration files for reproducing the experiments as well as pretrained model weights can be found at https://github.com/locuslab/perturbation_learning.

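Motivation and goals

Advance robustness to real-world perturbations, beyond the traditional mathematically defined threat models.
Define desirable deterministic and probabilistic properties for a learned perturbation set.
Prove that CVAE-based perturbation sets satisfy these properties in theory.
Demonstrate learned perturbation sets on MNIST, CIFAR-10, and Multi-Illumination tasks for robust training and evaluation.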

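Proposed method

Model perturbations as latent-space transformations through a generator g(z, x), where z is constrained to a norm ball.
Propose the necessary subset and sufficient likelihood properties for evaluating a perturbation set.
Use a conditional variational autoencoder (CVAE) to learn perturbations in a latent space constrained by the prior.
Give theoretical results showing that, under assumptions on training, the CVAE satisfies both key properties.
Evaluate the perturbation sets on the MNIST-RTS, CIFAR10-C, and Multi-Illumination datasets in combination with downstream robustness techniques.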
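
The latent-ball construction above can be illustrated with a small sketch. It is not the authors' implementation: `toy_generator` is a hypothetical linear stand-in for a trained CVAE decoder g(z, x), the matrix `W` and all function names are assumptions made for illustration, and sampling by projecting a Gaussian draw onto the ball is only one simple way to obtain an element of the set.

```python
import numpy as np

def project_l2_ball(z, eps):
    """Project a latent vector z onto the L2 ball of radius eps."""
    norm = np.linalg.norm(z)
    return z if norm <= eps else z * (eps / norm)

def toy_generator(z, x, W):
    """Hypothetical stand-in for a CVAE decoder g(z, x): x plus a linear map of z."""
    return x + W @ z

def sample_perturbation(x, W, eps, rng):
    """Return one element of S(x) = {g(z, x) : ||z||_2 <= eps} and its latent code."""
    z = project_l2_ball(rng.standard_normal(W.shape[1]), eps)
    return toy_generator(z, x, W), z

def latent_pgd(direction, W, eps, steps=50, lr=0.5):
    """Maximize <direction, g(z, x)> over the latent ball by projected gradient ascent.

    For the linear toy generator the gradient with respect to z is the
    constant vector W^T direction; a real decoder would need autodiff.
    """
    z = np.zeros(W.shape[1])
    grad = W.T @ direction
    for _ in range(steps):
        z = project_l2_ball(z + lr * grad, eps)
    return z
```

With a trained decoder in place of `toy_generator`, the same projection step is what keeps both random augmentation and adversarial search inside the learned perturbation set.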

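Experimental results

Research questions

RQ1: Can perturbation sets learned by a CVAE from paired data faithfully cover real-world perturbations?
RQ2: Do CVAE-based perturbation sets satisfy the containment and high-likelihood properties needed for robust training and evaluation?
RQ3: How do learned perturbation sets affect adversarial robustness and generalization across different robustness tasks?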
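
The containment and likelihood properties referred to in RQ2 can be written out as follows. This is a sketch of the definitions as they are commonly stated for this framework; the symbols $p_\theta$ (the decoder likelihood) and $p_\epsilon$ (a normal distribution truncated to radius $\epsilon$) are assumed notation and should be checked against the paper.

```latex
% Perturbation set generated from a norm-bounded latent code
\[
\mathcal{S}(x) = \{\, g(z, x) : \|z\| \le \epsilon \,\}
\]

% Necessary subset property (approximation error at most \delta)
% for a perturbed pair (x, \tilde{x})
\[
\exists\, z \ \text{with}\ \|z\| \le \epsilon
\quad \text{such that} \quad
\|g(z, x) - \tilde{x}\|_2 \le \delta
\]

% Sufficient likelihood property (likelihood at least \delta)
% for a perturbed pair (x, \tilde{x})
\[
\mathbb{E}_{z \sim p_\epsilon}\!\left[\, p_\theta(\tilde{x} \mid x, z) \,\right] \ge \delta
\]
```

The first property is deterministic (some latent code in the ball reproduces the observed perturbation), while the second is probabilistic (observed perturbations are likely under latent codes drawn from the constrained prior).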

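Key findings

CVAE-based perturbation sets achieve low approximation error, indicating good containment of the perturbed data.
The ECM (expected CVAE reconstruction) and KL metrics indicate that the learned perturbations have meaningful likelihood.
Compared with standard data augmentation and hand-designed perturbations, the perturbation sets yield higher adversarial robustness and better generalization.
Compared with baseline methods, adversarial training with CVAE perturbations achieves higher robust accuracy across tasks.
Perturbation learning scales to complex perturbations such as common corruptions and lighting variations.
Quantitative results on CIFAR10-C show higher robust accuracy from CVAE augmentation and adversarial training.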
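
To make the approximation-error finding concrete, the sketch below estimates how closely a perturbation set can reproduce an observed perturbed example, by minimizing the distance over the latent ball with projected gradient descent. It is an illustration under stated assumptions, not the paper's evaluation code: the generator is a hypothetical linear map g(z, x) = x + Wz, and `approximation_error` is a name chosen here.

```python
import numpy as np

def approximation_error(x, x_tilde, W, eps, steps=500):
    """Estimate min over ||z||_2 <= eps of ||g(z, x) - x_tilde||_2.

    Assumes a toy linear generator g(z, x) = x + W z, so the squared-error
    objective is convex and projected gradient descent converges.
    """
    z = np.zeros(W.shape[1])
    # Step size 1/L, where L is the largest eigenvalue of W^T W.
    lr = 1.0 / (np.linalg.norm(W, 2) ** 2)
    for _ in range(steps):
        residual = x + W @ z - x_tilde
        z = z - lr * (W.T @ residual)
        norm = np.linalg.norm(z)
        if norm > eps:
            z = z * (eps / norm)  # project back onto the latent ball
    return np.linalg.norm(x + W @ z - x_tilde), z
```

A small returned error means the observed pair (x, x_tilde) is approximately contained in the set at radius eps; shrinking eps below the norm of the best latent code makes the error grow.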

This review was generated by AI and reviewed by a human editor.