QUICK REVIEW

[Paper Review] Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?

Vikash Sehwag, Saeed Mahloujifar | arXiv (Cornell University) | Apr 19, 2021

Adversarial Robustness in Machine Learning | 31 citations

TL;DR

The paper formalizes how robustness transfers from proxy (synthetic) distributions to real data, bounding the gap by the conditional Wasserstein distance between them. It also shows that synthetic data, especially from diffusion-based generators, improves both empirical and certified adversarial robustness when used in PORT, the paper's framework for robust training with proxy data.

ABSTRACT

While additional training data improves the robustness of deep neural networks against adversarial examples, it presents the challenge of curating a large number of specific real-world samples. We circumvent this challenge by using additional data from proxy distributions learned by advanced generative models. We first seek to formally understand the transfer of robustness from classifiers trained on proxy distributions to the real data distribution. We prove that the difference between the robustness of a classifier on the two distributions is upper bounded by the conditional Wasserstein distance between them. Next we use proxy distributions to significantly improve the performance of adversarial training on five different datasets. For example, we improve robust accuracy by up to 7.5% and 6.7% in $\ell_{\infty}$ and $\ell_2$ threat model over baselines that are not using proxy distributions on the CIFAR-10 dataset. We also improve certified robust accuracy by 7.6% on the CIFAR-10 dataset. We further demonstrate that different generative models bring a disparate improvement in the performance in robust training. We propose a robust discrimination approach to characterize the impact of individual generative models and further provide a deeper understanding of why current state-of-the-art in diffusion-based generative models are a better choice for proxy distribution than generative adversarial networks.

Motivation & Objective

Motivate and formalize the use of proxy distributions (synthetic data) to improve adversarial robustness without collecting more real data.
Derive bounds showing robustness transfer is governed by conditional Wasserstein distance between proxy and real data distributions.
Propose ARC as a practical surrogate metric to rank proxy distributions for robustness transfer.
Develop PORT, a robust training framework that combines real and synthetic data to improve both clean and robust accuracy.
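
The transfer bound named in the objectives can be written out explicitly. The block below is a sketch of that statement rather than a quotation from the paper: the symbols $h$ (classifier), $D$ (real distribution), $\tilde{D}$ (proxy distribution), $d$ (distance metric), and $W_d$ (Wasserstein distance under $d$) are notation assumed here, and the definitions of average robustness and conditional Wasserstein distance follow the usual reading of those terms.

```latex
% Average robustness: expected distance to the nearest misclassified point (assumed definition)
\mathrm{Rob}_d(h, D) = \mathbb{E}_{(x, y) \sim D}\Big[ \inf_{x' :\, h(x') \neq y} d(x', x) \Big]

% Conditional Wasserstein distance: class-conditional distances averaged over labels (assumed definition)
\mathrm{cwd}(D, \tilde{D}) = \mathbb{E}_{y}\big[ W_d(D \mid y,\; \tilde{D} \mid y) \big]

% Transfer bound stated in the abstract
\big| \mathrm{Rob}_d(h, D) - \mathrm{Rob}_d(h, \tilde{D}) \big| \le \mathrm{cwd}(D, \tilde{D})
```

Read this way, a classifier that is robust on the proxy distribution stays nearly as robust on real data whenever the two distributions are close class by class.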

Proposed method

Define average robustness and decompose robustness transfer into empirical robustness, generalization penalty, and distribution-shift penalty.
Introduce conditional Wasserstein distance cwd to bound the distribution-shift penalty between real and proxy distributions.
Propose ARC (area under robust discrimination accuracy vs perturbation) as a practical surrogate for cwd and show its relation (cwd ≥ 4*ARC).
Develop PORT by aggregating losses over real and proxy data with a mixing parameter γ under an adversarial training objective (PGD or randomized smoothing).
Introduce robust discriminators to measure proximity between distributions under adversarial perturbations, and define a synthetic-score to select synthetic samples.
Demonstrate on five datasets that diffusion-based proxies outperform GANs for robustness transfer.
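
The loss aggregation in PORT can be illustrated with a minimal sketch. It assumes that γ weights the real-data term and (1 − γ) the proxy term, which this summary does not specify; the function name `port_loss` and its arguments are hypothetical, and the per-sample losses are taken as given (in practice they would come from an adversarial objective such as PGD training, which is not shown).

```python
import numpy as np


def port_loss(real_losses, synth_losses, gamma=0.5):
    """Combine mean losses from a real batch and a synthetic (proxy) batch.

    Assumption: gamma weights the real-data term and (1 - gamma) the
    proxy-data term. The inputs are per-sample losses that were already
    computed on adversarially perturbed examples.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    real_term = float(np.mean(real_losses))    # mean loss on the real batch
    synth_term = float(np.mean(synth_losses))  # mean loss on the proxy batch
    return gamma * real_term + (1.0 - gamma) * synth_term


# Hypothetical per-sample losses for one training step.
step_loss = port_loss([1.0, 3.0], [0.0, 2.0], gamma=0.25)
```

Setting γ to 1 in this sketch recovers training on real data only, while smaller values give synthetic samples more influence on each update.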

Experimental results

Research questions

RQ1: When does robustness transfer from proxy distributions to the real data distribution?
RQ2: How effective are proxy distributions in boosting adversarial robustness on real-world datasets?
RQ3: Can we develop a metric to predict which proxy distribution will best aid robust training?
RQ4: How do different generative models (diffusion vs GANs) impact robustness transfer and certified robustness?

Key findings

Proxy distributions can significantly boost adversarial robustness and certified robustness across multiple datasets and threat models.
On CIFAR-10, robust accuracy improves by up to 7.5% (ℓ∞) and 6.7% (ℓ2) over baselines that do not use proxy distributions, and certified robust accuracy improves by 7.6%.
Diffusion-based generators outperform GANs as proxy distributions for robustness transfer.
ARC effectively predicts robustness transfer rankings and aligns with cwd, outperforming FID/IS as predictors of robustness transfer.
Adaptive sampling of synthetic data yields small additional gains in robustness.
PORT with synthetic data can match or exceed baseline robustness while using fewer real-world samples and strengthening both clean and robust accuracy.
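
As a rough illustration of how an ARC-style score could be computed from a robust discriminator's results, the sketch below integrates discrimination accuracy over the perturbation budget with the trapezoid rule. Two points are assumptions rather than details given in this summary: the area is measured above the 0.5 chance level, and the accuracies are supplied on a finite grid of budgets. The function name `arc_score` and the sample numbers are hypothetical.

```python
import numpy as np


def arc_score(epsilons, robust_disc_acc, chance=0.5):
    """Area under the robust discrimination accuracy curve, above chance.

    epsilons: strictly increasing perturbation budgets.
    robust_disc_acc: accuracy of a real-vs-synthetic discriminator
        evaluated under each budget.
    Assumption: only accuracy above `chance` contributes to the area.
    """
    eps = np.asarray(epsilons, dtype=float)
    acc = np.asarray(robust_disc_acc, dtype=float)
    if eps.ndim != 1 or eps.shape != acc.shape or eps.size < 2:
        raise ValueError("need two 1-D arrays of equal length >= 2")
    if np.any(np.diff(eps) <= 0):
        raise ValueError("epsilons must be strictly increasing")
    excess = np.clip(acc - chance, 0.0, None)  # accuracy above chance level
    # Trapezoid rule over the perturbation grid.
    return float(np.sum(0.5 * (excess[1:] + excess[:-1]) * np.diff(eps)))


# Hypothetical curve: accuracy falls from 1.0 to chance as the budget grows.
score = arc_score([0.0, 0.5, 1.0], [1.0, 0.75, 0.5])
```

Under this reading, a proxy distribution whose samples become indistinguishable from real data at small perturbation budgets gets a smaller score, which is consistent with ARC acting as a lower bound on cwd.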

This review was created by AI and reviewed by human editors.