QUICK REVIEW

[Paper Review] Attacking the Madry Defense Model with $L_1$-based Adversarial Examples

Yash Sharma, Pin‐Yu Chen|arXiv (Cornell University)|Oct 30, 2017
Adversarial Robustness in Machine Learning|9 references|75 citations
TL;DR

The paper shows that L1-based elastic-net adversarial examples (EAD) transfer to the Madry Defense Model and can outperform the L∞-based PGD and I-FGM attacks in targeted transfer, while often causing less visual distortion than comparable L∞-based attacks.

ABSTRACT

The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal $L_\infty$ distortion $\epsilon = 0.3$. This discourages the use of attacks which are not optimized on the $L_\infty$ distortion metric. Our experimental results demonstrate that by relaxing the $L_\infty$ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion. These results call into question the use of $L_\infty$ as a sole measure for visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
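
To make the competition constraint concrete, here is a minimal NumPy sketch of an $L_\infty$ budget check and projection. The function names (`linf_distortion`, `within_linf_budget`, `project_linf`) are illustrative and do not come from the paper or the competition code; the sketch assumes pixel values scaled to [0, 1].

```python
import numpy as np

def linf_distortion(x_adv, x):
    """Largest absolute per-pixel change between x_adv and x."""
    return float(np.max(np.abs(x_adv - x)))

def within_linf_budget(x_adv, x, eps=0.3, tol=1e-8):
    """True if no pixel moved by more than eps (tol absorbs float rounding)."""
    return linf_distortion(x_adv, x) <= eps + tol

def project_linf(x_adv, x, eps=0.3):
    """Clip x_adv into the eps-ball around x, then into the valid pixel range [0, 1]."""
    clipped = np.clip(x_adv, x - eps, x + eps)
    return np.clip(clipped, 0.0, 1.0)

# Toy example: a flattened 28x28 "image" with one pixel pushed past the budget.
x = np.full(784, 0.5)
x_adv = x.copy()
x_adv[0] = 1.0  # change of 0.5 on a single pixel

print(within_linf_budget(x_adv, x))                    # budget violated
print(within_linf_budget(project_linf(x_adv, x), x))   # budget satisfied after projection
```

An attack that changes only a few pixels by a large amount violates this constraint even when its total change is small, which is the situation the paper examines by relaxing the constraint.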

Motivation & Objective

  • Assess transferability of adversarial attacks beyond the L∞ constraint used in the Madry MNIST challenge.
  • Compare EAD (L1+L2 regularization) against PGD and I-FGM (L∞-based) under relaxed distortion budgets.
  • Evaluate targeted and non-targeted transferability from undefended models and ensembles.
  • Analyze visual distortion vs. distortion metrics (L1, L2, L∞) in transferred adversarial examples.

Proposed method

  • Use the Madry Defense Model, adversarially trained on MNIST with PGD at ε = 0.3.
  • Generate adversarial examples with EAD (elastic-net: L1 + L2) and tune β to vary the L1/L2 emphasis.
  • Compare against PGD and I-FGM under various ε and κ settings.
  • Evaluate targeted and non-targeted transferability from undefended models and a 3-model ensemble.
  • Analyze visual distortion using L1/L2/L∞ norms and provide qualitative visuals.
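
The elastic-net ingredients of EAD can be sketched in a few lines of NumPy. This is a simplified illustration based on the general EAD formulation (an attack loss plus β·L1 plus squared L2, with a shrinkage-thresholding step), not the authors' implementation. The function names are hypothetical, images are assumed to lie in [0, 1], and the gradient step on the attack loss that would precede the shrinkage step is omitted.

```python
import numpy as np

def elastic_net_penalty(x_adv, x0, beta):
    """beta * ||delta||_1 + ||delta||_2^2 for delta = x_adv - x0."""
    delta = (x_adv - x0).ravel()
    return float(beta * np.sum(np.abs(delta)) + np.sum(delta ** 2))

def targeted_hinge(logits, target, kappa):
    """Targeted attack loss: max(max_{j != t} Z_j - Z_t, -kappa).
    A larger kappa asks for a more confident misclassification."""
    other = np.max(np.delete(logits, target))
    return float(max(other - logits[target], -kappa))

def shrink_project(z, x0, beta):
    """Element-wise shrinkage-thresholding toward x0, kept inside [0, 1].
    Changes smaller than beta are reset to the original pixel, which
    is what makes the resulting perturbation sparse."""
    diff = z - x0
    upper = np.minimum(z - beta, 1.0)
    lower = np.maximum(z + beta, 0.0)
    return np.where(diff > beta, upper, np.where(diff < -beta, lower, x0))

# Toy example with four pixels.
x0 = np.array([0.5, 0.5, 0.5, 0.95])
z = np.array([0.505, 0.7, 0.2, 1.2])   # candidate after a (not shown) gradient step
x_next = shrink_project(z, x0, beta=0.01)
print(x_next)
print(elastic_net_penalty(x_next, x0, beta=0.01))
```

In this toy run the first pixel's change (0.005) is below β and is reset to its original value, while the larger changes are shrunk by β and clipped to the valid range. Raising β discards more small changes, shifting emphasis toward L1 sparsity.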

Experimental results

Research questions

  • RQ1: Do L1-based EAD adversarial examples transfer to the Madry Defense Model as effectively as or better than L∞-based PGD/I-FGM?
  • RQ2: How do L1 and L2 distortions impact transferability and perceived visual distortion compared with L∞ constraints?
  • RQ3: Does using an ensemble of undefended models enhance transferability of EAD attacks?
  • RQ4: What are the trade-offs between attack success rate and distortion types (L1/L2/L∞) for targeted vs. non-targeted attacks?
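
The targeted and non-targeted settings in these questions differ in what counts as a success. The sketch below shows one common way to compute attack success rate (ASR) for transferred examples. The function name and the toy labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def attack_success_rate(preds, true_labels, target_labels=None):
    """Fraction of adversarial examples that fool the defended model.

    Non-targeted (target_labels is None): success when the prediction
    differs from the true label.
    Targeted: success only when the prediction equals the chosen target.
    """
    preds = np.asarray(preds)
    if target_labels is None:
        success = preds != np.asarray(true_labels)
    else:
        success = preds == np.asarray(target_labels)
    return float(np.mean(success))

# Toy predictions of a defended model on four transferred examples.
preds = [3, 5, 7, 1]
true_labels = [3, 4, 7, 2]
target_labels = [5, 5, 1, 0]

print(attack_success_rate(preds, true_labels))                 # non-targeted ASR
print(attack_success_rate(preds, true_labels, target_labels))  # targeted ASR
```

The targeted criterion is stricter, so on the same predictions the targeted ASR can never exceed the non-targeted ASR when each target differs from the true label.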

Key findings

  • EAD outperforms the Carlini and Wagner (C&W) L2 attack across κ settings in both targeted and non-targeted cases.
  • In targeted attacks, at an optimal κ (e.g., 50), EAD achieves lower L1/L2 distortion and higher transferability than PGD/I-FGM.
  • EAD with β=0.01 often yields the highest ASR while minimizing L1 distortion, especially at lower κ.
  • PGD/I-FGM can reach high ASR but incur large L1/L2 distortions, leading to visually perceptible perturbations.
  • Visual comparisons show EAD can preserve visual quality even with similar average L∞ distortion to PGD.
  • Results suggest L∞ alone is insufficient to characterize visual distortion and adversarial subspaces.
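
A small synthetic comparison illustrates why the three norms can disagree. The two perturbations below are made up for illustration and are not taken from the paper's results: one changes every pixel by the competition's ε = 0.3, the other changes only 20 pixels by 0.9.

```python
import numpy as np

def distortion_norms(delta):
    """Return the L1, L2, and L-infinity norms of a perturbation."""
    d = np.asarray(delta, dtype=float).ravel()
    return {
        "l1": float(np.sum(np.abs(d))),
        "l2": float(np.sqrt(np.sum(d ** 2))),
        "linf": float(np.max(np.abs(d))),
    }

# Dense perturbation: all 784 pixels of a 28x28 image move by 0.3.
dense = np.full(784, 0.3)

# Sparse perturbation: only 20 pixels move, but each by 0.9.
sparse = np.zeros(784)
sparse[:20] = 0.9

print("dense :", distortion_norms(dense))
print("sparse:", distortion_norms(sparse))
```

The sparse perturbation has three times the L∞ distortion of the dense one, yet its L1 distortion is more than ten times smaller and its L2 distortion is roughly half. Ranking the two by L∞ alone would therefore reverse the ordering given by L1 or L2.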


This review was created by AI and reviewed by human editors.