QUICK REVIEW

[Paper Review] Attacking the Madry Defense Model with $L_1$-based Adversarial Examples

Yash Sharma, Pin‐Yu Chen|arXiv (Cornell University)|Oct 30, 2017

Adversarial Robustness in Machine Learning|9 references|75 citations

TL;DR

The paper shows that L1-based elastic-net adversarial examples (EAD) transfer to the Madry Defense Model and, in targeted transfer, can outperform the L∞-based PGD and I-FGM attacks, while often causing less visible distortion.

ABSTRACT

The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal $L_\infty$ distortion $\epsilon = 0.3$. This discourages the use of attacks which are not optimized on the $L_\infty$ distortion metric. Our experimental results demonstrate that by relaxing the $L_\infty$ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average $L_\infty$ distortion, have minimal visual distortion. These results call into question the use of $L_\infty$ as a sole measure for visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.

Motivation & Objective

Assess transferability of adversarial attacks beyond the L∞ constraint used in the Madry MNIST challenge.
Compare EAD (L1+L2 regularization) against PGD and I-FGM (L∞-based) under relaxed distortion budgets.
Evaluate targeted and non-targeted transferability from undefended models and ensembles.
Analyze visual distortion vs. distortion metrics (L1, L2, L∞) in transferred adversarial examples.

Proposed method

Target the Madry Defense Model, an MNIST classifier adversarially trained with PGD at ε = 0.3.
Generate adversarial examples with EAD (elastic-net: L1 + L2) and tune β to vary the L1/L2 emphasis.
Compare against PGD and I-FGM while varying the L∞ budget ε and the confidence parameter κ.
Evaluate targeted and non-targeted transferability from undefended models and a 3-model ensemble.
Analyze visual distortion using L1/L2/L∞ norms and provide qualitative visuals.
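The review does not write out the elastic-net objective. As a rough sketch, assuming the formulation from the original EAD paper (an attack loss plus β·‖δ‖₁ + ‖δ‖₂², with the L1 term handled by an iterative shrinkage-thresholding step), the pieces could look like the following. The function names are illustrative and are not taken from the authors' code.

```python
import numpy as np

def elastic_net_penalty(x_adv, x_orig, beta):
    """Regularizer assumed for EAD: beta * ||delta||_1 + ||delta||_2^2."""
    delta = x_adv - x_orig
    return beta * np.abs(delta).sum() + (delta ** 2).sum()

def shrink(z, x_orig, beta):
    """Projected shrinkage-thresholding step for the L1 term.

    Pixels whose change from x_orig is at most beta are reset to x_orig;
    larger changes are reduced by beta and clipped to the valid range [0, 1].
    """
    diff = z - x_orig
    upper = np.minimum(z - beta, 1.0)
    lower = np.maximum(z + beta, 0.0)
    return np.where(diff > beta, upper, np.where(diff < -beta, lower, x_orig))

def targeted_hinge_loss(logits, target, kappa):
    """Targeted attack loss: max(max_{j != t} Z_j - Z_t, -kappa).

    A larger kappa asks for a larger logit margin in favor of the target class.
    """
    other = np.delete(logits, target)
    return max(other.max() - logits[target], -kappa)
```

Raising β makes the shrinkage step zero out more small pixel changes, which is how tuning β shifts the emphasis toward sparse, low-L1 perturbations.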

Experimental results

Research questions

RQ1: Do L1-based EAD adversarial examples transfer to the Madry Defense Model as effectively as or better than L∞-based PGD/I-FGM?
RQ2: How do L1 and L2 distortions impact transferability and perceived visual distortion compared with L∞ constraints?
RQ3: Does using an ensemble of undefended models enhance transferability of EAD attacks?
RQ4: What are the trade-offs between attack success rate and distortion types (L1/L2/L∞) for targeted vs. non-targeted attacks?
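The attack success rate in these questions is counted differently for the two settings. A minimal sketch, assuming the usual convention (a non-targeted attack succeeds when the prediction differs from the true label; a targeted attack succeeds only when the prediction equals the chosen target), with made-up labels for illustration:

```python
import numpy as np

def attack_success_rate(preds, true_labels, target_labels=None):
    """Fraction of adversarial examples that fool the defended model.

    Non-targeted (target_labels is None): success if pred != true label.
    Targeted: success only if pred == the attacker's target label.
    """
    preds = np.asarray(preds)
    if target_labels is None:
        return float(np.mean(preds != np.asarray(true_labels)))
    return float(np.mean(preds == np.asarray(target_labels)))

# Hypothetical predictions of a defended model on four transferred examples
preds = [3, 7, 2, 2]
true_labels = [3, 1, 2, 9]
target_labels = [5, 7, 4, 0]

non_targeted_asr = attack_success_rate(preds, true_labels)
targeted_asr = attack_success_rate(preds, true_labels, target_labels)
```

In this toy data, two of the four examples are misclassified (non-targeted rate 0.5) but only one lands on its target (targeted rate 0.25), which is why targeted transfer is the stricter test.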

Key findings

EAD outperforms the Carlini and Wagner (C&W) L2 attack across κ settings in both targeted and non-targeted cases.
In targeted attacks, at an optimal κ (e.g., 50), EAD achieves lower L1/L2 distortion and higher transferability than PGD/I-FGM.
EAD with β = 0.01 often yields the highest attack success rate (ASR) while minimizing L1 distortion, especially at lower κ.
PGD/I-FGM can reach high ASR but incur large L1/L2 distortions, leading to visually perceptible perturbations.
Visual comparisons show EAD can preserve visual quality even with similar average L∞ distortion to PGD.
Results suggest L∞ alone is insufficient to characterize visual distortion and adversarial subspaces.
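To see why the three norms can rank perturbations differently, the toy comparison below computes them for a dense and a sparse perturbation. The images and perturbation sizes are invented for illustration and are not data from the paper; the sketch shows only the arithmetic and says nothing about how either perturbation would look to a human.

```python
import numpy as np

def distortion_norms(x_orig, x_adv):
    """Return (L1, L2, Linf) norms of the perturbation x_adv - x_orig."""
    delta = (x_adv - x_orig).ravel()
    return (
        float(np.abs(delta).sum()),
        float(np.sqrt((delta ** 2).sum())),
        float(np.abs(delta).max()),
    )

# Hypothetical MNIST-sized image (28x28), all zeros for simplicity
x = np.zeros((28, 28))

# Dense perturbation: every pixel shifted by the competition budget 0.3
dense = x + 0.3

# Sparse perturbation: only 20 pixels changed, each by 0.9
sparse = x.copy()
sparse.flat[:20] = 0.9

dense_l1, dense_l2, dense_linf = distortion_norms(x, dense)
sparse_l1, sparse_l2, sparse_linf = distortion_norms(x, sparse)
```

The dense perturbation stays within L∞ = 0.3 yet has L1 ≈ 235.2 and L2 ≈ 8.4, while the sparse one has three times the L∞ (0.9) but L1 = 18 and L2 ≈ 4.02. A budget on L∞ alone would admit the first and reject the second.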


This review was created by AI and reviewed by human editors.