QUICK REVIEW

[Paper Review] Boosting Adversarial Attacks with Momentum

Yinpeng Dong, Fangzhou Liao|arXiv (Cornell University)|Oct 17, 2017

Adversarial Robustness in Machine Learning|References: 29|Citations: 50

One-Sentence Summary

The paper strengthens adversarial attacks with momentum-based iterative gradient methods (MI-FGSM and variants), improving white-box strength and black-box transferability, and shows that attacking an ensemble of models breaks robust defenses.

ABSTRACT

Deep neural networks are vulnerable to adversarial examples, which poses security concerns on these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate to evaluate the robustness of deep learning models before they are deployed. However, most of existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating the momentum term into the iterative process for attacks, our methods can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that the adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won the first places in NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.

Motivation and Objectives

Motivate robust evaluation of deep models against adversarial threats.
Develop momentum-based iterative attack methods to stabilize updates and improve transferability.
Demonstrate effectiveness of attacking ensembles of models to enhance black-box success.
Show vulnerability of ensemble adversarially trained models to strong attacks.

Proposed Method

Introduce momentum into iterative gradient attacks (MI-FGSM) to stabilize update directions via the accumulated gradient $g_{t+1} = \mu \cdot g_t + \nabla_x J(x_t^*, y) / \lVert \nabla_x J(x_t^*, y) \rVert_1$, followed by the step $x_{t+1}^* = x_t^* + \alpha \cdot \mathrm{sign}(g_{t+1})$.
Extend momentum to L2 norm and targeted attacks; provide corresponding update rules.
Propose attacking ensembles by fusing logits: $l(x) = \sum_k w_k l_k(x)$; optimize $J(x, y)$ using the ensemble logits.
Compare ensemble schemes (logits, predictions, loss) and show ensemble logits yields strongest attacks.
Experiment on ImageNet with seven models; show MI-FGSM outperforms FGSM and I-FGSM in black-box transfers and white-box strength.
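
The momentum update described above can be sketched in a few lines. This is a minimal, illustrative sketch and not the authors' implementation: it assumes a toy linear softmax classifier so that the input gradient can be written analytically in plain Python. The names `logits_of`, `loss_and_grad`, and `mi_fgsm`, the defaults `eps=0.3` and `steps=10`, and the values of `W`, `b`, and `x` are all assumptions made for illustration; only the update rule itself follows the text.

```python
import math

def logits_of(W, b, x):
    """Logits of a toy linear classifier: W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def loss_and_grad(W, b, x, y):
    """Cross-entropy loss J(x, y) and its gradient with respect to the input x."""
    z = logits_of(W, b, x)
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    probs = [e / total for e in exps]
    loss = -math.log(probs[y])
    # dJ/dz_k = p_k - 1[k == y]; the chain rule through z = W x + b gives dJ/dx.
    dz = [p - (1.0 if k == y else 0.0) for k, p in enumerate(probs)]
    grad = [sum(dz[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]
    return loss, grad

def sign(v):
    return (v > 0) - (v < 0)

def mi_fgsm(W, b, x, y, eps=0.3, steps=10, mu=1.0):
    """Untargeted L-infinity MI-FGSM with step size alpha = eps / steps."""
    alpha = eps / steps
    g = [0.0] * len(x)   # accumulated gradient, g_0 = 0
    x_adv = list(x)      # x_0^* = x
    for _ in range(steps):
        _, grad = loss_and_grad(W, b, x_adv, y)
        l1 = sum(abs(v) for v in grad) or 1.0  # guard against a zero gradient
        g = [mu * gi + gr / l1 for gi, gr in zip(g, grad)]
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Clip to the eps-ball around x to absorb floating-point drift.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

# Made-up two-class model and input, for illustration only.
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x, y = [0.6, 0.4], 0
x_adv = mi_fgsm(W, b, x, y)
print("loss before:", loss_and_grad(W, b, x, y)[0])
print("loss after: ", loss_and_grad(W, b, x_adv, y)[0])
```

Setting `mu=0.0` reduces the loop to a plain iterative sign-gradient attack (I-FGSM), which makes the two easy to compare on the same model. For real images one would presumably also clip `x_adv` to the valid pixel range, which this sketch omits.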

Experimental Results

Research Questions

RQ1: How can momentum be integrated into iterative gradient-based attacks to enhance adversarial example transferability?
RQ2: Does attacking an ensemble of models improve black-box attack success rates, especially against defended models?
RQ3: Are ensemble-logits based attacks more effective than ensemble-predictions or ensemble-loss approaches?
RQ4: Do momentum-based attacks threaten models trained with ensemble adversarial training?

Key Findings

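Success rates (%) of adversarial examples crafted on Inc-v3 against each model; * marks the white-box case.
Attack	Inc-v3	Inc-v4	IncRes-v2	Res-152	Inc-v3 ens3	Inc-v3 ens4	IncRes-v2 ens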
FGSM	72.3*	28.2	26.2	25.3	11.3	10.9	4.8
I-FGSM	100.0*	22.8	19.9	16.2	7.5	6.4	4.1
MI-FGSM	100.0*	48.8	48.0	35.6	15.1	15.2	7.8

MI-FGSM achieves near-100% success on white-box models and substantially increases black-box success rates over I-FGSM and FGSM (e.g., 48.8% versus 22.8% and 28.2% on Inc-v4 for examples crafted on Inc-v3).
Momentum (μ around 1.0) stabilizes update directions and improves transferability across multiple black-box models.
Ensemble-in-logits attacks outperform ensemble-in-predictions and ensemble-in-loss approaches across models.
Ensemble adversarially trained models remain vulnerable to MI-FGSM black-box attacks, with substantial success rates (e.g., up to ~40% on some defenses).
The method won first place in both the NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.
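
The three ensemble schemes compared in the findings differ only in where the per-model outputs are fused. The following is a minimal sketch of that difference, assuming non-negative weights that sum to one and cross-entropy as the loss. The function names and the example logits are hypothetical and are not taken from the paper's code.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def loss_ensemble_in_logits(logits_per_model, weights, y):
    """Fuse logits first, l(x) = sum_k w_k * l_k(x), then apply softmax cross-entropy."""
    n_classes = len(logits_per_model[0])
    fused = [sum(w * l[c] for w, l in zip(weights, logits_per_model))
             for c in range(n_classes)]
    return -math.log(softmax(fused)[y])

def loss_ensemble_in_predictions(logits_per_model, weights, y):
    """Fuse the predicted probabilities, then take the cross-entropy."""
    p_y = sum(w * softmax(l)[y] for w, l in zip(weights, logits_per_model))
    return -math.log(p_y)

def loss_ensemble_in_loss(logits_per_model, weights, y):
    """Compute each model's cross-entropy separately, then fuse the losses."""
    return sum(w * -math.log(softmax(l)[y])
               for w, l in zip(weights, logits_per_model))

# Hypothetical logits from two models on one input with three classes.
logits_per_model = [[2.0, 0.5, -1.0], [1.0, 1.5, 0.0]]
weights = [0.5, 0.5]
y = 0
for scheme in (loss_ensemble_in_logits,
               loss_ensemble_in_predictions,
               loss_ensemble_in_loss):
    print(scheme.__name__, scheme(logits_per_model, weights, y))
```

In an attack, any of these losses would take the place of the single-model loss whose input gradient drives the momentum update. The sketch only shows how the three objectives are formed; it makes no claim about which one transfers best.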

This review was generated by AI and checked by a human editor.