QUICK REVIEW

[Paper Review] Adversarial purification with Score-based generative models

Jongmin Yoon, Sung Ju Hwang|arXiv (Cornell University)|Jun 11, 2021

Adversarial Robustness in Machine Learning | References: 46 | Citations: 34

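One-sentence summary

This paper introduces adversarial purification with an energy-based model (EBM) trained by denoising score matching, enabling fast and robust purification of attacked images, together with randomized noise injection that yields certifiable robustness.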

ABSTRACT

While adversarial training is considered as a standard defense method against adversarial attacks for image classifiers, adversarial purification, which purifies attacked images into clean images with a standalone purification model, has shown promises as an alternative defense method. Recently, an Energy-Based Model (EBM) trained with Markov-Chain Monte-Carlo (MCMC) has been highlighted as a purification model, where an attacked image is purified by running a long Markov-chain using the gradients of the EBM. Yet, the practicality of the adversarial purification using an EBM remains questionable because the number of MCMC steps required for such purification is too large. In this paper, we propose a novel adversarial purification method based on an EBM trained with Denoising Score-Matching (DSM). We show that an EBM trained with DSM can quickly purify attacked images within a few steps. We further introduce a simple yet effective randomized purification scheme that injects random noises into images before purification. This process screens the adversarial perturbations imposed on images by the random noises and brings the images to the regime where the EBM can denoise well. We show that our purification method is robust against various attacks and demonstrate its state-of-the-art performances.

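Motivation and objectives

Motivate adversarial purification as an effective defense that is separate from adversarial training.
Leverage denoising score matching to train a score-based model for purification.
Develop a purification scheme based on deterministic updates with an adaptive step size.
Strengthen robustness by injecting random noise, and show certifiability through randomized smoothing.
Demonstrate state-of-the-art performance against strong adaptive attacks on standard datasets.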

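Proposed method

Train an energy-based model with denoising score matching (DSM) to learn the score function used for purification.
Use a multi-scale Noise Conditional Score Network (NCSN) to handle different perturbation levels.
Perform purification with deterministic updates guided by the learned score, rather than with Langevin dynamics.
Inject random Gaussian noise before purification to improve robustness and enable randomized smoothing.
Adapt the step size during purification to keep the movement along the score landscape controlled.
Combine multiple randomized purification runs and ensemble their outputs for the final prediction.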
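
The steps listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_score` is a hypothetical stand-in for a trained NCSN (the closed-form score of an isotropic Gaussian), the step-size rule that caps the length of each update is an assumption chosen for simplicity, and majority voting is only one possible way to ensemble the runs.

```python
import math
import random


def toy_score(x, mean, sigma):
    """Score of an isotropic Gaussian N(mean, sigma^2 I): -(x - mean) / sigma^2.

    Stand-in for a trained score network (assumption, for illustration only).
    """
    return [-(xi - mi) / sigma ** 2 for xi, mi in zip(x, mean)]


def purify(x, score_fn, n_steps=10, noise_std=0.25,
           base_step=0.05, max_update_norm=0.5, rng=None):
    """Inject Gaussian noise once, then follow the score deterministically."""
    rng = rng or random.Random(0)
    # Randomized noise injection before purification.
    x = [xi + rng.gauss(0.0, noise_std) for xi in x]
    for _ in range(n_steps):
        s = score_fn(x)
        norm = math.sqrt(sum(si * si for si in s))
        if norm == 0.0:
            break  # already at a stationary point of the toy density
        # Hypothetical adaptive rule: cap the length of each update.
        step = min(base_step, max_update_norm / norm)
        # Deterministic update: no Langevin noise term is added.
        x = [xi + step * si for xi, si in zip(x, s)]
    return x


def purify_and_vote(x, score_fn, classify, n_runs=8, seed=0):
    """Classify several independently purified copies and take a majority vote."""
    votes = {}
    for run in range(n_runs):
        purified = purify(x, score_fn, rng=random.Random(seed + run))
        label = classify(purified)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Toy usage: "clean" data concentrates around (1, 1); the classifier is a dummy.
score_fn = lambda x: toy_score(x, mean=[1.0, 1.0], sigma=0.5)
classify = lambda x: int(x[0] > 0.0)
label = purify_and_vote([1.3, 0.6], score_fn, classify)
```

In a real setting `score_fn` would be the trained network evaluated at a chosen noise level and `classify` the downstream image classifier. The loop structure (noise once, then a few deterministic score steps) is the part this sketch is meant to show.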

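Experimental results

Research questions

RQ1: Can a score-based model trained with denoising score matching purify adversarial examples faster than conventional MCMC-based EBMs?
RQ2: Does noise injection before purification yield certifiable robustness under norm-bounded threat models?
RQ3: How does the proposed method perform against strong adaptive attacks compared with existing adversarial purification methods?
RQ4: Are deterministic purification updates preferable to stochastic Langevin dynamics in terms of purification speed and accuracy?
RQ5: Can an adaptive step size improve purification stability without extensive hyperparameter tuning?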
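
To make the contrast in RQ4 concrete, the two update rules can be written in generic notation. The symbols are assumptions introduced here, not taken from the paper: $s_\theta$ is the learned score, $\alpha$ the step size, and $z_t$ standard Gaussian noise. The Langevin form is the textbook one, and the deterministic form simply drops the noise term.

```latex
% Stochastic Langevin update (standard form)
x_{t+1} = x_t + \frac{\alpha}{2}\, s_\theta(x_t) + \sqrt{\alpha}\, z_t,
\qquad z_t \sim \mathcal{N}(0, I)

% Deterministic score-guided update (no injected noise per step)
x_{t+1} = x_t + \alpha\, s_\theta(x_t)
```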

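Key findings

Score-guided deterministic purification with DSM-based EBMs purifies attacked images much faster than conventional long-run MCMC approaches.
Random Gaussian noise injected before purification improves robustness, enables randomized smoothing, and achieves certifiable robustness under specific threat models.
The method shows strong robustness against a range of adaptive attacks, with favorable results on CIFAR-10/100 and other datasets compared with existing adversarial purification and adversarial training baselines.
A multi-scale score network (an NCSN trained with DSM) improves purification across different attack strengths and data perturbations.
The adaptive step-size strategy further stabilizes purification and improves final classification accuracy under attack.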
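
The review does not spell out how the certified guarantee is computed. As a hedged illustration, the sketch below assumes the standard Gaussian randomized-smoothing bound for the L2 norm (Cohen et al., 2019): if the top class is returned with probability at least `p_a_lower` under noise of standard deviation `sigma`, the prediction is certified within radius `sigma * Phi^{-1}(p_a_lower)`. The function name and its arguments are hypothetical.

```python
from statistics import NormalDist


def certified_l2_radius(p_a_lower, sigma):
    """L2 radius certified by Gaussian randomized smoothing.

    p_a_lower: lower confidence bound on the top-class probability under noise.
    sigma:     standard deviation of the injected Gaussian noise.
    Returns 0.0 (no certificate) when the top class is not a clear majority.
    """
    if not 0.0 < p_a_lower < 1.0:
        raise ValueError("p_a_lower must lie strictly between 0 and 1")
    if p_a_lower <= 0.5:
        return 0.0
    # Phi^{-1} is the inverse CDF of the standard normal distribution.
    return sigma * NormalDist().inv_cdf(p_a_lower)


# Example call: the radius grows with both the noise level and the vote margin.
radius = certified_l2_radius(p_a_lower=0.9, sigma=0.25)
```

Whether the paper uses exactly this bound, or which noise levels it certifies at, should be checked against the paper itself.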


This review was generated by AI and checked by a human editor.