QUICK REVIEW

[Paper Review] Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach

Minhao Cheng, Thong Le|arXiv (Cornell University)|Jul 12, 2018

Adversarial Robustness in Machine Learning | Cited by 151

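One-Sentence Summary

Introduces a gradient-free, optimization-based framework for hard-label black-box attacks, provides convergence guarantees, and demonstrates query-efficient attacks on CNNs (MNIST, CIFAR, and ImageNet) as well as on GBDT.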

ABSTRACT

We study the problem of attacking a machine learning model in the hard-label black-box setting, where no model information is revealed except that the attacker can make queries to probe the corresponding hard-label decisions. This is a very challenging problem since the direct extension of state-of-the-art white-box attacks (e.g., CW or PGD) to the hard-label black-box setting will require minimizing a non-continuous step function, which is combinatorial and cannot be solved by a gradient-based optimizer. The only current approach is based on random walk on the boundary, which requires lots of queries and lacks convergence guarantees. We propose a novel way to formulate the hard-label black-box attack as a real-valued optimization problem which is usually continuous and can be solved by any zeroth order optimization algorithm. For example, using the Randomized Gradient-Free method, we are able to bound the number of iterations needed for our algorithm to achieve stationary points. We demonstrate that our proposed method outperforms the previous random walk approach to attacking convolutional neural networks on MNIST, CIFAR, and ImageNet datasets. More interestingly, we show that the proposed algorithm can also be used to attack other discrete and non-continuous machine learning models, such as Gradient Boosting Decision Trees (GBDT).

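Motivation and Objectives

Motivate the study of model vulnerability in the hard-label black-box setting, where only the final decision is observable.
Reformulate hard-label attacks as a real-valued (and usually continuous) optimization problem, which makes zeroth-order optimization applicable.
Provide a theoretically grounded algorithm with convergence guarantees for finding adversarial examples under a limited query budget.
Demonstrate effectiveness and query efficiency against CNNs and Gradient Boosting Decision Trees (GBDT).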

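Proposed Method

Reformulate the attack as minimizing a boundary-based real-valued objective g(θ), which maps a search direction θ to the distance from the original input to the nearest adversarial example along θ.
Compute g(θ) using only hard-label queries, via a two-stage procedure (fine-grained search followed by binary search) that localizes the decision boundary along θ.
Minimize g(θ) with Randomized Gradient-Free (RGF) optimization, using zeroth-order gradient estimates built from noisy real-valued evaluations of g.
Average several random Gaussian perturbations per iteration to stabilize the gradient estimate, and adapt the step size with backtracking.
Provide a theoretical convergence guarantee: when ∇g is Lipschitz continuous and the evaluation error ε is controlled, the algorithm reaches an approximate stationary point within O(d/δ^2) iterations.
Demonstrate applicability beyond neural networks to discrete models such as Gradient Boosting Decision Trees (GBDT).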
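
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`boundary_distance`, `rgf_step`), all default parameter values, and the convention that `model(x)` returns a hard label are hypothetical choices made for this example. It covers the untargeted case only and assumes the starting direction `theta` already reaches an adversarial region within `max_dist`.

```python
import numpy as np


def boundary_distance(model, x0, y0, theta, step=0.05, max_dist=10.0, tol=1e-4):
    """Approximate g(theta): distance from x0 to the decision boundary along theta.

    Uses only hard-label queries model(x) -> label. All defaults are
    illustrative assumptions, not values from the paper.
    """
    d = theta / np.linalg.norm(theta)
    # Stage 1 (coarse search): grow the distance until the label flips.
    hi = step
    while model(x0 + hi * d) == y0:
        hi += step
        if hi > max_dist:
            return np.inf  # no boundary found within the search radius
    lo = hi - step
    # Stage 2 (binary search): shrink the bracket [lo, hi] around the boundary.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model(x0 + mid * d) == y0:
            lo = mid
        else:
            hi = mid
    return hi  # smallest distance known to be adversarial


def rgf_step(model, x0, y0, theta, rng, beta=0.01, q=20, eta=0.5):
    """One Randomized Gradient-Free update of theta with backtracking.

    The gradient estimate averages q finite differences along random
    Gaussian directions u: (g(theta + beta * u) - g(theta)) / beta * u.
    """
    g0 = boundary_distance(model, x0, y0, theta)
    grad = np.zeros_like(theta)
    for _ in range(q):
        u = rng.standard_normal(theta.shape)
        g_u = boundary_distance(model, x0, y0, theta + beta * u)
        grad += (g_u - g0) / beta * u
    grad /= q
    # Backtracking: halve the step size until the objective decreases.
    for _ in range(10):
        candidate = theta - eta * grad
        g_new = boundary_distance(model, x0, y0, candidate)
        if g_new < g0:
            return candidate, g_new
        eta *= 0.5
    return theta, g0  # no improving step found; keep the current direction
```

As a toy check, `model` can be a linear classifier such as `lambda x: int(x[0] > 1.0)` with `x0` at the origin; repeatedly calling `rgf_step` from `theta = [1, 1]` should rotate the direction toward the boundary normal and shrink the distance toward 1. Every call to `boundary_distance` costs several queries, so a practical attack would also need to track the query budget, which this sketch omits.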

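Experimental Results

Research Questions

RQ1: Can hard-label black-box adversarial attacks be formulated as a real-valued optimization problem suited to zeroth-order methods?
RQ2: What convergence guarantees and query complexity result from using Randomized Gradient-Free optimization in this setting?
RQ3: How does the proposed method compare with existing decision-based black-box attacks in terms of distortion and query efficiency?
RQ4: Does the method also apply to non-differentiable models such as GBDT, and what adversarial examples can it find under a tight query budget?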

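Key Findings

The boundary-based reformulation g(θ) yields a continuous objective suited to zeroth-order optimization.
RGF with approximate function evaluations converges to a stationary point, provided ∇g is Lipschitz continuous and the evaluation error is controlled.
In the untargeted setting on MNIST, CIFAR-10, and ImageNet, the method achieves lower or comparable distortion with far fewer queries than prior decision-based black-box attacks.
In targeted attacks, it achieves competitive distortion with fewer queries on MNIST and CIFAR-10, and remains practical on ImageNet, though it needs more queries there.
The approach successfully attacks Gradient Boosting Decision Trees (GBDT) with roughly 30,000 queries, showing that it applies to non-differentiable models.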

This review was generated by AI and checked by a human editor.