QUICK REVIEW

[Paper Review] Adversarial Texts with Gradient Methods

Zhitao Gong, Wenlu Wang|arXiv (Cornell University)|Jan 22, 2018

Adversarial Robustness in Machine Learning | Cited by 55

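One-Line Summary

This paper adapts gradient-based adversarial attacks from the image domain to text by searching for adversarial examples in the embedding space and reconstructing text via nearest-neighbor search. It uses Word Mover's Distance (WMD) as the quality metric and, on the IMDB and Reuters datasets, shows that FGM and DeepFool can produce high-quality adversarial texts that differ from the originals by only a few words.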

ABSTRACT

Adversarial samples for images have been extensively studied in the literature. Among many of the attacking methods, gradient-based methods are both effective and easy to compute. In this work, we propose a framework to adapt the gradient attacking methods on images to text domain. The main difficulties for generating adversarial texts with gradient methods are i) the input space is discrete, which makes it difficult to accumulate small noise directly in the inputs, and ii) the measurement of the quality of the adversarial texts is difficult. We tackle the first problem by searching for adversarials in the embedding space and then reconstruct the adversarial texts via nearest neighbor search. For the latter problem, we employ the Word Mover's Distance (WMD) to quantify the quality of adversarial texts. Through extensive experiments on three datasets, IMDB movie reviews, Reuters-2 and Reuters-5 newswires, we show that our framework can leverage gradient attacking methods to generate very high-quality adversarial texts that are only a few words different from the original texts. There are many cases where we can change one word to alter the label of the whole piece of text. We successfully incorporate FGM and DeepFool into our framework. In addition, we empirically show that WMD is closely related to the quality of adversarial texts.

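Motivation and Objectives

Address the difficulty of applying gradient attacks to discrete text inputs.
Develop a framework that operates in the embedding space and reconstructs text via nearest-neighbor search.
Quantify the quality of adversarial texts with Word Mover's Distance (WMD).
Integrate gradient methods such as FGM and DeepFool into the framework.
Show on standard datasets that changing only a few words can flip a text's label.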

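Proposed Method

Search for adversarial examples in the embedding space to sidestep the discrete-input problem.
Reconstruct the adversarial text via nearest-neighbor search, mapping each perturbed embedding back to a word.
Use Word Mover's Distance to quantify the quality of the adversarial texts.
Incorporate gradient-based attacks such as FGM and DeepFool into the text framework.
Evaluate effectiveness on the IMDB, Reuters-2, and Reuters-5 datasets.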
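The first two steps above (perturb in the embedding space, then map back to words by nearest-neighbor search) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the vocabulary, the 2-D embedding values, the mean-pooled linear classifier, and every function name are hypothetical, and the attack step is a generic fast-gradient step along the L2-normalized loss gradient.

```python
import numpy as np

# Hypothetical toy vocabulary and 2-D embeddings (illustrative values only).
VOCAB = ["good", "great", "bad", "awful", "movie"]
EMB = np.array([
    [1.0, 0.2],    # good
    [1.2, 0.1],    # great
    [-1.0, 0.2],   # bad
    [-1.2, 0.1],   # awful
    [0.0, 3.0],    # movie
])
W = np.array([2.0, 0.0])  # weights of an assumed toy linear classifier
B = 0.0


def embed(words):
    """Look up the embedding of each word (shape: n_words x dim)."""
    return EMB[[VOCAB.index(w) for w in words]]


def predict_proba(x):
    """P(label = 1) computed from the mean of the word embeddings."""
    return 1.0 / (1.0 + np.exp(-(x.mean(axis=0) @ W + B)))


def loss_gradient(x, label):
    """Gradient of binary cross-entropy w.r.t. every word embedding."""
    p = predict_proba(x)
    # d loss / d logit = p - label, and d logit / d x_i = W / n_words.
    return np.tile((p - label) * W / len(x), (len(x), 1))


def fgm_step(x, label, eps):
    """One fast-gradient step of size eps along the normalized gradient."""
    g = loss_gradient(x, label)
    norm = np.linalg.norm(g)
    return x if norm == 0 else x + eps * g / norm


def nearest_words(x):
    """Map each perturbed embedding to its closest vocabulary word."""
    dists = np.linalg.norm(x[:, None, :] - EMB[None, :, :], axis=2)
    return [VOCAB[i] for i in dists.argmin(axis=1)]


def attack(words, label, eps):
    """Perturb in embedding space, then reconstruct discrete text."""
    return nearest_words(fgm_step(embed(words), label, eps))


original = ["good", "movie"]
adversarial = attack(original, label=1, eps=2.0)
print(original, predict_proba(embed(original)))
print(adversarial, predict_proba(embed(adversarial)))
```

In this toy setup the perturbation pushes "good" closer to "bad" than to itself, while "movie" stays nearest to its own embedding, so only one word changes. With a real model the gradient would come from automatic differentiation rather than the closed form used here.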

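Experimental Results

Research Questions

RQ1: How can gradient-based adversarial attacks be applied to the discrete text domain?
RQ2: How effective are embedding-space adversarial examples once they are reconstructed into text?
RQ3: What is the relationship between Word Mover's Distance and the perceived quality of adversarial texts?
RQ4: How many word changes are typically needed to change a text's label on standard datasets?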

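Key Findings

The framework can generate high-quality adversarial texts with only a few word changes.
There are many cases where changing a single word flips the label of the whole text.
WMD is closely related to the perceived quality of adversarial texts.
FGM and DeepFool can be successfully incorporated into the framework.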
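
For reference, the quality metric mentioned above can be written out. The following is the standard definition of Word Mover's Distance as an optimal-transport problem, not a formula taken from this review; the symbols are assumptions introduced here: $d$ and $d'$ are the normalized bag-of-words vectors of the original and adversarial texts over a vocabulary of $n$ words, $x_i$ is the embedding of word $i$, and $T_{ij}$ is the amount of word $i$'s mass moved to word $j$.

```latex
\mathrm{WMD}(d, d') \;=\; \min_{T \ge 0} \; \sum_{i=1}^{n} \sum_{j=1}^{n} T_{ij} \, \lVert x_i - x_j \rVert_2
\quad \text{subject to} \quad
\sum_{j=1}^{n} T_{ij} = d_i \;\; \forall i,
\qquad
\sum_{i=1}^{n} T_{ij} = d'_j \;\; \forall j .
```

Under this definition, replacing one word with a close neighbor in the embedding space moves little mass over a short distance, which is why a small WMD is a plausible proxy for a high-quality adversarial text.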

This review was generated by AI and checked by a human editor.