QUICK REVIEW

[Paper Review] "Why Should I Trust You?": Explaining the Predictions of Any Classifier

Marco Túlio Ribeiro, Sameer Singh|arXiv (Cornell University)|Feb 16, 2016

Adversarial Robustness in Machine Learning | References: 25 | Citations: 346

One-Sentence Summary

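This paper introduces LIME, a model-agnostic method that explains individual predictions with locally faithful, interpretable surrogate models, and SP-LIME, which selects representative explanations for assessing a model globally. Experiments with both human subjects and simulated users show gains in fidelity and trust for text and image classifiers.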

ABSTRACT

Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

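Research Motivation and Objectives

Motivate the need for explanations in order to build trust in predictions and in models deployed in the real world.
Propose LIME, which explains any classifier by learning an interpretable model that is locally faithful around the prediction.
Introduce SP-LIME, which selects a diverse, representative set of explanations to support trust in the model as a whole.
Demonstrate the usefulness of explanations through simulated and human-subject studies on trust-related tasks.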

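Proposed Method

Define an interpretable representation of the input (presence or absence of words for text; superpixels for images).
Formulate an explanation as a model g, drawn from a family G of simple, interpretable models, that locally approximates the black box f around an instance x.
Obtain the explanation ξ(x) by minimizing, over g in G, the locality-weighted loss L(f, g, πx) plus the complexity penalty Ω(g).
Fit the local surrogate to the outputs of f using perturbations around x′ and a proximity kernel πx.
For text and images, specialize to sparse linear explanations (g(z′) = w·z′), using an L2 loss and an L1-based sparsity step (K-LASSO).
Present a practical algorithm (Algorithm 1) and discuss the trade-off between complexity and interpretability.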
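The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation or the API of the `lime` package: the names `explain_instance`, `weighted_linear_fit`, and `toy_black_box` are hypothetical, the distance inside the exponential kernel is a normalized count of switched-off components, the kernel width of 0.25 is arbitrary, and the K-LASSO selection step is replaced by a simpler stand-in (keep the K largest weights from a full weighted fit, then refit on those features only).

```python
import numpy as np

def weighted_linear_fit(Z, y, sample_weights):
    """Weighted least squares with an intercept; returns (intercept, coefs)."""
    root_w = np.sqrt(sample_weights)[:, None]
    design = np.hstack([np.ones((Z.shape[0], 1)), Z]) * root_w
    target = y * root_w[:, 0]
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    return solution[0], solution[1:]

def explain_instance(predict_fn, num_features, k=2, num_samples=2000,
                     kernel_width=0.25, seed=0):
    """Sparse local linear explanation around x' = all-ones (all components present)."""
    rng = np.random.default_rng(seed)
    # Perturb x' by switching interpretable components off at random.
    Z = rng.integers(0, 2, size=(num_samples, num_features)).astype(float)
    Z[0] = 1.0  # keep the unperturbed instance in the sample
    y = predict_fn(Z)  # black-box outputs f(z)
    # Proximity kernel pi_x: exponential kernel on the fraction of
    # components switched off (an assumed distance, chosen for simplicity).
    dist = (1.0 - Z).sum(axis=1) / num_features
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Stand-in for K-LASSO: rank features by a full weighted fit, keep the top k.
    _, all_coefs = weighted_linear_fit(Z, y, weights)
    selected = np.argsort(-np.abs(all_coefs))[:k]
    # Refit on the selected features only to get the explanation weights w.
    _, coefs = weighted_linear_fit(Z[:, selected], y, weights)
    order = np.argsort(-np.abs(coefs))
    return [(int(selected[i]), float(coefs[i])) for i in order]

def toy_black_box(Z):
    """Hypothetical black box that depends only on components 0 and 2."""
    return 0.2 + 0.5 * Z[:, 0] - 0.3 * Z[:, 2]

explanation = explain_instance(toy_black_box, num_features=5, k=2)
print(explanation)  # components 0 and 2, with weights close to 0.5 and -0.3
```

Because the toy black box is itself linear, the surrogate recovers its weights almost exactly; with a real classifier the fit is only a local approximation, and its quality depends on the kernel width and the number of samples.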

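Experimental Results

Research Questions

RQ1: Do explanations faithfully reflect the model's behavior on individual predictions?
RQ2: Do explanations help users trust predictions and choose between models?
RQ3: Can an understanding of the whole model be built from a small, non-redundant set of explanations?
RQ4: Can a model-agnostic explainer explain diverse models (text, image, neural networks)?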

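Key Findings

LIME explanations achieve high fidelity to the underlying model within the local neighborhood (e.g., >90% recall of the truly important features on two interpretable classifiers).
Explanations increase trust in individual predictions and improve decisions about whether to use a model, helping users avoid untrustworthy models.
SP-LIME (submodular pick) selects a diverse, representative set of explanations, improving tasks such as model comparison and trust-based model selection.
Qualitative examples show intuitive, human-understandable attributions, such as the words or superpixels that contribute to a class.
Simulated and human-subject experiments show that explanations support tasks such as predicting which classifier generalizes better and guiding feature engineering.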
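The submodular pick step behind SP-LIME can be illustrated with a short greedy routine. This is a sketch under assumptions rather than the paper's code: `submodular_pick` and the small matrix `W` are hypothetical, rows of `W` are taken to be per-instance explanation weights over interpretable features, and global feature importance is assumed to be the square root of the summed absolute weights (the form the paper suggests for text). Each round picks the instance whose explanation covers the most not-yet-covered importance.

```python
import numpy as np

def submodular_pick(W, budget):
    """Greedily choose up to `budget` rows of W that maximize weighted feature coverage."""
    W = np.abs(np.asarray(W, dtype=float))
    importance = np.sqrt(W.sum(axis=0))      # global importance per feature
    covered = np.zeros(W.shape[1], dtype=bool)
    chosen = []
    for _ in range(min(budget, W.shape[0])):
        best_gain, best_row = -1.0, None
        for i in range(W.shape[0]):
            if i in chosen:
                continue
            # Marginal gain: importance of features this row adds to the cover.
            gain = importance[(W[i] > 0) & ~covered].sum()
            if gain > best_gain:
                best_gain, best_row = gain, i
        chosen.append(best_row)
        covered |= W[best_row] > 0
    return chosen

# Toy explanation matrix: rows 0 and 1 are redundant with each other.
W = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]
print(submodular_pick(W, budget=2))  # [0, 2]
```

Row 1 is skipped in the second round because it adds no new features, which is the non-redundancy property the review describes.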


This review was generated by AI and checked by a human editor.