QUICK REVIEW

[Paper Review] Defining Locality for Surrogates in Post-hoc Interpretablity

Thibault Laugel, Xavier Renard|arXiv (Cornell University)|Jun 19, 2018

Explainable Artificial Intelligence (XAI)|References: 5|Citations: 46

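One-line summary

The paper critically examines how the locality of the sampled neighborhood affects fidelity, and introduces Local Surrogate (LS), which samples around the decision boundary and improves on LIME's local explanations across several datasets.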

ABSTRACT

Local surrogate models, to approximate the local decision boundary of a black-box classifier, constitute one approach to generate explanations for the rationale behind an individual prediction made by the black-box. This paper highlights the importance of defining the right locality, the neighborhood on which a local surrogate is trained, in order to approximate accurately the local black-box decision boundary. Unfortunately, as shown in this paper, this issue is not only a parameter or sampling distribution challenge and has a major impact on the relevance and quality of the approximation of the local black-box decision boundary and thus on the meaning and accuracy of the generated explanation. To overcome the identified problems, quantified with an adapted measure and procedure, we propose to generate surrogate-based explanations for individual predictions based on a sampling centered on particular place of the decision boundary, relevant for the prediction to be explained, rather than on the prediction itself as it is classically done. We evaluate the novel approach compared to state-of-the-art methods and a straightforward improvement thereof on four UCI datasets.

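Motivation and objectives

Motivates why locality matters for local surrogate explanations of individual predictions.
Shows that standard sampling (as in LIME) can hide features that are influential locally.
Proposes a sampling strategy that targets the decision boundary in order to improve local fidelity.
Shows that LS achieves higher local fidelity than LIME on synthetic and real datasets.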

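Proposed method

Describes the three-step local surrogate process: sample a training space, fit an interpretable surrogate, and extract the explanation.
Analyzes LIME's sampling and weighting scheme and shows that it can emphasize global features over local ones.
Introduces Local Surrogate (LS): uses GrowingSpheres to find the closest decision boundary point, then samples around that boundary point to train the surrogate.
Defines local fidelity as the accuracy of the surrogate s_x with respect to the black-box b, measured on instances inside a local region of radius r_fid around x.
Compares LS with LIME and LIME-K on synthetic data (half-moons) and four UCI datasets, using Local Fidelity as the evaluation metric.
Reports that LS achieves higher local fidelity on all datasets.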
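
The steps listed above can be sketched in a few dozen lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: `nearest_boundary_point` is a simplified stand-in for GrowingSpheres (it only grows a ball until a point of another class appears), `fit_centroid_surrogate` is a toy nearest-centroid model used in place of the paper's interpretable surrogate, and the black-box classifier, the instance `x`, and the radii 0.3 and 0.5 are all hypothetical values chosen for the demo.

```python
import math
import random

def sample_in_ball(center, radius, n, rng):
    """Draw n points uniformly from the ball of the given radius around center."""
    d = len(center)
    points = []
    for _ in range(n):
        direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in direction)) or 1.0
        # u ** (1/d) spreads the radii so that points are uniform in volume
        scale = radius * rng.random() ** (1.0 / d) / norm
        points.append([c + scale * v for c, v in zip(center, direction)])
    return points

def nearest_boundary_point(black_box, x, rng, step=0.1, n=200, max_radius=10.0):
    """Simplified stand-in for GrowingSpheres (assumption): grow a ball around x
    until it contains a point of another class, then return the closest one."""
    label = black_box(x)
    radius = step
    while radius <= max_radius:
        others = [p for p in sample_in_ball(x, radius, n, rng)
                  if black_box(p) != label]
        if others:
            return min(others, key=lambda p: math.dist(p, x))
        radius += step
    raise ValueError("no point of another class found within max_radius")

def fit_centroid_surrogate(black_box, points):
    """Toy linear surrogate (assumption): label a point by the nearer class centroid."""
    by_class = {}
    for p in points:
        by_class.setdefault(black_box(p), []).append(p)
    centroids = {c: [sum(col) / len(ps) for col in zip(*ps)]
                 for c, ps in by_class.items()}
    return lambda p: min(centroids, key=lambda c: math.dist(p, centroids[c]))

def local_fidelity(black_box, surrogate, x, r_fid, rng, n=1000):
    """Share of points within radius r_fid of x where surrogate and black-box agree."""
    points = sample_in_ball(x, r_fid, n, rng)
    return sum(black_box(p) == surrogate(p) for p in points) / n

rng = random.Random(0)
black_box = lambda p: int(p[0] ** 2 + p[1] ** 2 < 1.0)  # hypothetical classifier
x = [0.0, 0.7]                                          # instance to explain
border = nearest_boundary_point(black_box, x, rng)
# LS-style step: train the surrogate on samples drawn around the boundary point
surrogate = fit_centroid_surrogate(black_box, sample_in_ball(border, 0.3, 500, rng))
print(local_fidelity(black_box, surrogate, x, r_fid=0.5, rng=rng))
```

The key design point is that the training samples are centered on `border` rather than on `x`, while fidelity is still measured around `x`. Swapping `border` for `x` in the second-to-last statement gives an instance-centered baseline for comparison.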

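Experimental results

Research questions

RQ1: How does the definition of the neighborhood used to train a local surrogate affect how accurately the black-box decision boundary is approximated?
RQ2: Can sampling directly around the decision boundary improve local fidelity compared with global or instance-centered sampling?
RQ3: Do local surrogates trained with boundary-centered sampling outperform standard LIME-based approaches across diverse datasets?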

Main findings

Dataset	LIME	LIME-K	LS
half-moons	0.89 (0.07)	0.96 (0.06)	0.97 (0.03)
cancer	0.86 (0.07)	0.87 (0.07)	0.96 (0.02)
credit	0.67 (0.21)	0.70 (0.18)	0.85 (0.12)
news	0.64 (0.10)	0.67 (0.10)	0.79 (0.07)
tennis	0.85 (0.12)	0.83 (0.13)	0.98 (0.02)

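The local fidelity of surrogate explanations drops when sampling that emphasizes global features neglects local ones.
LIME's standard sampling can yield a surrogate decision boundary that matches global patterns rather than the true local boundary.
The boundary-centered sampling strategy (LS) improves local fidelity and gives more faithful local explanations.
Across datasets, LS achieves higher mean local fidelity (AUC) and lower variance than LIME and LIME-K.
In the reported table, LS consistently outperforms the alternatives on local fidelity.
The approach is validated on the half-moons dataset and four UCI datasets (Breast Cancer, Default of Credit Card Clients, Online News Popularity, Tennis).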
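
The comparison in the table can be checked with a short script. The sketch below only re-reads the reported numbers; the dictionary layout and the `summarize` helper are illustrative, and the values in parentheses are treated as a spread measure without assuming which statistic they are.

```python
# Reported local fidelity per dataset: (mean, value in parentheses)
results = {
    "half-moons": {"LIME": (0.89, 0.07), "LIME-K": (0.96, 0.06), "LS": (0.97, 0.03)},
    "cancer":     {"LIME": (0.86, 0.07), "LIME-K": (0.87, 0.07), "LS": (0.96, 0.02)},
    "credit":     {"LIME": (0.67, 0.21), "LIME-K": (0.70, 0.18), "LS": (0.85, 0.12)},
    "news":       {"LIME": (0.64, 0.10), "LIME-K": (0.67, 0.10), "LS": (0.79, 0.07)},
    "tennis":     {"LIME": (0.85, 0.12), "LIME-K": (0.83, 0.13), "LS": (0.98, 0.02)},
}

def summarize(table):
    """Per dataset: best method by mean, tightest by spread, and LS gain over LIME."""
    summary = {}
    for name, row in table.items():
        summary[name] = {
            "best": max(row, key=lambda m: row[m][0]),
            "tightest": min(row, key=lambda m: row[m][1]),
            "ls_gain_over_lime": round(row["LS"][0] - row["LIME"][0], 2),
        }
    return summary

summary = summarize(results)
for name, info in summary.items():
    print(name, info)
```

On these numbers LS has the highest mean and the smallest spread on every dataset, and its gain over LIME ranges from 0.08 (half-moons) to 0.18 (credit).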

This review was generated by AI and checked by a human editor.