QUICK REVIEW

[Paper Review] Measurable Counterfactual Local Explanations for Any Classifier

Adam White, Artur d’Avila Garcez|arXiv (Cornell University)|Aug 8, 2019

Explainable Artificial Intelligence (XAI)|References: 20|Citations: 54

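One-line summary

CLEAR explains predictions by generating b-counterfactuals and fitting regression-based local models whose fidelity to the underlying classifier is measured; it improves on LIME across five datasets.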

ABSTRACT

We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if 'things had been different'. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural networks' knowledge extraction. We introduce a definition of fidelity to the underlying classifier for local explanation models which is based on distances to a target decision boundary. A system called CLEAR: Counterfactual Local Explanations via Regression, is introduced and evaluated. CLEAR generates w-counterfactual explanations that state minimum changes necessary to flip a prediction's classification. CLEAR then builds local regression models, using the w-counterfactuals to measure and improve the fidelity of its regressions. By contrast, the popular LIME method, which also uses regression to generate local explanations, neither measures its own fidelity nor generates counterfactuals. CLEAR's regressions are found to have significantly higher fidelity than LIME's, averaging over 45% higher in this paper's four case studies.

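Research motivation and objectives

Focus on counterfactuals and fidelity as the basis for trustworthy explanations of predictions in critical domains.
Define and quantify the fidelity of a local explanation to the underlying classifier.
Develop CLEAR, which generates b-counterfactuals and builds regression models that reflect the local decision boundary.
Show that CLEAR achieves higher fidelity than LIME on multiple datasets.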

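Proposed method

Define b-counterfactual perturbations as the minimum feature changes that flip the predicted class.
Generate labeled synthetic observations around the target instance.
Build a balanced neighborhood that spans the region around the decision boundary.
Fit a local regression model that passes through the instance (possibly including quadratic and interaction terms).
Use the regression to estimate b-perturbations, and compute the fidelity error against the true b-perturbations.
To improve fidelity, iteratively adjust the regression specification and, where needed, add weighted b-counterfactuals.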
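
The steps above can be illustrated with a minimal, single-feature sketch. It is not the authors' implementation: `toy_score`, the grid search, the neighborhood width, and the 0.5 threshold are all assumptions chosen so that the arithmetic can be checked by hand. The sketch searches only for increases in the feature, fits a regression forced through the instance, and reports the fidelity error as the absolute gap between the estimated and the true b-perturbation.

```python
import math

def toy_score(x):
    """Hypothetical black-box score P(class 1) for one feature; not from the paper."""
    return min(1.0, max(0.0, 0.1 + 0.1 * x * x))

def true_b_perturbation(score, x0, step=0.001, max_steps=5000):
    """Smallest increase in x that moves the score across 0.5 (grid line search)."""
    start_side = score(x0) >= 0.5
    for k in range(1, max_steps + 1):
        if (score(x0 + k * step) >= 0.5) != start_side:
            return k * step
    return None

def fit_through_instance(score, x0, quadratic, half_width=1.5, n=61):
    """Least-squares fit of score(x0 + d) - score(x0) = b1*d (+ b2*d^2).

    The model has no intercept, so it passes through the instance itself.
    Synthetic points lie on a symmetric grid that spans the decision boundary.
    """
    y0 = score(x0)
    ds = [-half_width + 2.0 * half_width * i / (n - 1) for i in range(n)]
    rs = [score(x0 + d) - y0 for d in ds]
    s11 = sum(d ** 2 for d in ds)
    t1 = sum(d * r for d, r in zip(ds, rs))
    if not quadratic:
        return t1 / s11, 0.0
    # Solve the 2x2 normal equations for the linear and quadratic coefficients.
    s12 = sum(d ** 3 for d in ds)
    s22 = sum(d ** 4 for d in ds)
    t2 = sum(d * d * r for d, r in zip(ds, rs))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det

def estimated_b_perturbation(y0, b1, b2, threshold=0.5):
    """Smallest positive d with y0 + b1*d + b2*d^2 = threshold, or None."""
    c = y0 - threshold
    if abs(b2) < 1e-12:
        roots = [-c / b1] if abs(b1) > 1e-12 else []
    else:
        disc = b1 * b1 - 4.0 * b2 * c
        if disc < 0:
            return None
        roots = [(-b1 + s * math.sqrt(disc)) / (2.0 * b2) for s in (1.0, -1.0)]
    positive = [r for r in roots if r > 0]
    return min(positive) if positive else None

def fidelity_errors(score, x0):
    """Return (true b-perturbation, linear-spec error, quadratic-spec error)."""
    true_d = true_b_perturbation(score, x0)
    errors = []
    for quadratic in (False, True):
        b1, b2 = fit_through_instance(score, x0, quadratic)
        est = estimated_b_perturbation(score(x0), b1, b2)
        ok = est is not None and true_d is not None
        errors.append(abs(est - true_d) if ok else math.inf)
    return true_d, errors[0], errors[1]

true_d, linear_error, quadratic_error = fidelity_errors(toy_score, 1.0)
print(f"true b-perturbation: {true_d:.3f}")
print(f"fidelity error, linear spec: {linear_error:.3f}")
print(f"fidelity error, quadratic spec: {quadratic_error:.3f}")
```

For the instance x = 1, the toy score crosses 0.5 at x = 2, so the true b-perturbation is about 1.0. The linear specification estimates roughly 1.5 and is off by about 0.5. The quadratic specification recovers the perturbation almost exactly, because the toy score is itself quadratic. This shows why adjusting the regression specification can reduce the fidelity error.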

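Experimental results

Research questions

RQ1: How can counterfactual explanations be generated from regression-based local models?
RQ2: Does a measure of fidelity to the classifier improve the trustworthiness of local explanations?
RQ3: Does including b-counterfactuals in the neighborhood improve the fidelity of local explanations compared with existing methods such as LIME?
RQ4: Which configuration choices (balanced neighborhoods, centering, quadratic/interaction terms, and so on) maximize fidelity across datasets?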

Key findings

Dataset	CLEAR (not using b-counterfactuals) fidelity	CLEAR (using b-counterfactuals) fidelity	LIME fidelity (baseline)
Adult	80% ± 0.9	80% ± 0.8	26% ± 0.6
Iris	80% ± 1.0	99.8% ± 0.1	30% ± 0.3
Pima	57% ± 0.8	77% ± 0.8	20% ± 0.4
Credit	39% ± 1.3	55% ± 1.7	12% ± 0.5
Breast	54% ± 1.1	81% ± 1.3	14% ± 0.3

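CLEAR consistently outperforms LIME on fidelity across the five datasets, averaging more than 40% higher fidelity.
Balanced neighborhoods, centering (forcing the regression through x), and including quadratic and interaction terms all increase fidelity.
Including b-counterfactuals in the neighborhood improves fidelity further.
CLEAR's fidelity is a stricter measure than plain classification accuracy, and it exposes gaps in LIME's explanations.
The best configuration varies by dataset (for example, logistic regression versus multiple regression).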
A CLEAR prototype provides interpretable reports with adjustable complexity to balance fidelity and interpretability.
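
The average gap over LIME can be recomputed from the table as a quick check. The sketch below assumes the "over 40%" figure is a difference in percentage points between mean fidelities, which the review does not state explicitly. The values are copied from the table, and the ± terms are ignored.

```python
from statistics import mean

# Fidelity percentages copied from the table above:
# (CLEAR without b-counterfactuals, CLEAR with b-counterfactuals, LIME)
FIDELITY = {
    "Adult": (80.0, 80.0, 26.0),
    "Iris": (80.0, 99.8, 30.0),
    "Pima": (57.0, 77.0, 20.0),
    "Credit": (39.0, 55.0, 12.0),
    "Breast": (54.0, 81.0, 14.0),
}

def average_gaps(table):
    """Mean fidelity gap over LIME, in percentage points, for both CLEAR variants."""
    plain = mean(row[0] for row in table.values())
    with_b = mean(row[1] for row in table.values())
    lime = mean(row[2] for row in table.values())
    return plain - lime, with_b - lime

gap_plain, gap_with_b = average_gaps(FIDELITY)
print(f"gap without b-counterfactuals: {gap_plain:.2f} points")
print(f"gap with b-counterfactuals: {gap_with_b:.2f} points")
```

Under that reading, the gap is about 41.6 points without b-counterfactuals and about 58.2 points with them, so both variants clear the 40-point mark on these five datasets.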

This review was generated by AI and checked by a human editor.