QUICK REVIEW

[Paper Review] Do Explanations Reflect Decisions? A Machine-centric Strategy to Quantify the Performance of Explainability Algorithms

Zhong Qiu Lin, Mohammad Javad Shafiee|arXiv (Cornell University)|Oct 16, 2019

Adversarial Robustness in Machine Learning | References: 22 | Citations: 68

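One-line summary

The paper proposes a machine-centric framework (Impact Score and Impact Coverage) and uses it to quantify how strongly the critical factors identified by explainability methods such as LIME, SHAP, Expected Gradients, and GSInquire actually affect a neural network's decisions, under both normal and adversarial conditions, on a ResNet-50 with ImageNet data.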

ABSTRACT

There has been a significant surge of interest recently around the concept of explainable artificial intelligence (XAI), where the goal is to produce an interpretation for a decision made by a machine learning algorithm. Of particular interest is the interpretation of how deep neural networks make decisions, given the complexity and 'black box' nature of such networks. Given the infancy of the field, there has been very limited exploration into the assessment of the performance of explainability methods, with most evaluations centered around subjective visual interpretation of the produced interpretations. In this study, we explore a more machine-centric strategy for quantifying the performance of explainability methods on deep neural networks via the notion of decision-making impact analysis. We introduce two quantitative performance metrics: i) Impact Score, which assesses the percentage of critical factors with either strong confidence reduction impact or decision changing impact, and ii) Impact Coverage, which assesses the percentage coverage of adversarially impacted factors in the input. A comprehensive analysis using this approach was conducted on several state-of-the-art explainability methods (LIME, SHAP, Expected Gradients, GSInquire) on a ResNet-50 deep convolutional neural network using a subset of ImageNet for the task of image classification. Experimental results show that the critical regions identified by LIME within the tested images had the lowest impact on the decision-making process of the network (~38%), with progressive increase in decision-making impact for SHAP (~44%), Expected Gradients (~51%), and GSInquire (~76%). While by no means perfect, the hope is that the proposed machine-centric strategy helps push the conversation forward towards better metrics for evaluating explainability methods and improve trust in deep neural networks.

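Motivation and objectives

Motivate a machine-centric, quantitative evaluation of explainability methods that goes beyond subjective visual interpretation.
Define metrics (Impact Score and Impact Coverage) that measure how the identified critical factors affect the network's decisions and confidence.
Systematically compare state-of-the-art explainability methods (LIME, SHAP, Expected Gradients, GSInquire) on an image classification task under both normal and adversarial conditions.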

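Proposed method

A critical factor c identified by an explainability method M is counted as impactful if removing c either (i) changes the network's decision or (ii) lowers the decision confidence z by at least the threshold tau (0.5).
The Impact Score I is the average, over inputs, of an indicator that is 1 when the decision changed without c or the confidence dropped by at least tau.
The stricter variant, I_strict, counts only decision changes and excludes the confidence-drop criterion.
The Adversarial Impact Coverage, I_coverage, is the intersection over union between the adversarially impacted factors and the identified critical factors, averaged over inputs.
The four explainability methods (LIME, SHAP, Expected Gradients, GSInquire) are evaluated on a ResNet-50 using a subset of ImageNet, under both the general condition and the adversarial patch condition.
I, I_strict, and I_coverage are then used to compare performance in the general and adversarial conditions.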
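The three metrics described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `impact_scores` and `impact_coverage` are hypothetical, factors are represented as sets of pixel coordinates, and the confidence criterion is read as a relative drop (the confidence after removal is at most `(1 - tau)` times the original). The review does not say whether the drop is relative or absolute, so that reading is an assumption.

```python
from typing import Hashable, Sequence, Set, Tuple


def impact_scores(
    orig_labels: Sequence[Hashable],
    masked_labels: Sequence[Hashable],
    orig_conf: Sequence[float],
    masked_conf: Sequence[float],
    tau: float = 0.5,
) -> Tuple[float, float]:
    """Return (I, I_strict) as fractions in [0, 1].

    "masked" values are the network's outputs after the critical
    factors were removed from each input.
    """
    n = len(orig_labels)
    if n == 0:
        raise ValueError("need at least one input")
    changed_count = 0   # decision changed (counts toward I and I_strict)
    impacted_count = 0  # decision changed OR confidence dropped (I only)
    for y, y_m, z, z_m in zip(orig_labels, masked_labels, orig_conf, masked_conf):
        changed = y_m != y
        # Assumption: a relative drop of at least tau in confidence.
        dropped = z_m <= (1.0 - tau) * z
        changed_count += int(changed)
        impacted_count += int(changed or dropped)
    return impacted_count / n, changed_count / n


def impact_coverage(
    critical: Sequence[Set[Tuple[int, int]]],
    adversarial: Sequence[Set[Tuple[int, int]]],
) -> float:
    """Mean intersection over union of critical vs. adversarial factors."""
    if len(critical) == 0:
        raise ValueError("need at least one input")
    ious = []
    for c, a in zip(critical, adversarial):
        union = c | a
        # Define IoU as 0 when both sets are empty to avoid dividing by zero.
        ious.append(len(c & a) / len(union) if union else 0.0)
    return sum(ious) / len(ious)


# Toy usage with made-up values (not data from the paper).
i_score, i_strict = impact_scores(
    orig_labels=[1, 2, 3, 4],
    masked_labels=[1, 5, 3, 4],
    orig_conf=[0.9, 0.8, 0.8, 0.6],
    masked_conf=[0.85, 0.3, 0.3, 0.5],
)
coverage = impact_coverage(
    critical=[{(0, 0), (0, 1)}, {(0, 0)}],
    adversarial=[{(0, 0), (1, 0)}, {(0, 0)}],
)
```

In the toy data, one of four decisions changes and one further input loses more than half its confidence, so I is 0.5 while I_strict is 0.25. By construction I_strict can never exceed I.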

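Experimental results

Research questions

RQ1: How well do the critical factors identified by different explainability methods reflect the neural network's actual decision-making process?
RQ2: Do newer methods based on gradient information (e.g., GSInquire, Expected Gradients) produce explanations with a higher impact on decisions and on confidence than proxy methods (LIME, SHAP)?
RQ3: How do decision-making impact and the coverage of adversarially impacted factors change in the adversarial scenario?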

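Key findings

GSInquire has the highest Impact Score (I ≈ 76.10%) and also the highest strict score, which counts decision changes only (I_strict ≈ 50.73%), outperforming the other methods.
Expected Gradients has a stronger impact than SHAP, with I ≈ 51.22% and I_strict ≈ 47.80% in the general scenario.
SHAP outperforms LIME but trails GSInquire and Expected Gradients in decision-making impact (I ≈ 44.15%, I_strict ≈ 40.24%).
LIME has the lowest impact metrics of the four methods in the general scenario (I ≈ 38.05%, I_strict ≈ 35.12%).
In the adversarial scenario, LIME has the lowest I, I_strict, and I_coverage, while GSInquire has the highest I, I_strict, and I_coverage across all patch scales, making it the best of the four at locating adversarially impacted regions.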
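One derived reading of the general-scenario numbers above: if I counts inputs with either a decision change or a strong confidence drop, and I_strict counts decision changes only (as this review defines them), then I minus I_strict is the share of inputs where removing the critical factors lowered confidence strongly without flipping the decision. The short sketch below computes that gap. The helper name `confidence_only_share` is hypothetical, and the interpretation holds only if the definitions given in this review match the paper's.

```python
# General-scenario scores in percent, copied from the findings above:
# method -> (I, I_strict)
scores = {
    "LIME": (38.05, 35.12),
    "SHAP": (44.15, 40.24),
    "Expected Gradients": (51.22, 47.80),
    "GSInquire": (76.10, 50.73),
}


def confidence_only_share(impact: float, strict: float) -> float:
    """Percentage points of I that come from confidence drops alone."""
    return round(impact - strict, 2)


gaps = {name: confidence_only_share(i, s) for name, (i, s) in scores.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} percentage points")
```

The gaps come out to 2.93 (LIME), 3.91 (SHAP), 3.42 (Expected Gradients), and 25.37 (GSInquire) percentage points. Under this reading, most of GSInquire's lead in I over Expected Gradients comes from cases where confidence falls sharply but the predicted class stays the same.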


This review was generated by AI and checked by a human editor.