QUICK REVIEW

[Paper Review] Evaluating Explanation Without Ground Truth in Interpretable Machine Learning

Fan Yang, Mengnan Du | arXiv (Cornell University) | Jul 16, 2019

Explainable Artificial Intelligence (XAI) | References: 70 | Citations: 38

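One-Sentence Summary

This paper defines the problem of evaluating explanations in interpretable machine learning when no ground-truth explanations are available, reviews existing approaches, and proposes three evaluation criteria (generalizability, fidelity, and persuasibility) together with a unified hierarchical evaluation framework.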

ABSTRACT

Interpretable Machine Learning (IML) has become increasingly important in many real-world applications, such as autonomous cars and medical diagnosis, where explanations are significantly preferred to help people better understand how machine learning systems work and further enhance their trust towards systems. However, due to the diversified scenarios and subjective nature of explanations, we rarely have the ground truth for benchmark evaluation in IML on the quality of generated explanations. Having a sense of explanation quality not only matters for assessing system boundaries, but also helps to realize the true benefits to human users in practical settings. To benchmark the evaluation in IML, in this article, we rigorously define the problem of evaluating explanations, and systematically review the existing efforts from state-of-the-arts. Specifically, we summarize three general aspects of explanation (i.e., generalizability, fidelity and persuasibility) with formal definitions, and respectively review the representative methodologies for each of them under different tasks. Further, a unified evaluation framework is designed according to the hierarchical needs from developers and end-users, which could be easily adopted for different scenarios in practice. In the end, open problems are discussed, and several limitations of current evaluation techniques are raised for future explorations.

Motivation and Objectives

Clarify the problem of evaluating explanations in IML when no ground-truth explanations are available.
Define three core properties of explanations: generalizability, fidelity, and persuasibility.
Review existing evaluation methods across different explanation types and applications.
Propose a unified, hierarchical evaluation framework aligned with developer and end-user needs.

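Proposed Method

Classify explanations along a two-dimensional scheme: scope of interpretation (global vs. local) and manner of interpretation (intrinsic vs. posthoc).
Give formal definitions of generalizability, fidelity, and persuasibility.
Systematically review the existing evaluation methodologies for each property across tasks.
Propose a unified hierarchical evaluation framework with three tiers corresponding to generalizability, fidelity, and persuasibility.
Discuss open problems, limitations, and future directions for benchmarking explanation evaluation.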

Experimental Results

Research Questions

RQ1: How can explanations in IML be evaluated when there is no ground-truth explanation?
RQ2: What formal properties best capture explanation quality in IML across tasks?
RQ3: How can a unified framework support benchmarking explanations for both developers and end-users?
RQ4: What are the key open problems and limitations in current evaluation techniques for explanations?
RQ5: How should evaluation frameworks handle local vs. global and intrinsic vs. posthoc explanations?

Key Findings

Three general properties—generalizability, fidelity, and persuasibility—are defined as core criteria for evaluating IML explanations.
Generalizability can align with traditional model evaluation for intrinsic-global explanations and with surrogate proxies for posthoc-global explanations.
Fidelity measures the faithfulness of explanations to the target system, with ablation/perturbation methods used for posthoc-local explanations.
Persuasibility assesses human usefulness and comprehensibility, often requiring human studies or annotations.
A unified hierarchical framework is proposed to organize evaluation from bottom (generalizability) to top (persuasibility), tailored to developer vs. end-user needs.
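To make the ablation/perturbation idea behind fidelity concrete, here is a minimal sketch. It is an illustration, not the paper's procedure. The toy logistic model, the zero baseline for "removed" features, and the helper names (`predict`, `ablate`, `ablation_drop`) are all assumptions introduced here. The idea: remove the features an explanation ranks highest and measure how much the model's output drops. A more faithful explanation should produce a larger drop.

```python
import math

def predict(weights, bias, x):
    """Toy target model (assumed for illustration): logistic score in (0, 1)."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def ablate(x, indices, baseline=0.0):
    """Return a copy of x with the given feature indices set to a baseline value."""
    removed = set(indices)
    return [baseline if i in removed else v for i, v in enumerate(x)]

def ablation_drop(weights, bias, x, attributions, k):
    """Prediction drop after ablating the k features the explanation ranks highest."""
    ranked = sorted(range(len(x)), key=lambda i: attributions[i], reverse=True)
    original = predict(weights, bias, x)
    perturbed = predict(weights, bias, ablate(x, ranked[:k]))
    return original - perturbed

# Hypothetical model and input; the last feature has zero weight (irrelevant).
weights = [2.0, -1.0, 0.5, 0.0]
bias = 0.0
x = [1.0, 1.0, 1.0, 1.0]

# Explanation A: per-feature contribution w_i * x_i, which matches this linear model.
faithful = [w * v for w, v in zip(weights, x)]
# Explanation B: arbitrary scores that rank the irrelevant feature first.
unfaithful = [0.0, 0.5, 0.1, 2.0]

drop_faithful = ablation_drop(weights, bias, x, faithful, k=1)
drop_unfaithful = ablation_drop(weights, bias, x, unfaithful, k=1)

print(f"drop (faithful explanation):   {drop_faithful:.3f}")
print(f"drop (unfaithful explanation): {drop_unfaithful:.3f}")
```

In this toy setup, ablating the top feature of explanation A changes the prediction noticeably, while ablating the top feature of explanation B leaves it unchanged, because that feature carries zero weight. In practice the choice of baseline value and of k affects the score, so both should be reported alongside any such measurement.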


This review was generated by AI and checked by a human editor.