QUICK REVIEW

[Paper Review] Metrics for Explainable AI: Challenges and Prospects

Robert R. Hoffman, Shane T. Mueller|arXiv (Cornell University)|Dec 11, 2018

Explainable Artificial Intelligence (XAI)|References: 107|Citations: 220

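One-Line Summary

This paper explores how to evaluate XAI by examining methods for measuring explanation quality, user satisfaction and understanding, curiosity-driven search for explanations, appropriate trust and reliance, and the performance of the human-XAI work system as a whole. It draws on psychometric evaluation and an integration of the research literature.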

ABSTRACT

The question addressed in this paper is: If we present to a user an AI system that explains how it works, how do we know whether the explanation works and the user has achieved a pragmatic understanding of the AI? In other words, how do we know that an explainable AI system (XAI) is any good? Our focus is on the key concepts of measurement. We discuss specific methods for evaluating: (1) the goodness of explanations, (2) whether users are satisfied by explanations, (3) how well users understand the AI systems, (4) how curiosity motivates the search for explanations, (5) whether the user's trust and reliance on the AI are appropriate, and finally, (6) how the human-XAI work system performs. The recommendations we present derive from our integration of extensive research literatures and our own psychometric evaluations.

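Motivation and Objectives

Motivate the need for measurement in XAI (Explainable AI) so that users reach a pragmatic understanding of the AI.
Identify the main measurement targets for XAI evaluation: explanation goodness, user satisfaction, understanding, curiosity, trust and reliance, and system performance.
Integrate insights from the broad research literature and psychometric studies to guide evaluation practice.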

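Proposed Approach

Integrate diverse research literatures on XAI and measurement.
Propose evaluation concepts and domains grounded in psychometric evaluations and user studies.
Provide recommendations for multifaceted evaluation of the user-AI explanation loop.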
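
To make "psychometric evaluation" a little more concrete, here is a minimal sketch of how questionnaire responses about explanation satisfaction might be scored and checked for internal consistency. This is an illustration, not the authors' procedure: the review does not say which reliability statistic the paper uses, so Cronbach's alpha is swapped in as a common choice. The function names `scale_scores` and `cronbach_alpha` are hypothetical, the 5-point Likert format is an assumption, and the ratings are made-up data.

```python
from statistics import mean, variance


def scale_scores(responses):
    """Return each participant's mean rating across all questionnaire items.

    `responses` is a list of rows, one row per participant, each row holding
    that participant's ratings (assumed here to be 1-5 Likert values).
    """
    return [mean(row) for row in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha, a standard internal-consistency estimate.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    where k is the number of items. Sample variances are used throughout.
    """
    if len(responses) < 2:
        raise ValueError("need at least two participants")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every participant must answer every item")

    # Transpose rows (participants) into columns (items).
    items = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical ratings: 4 participants x 3 satisfaction items.
    ratings = [
        [4, 5, 4],
        [2, 2, 3],
        [5, 4, 5],
        [3, 3, 2],
    ]
    print("per-participant scores:", scale_scores(ratings))
    print("Cronbach's alpha:", round(cronbach_alpha(ratings), 3))
```

A higher alpha suggests that the items vary together and plausibly measure one construct. It says nothing on its own about whether the scale is valid for the other measurement targets listed here, such as understanding or trust.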

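Experimental Results

Research Questions

RQ1: How can the goodness of explanations provided by an AI system be evaluated?
RQ2: How satisfied are users with the explanations, and how does that satisfaction relate to their understanding?
RQ3: To what extent do explanations stimulate curiosity and prompt further information seeking?
RQ4: Given the explanations, are users' trust in and reliance on the AI system appropriate?
RQ5: How should the human-XAI work system be evaluated as a whole?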

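Key Findings

Provides a set of measurement targets covering explanation quality, user satisfaction, understanding, curiosity, trust and reliance, and system-level performance.
Argues for integrating the broad research literature with psychometric evaluations, and derives practical recommendations for XAI measurement from that integration.
Offers concrete recommendations for evaluating XAI effectiveness, based on the integrated evidence and evaluation paradigms.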

This review was generated by AI and checked by a human editor.