QUICK REVIEW

[Paper Review] An Evaluation of the Human-Interpretability of Explanation

Isaac Lage, Emily Chen|arXiv (Cornell University)|Jan 31, 2019

Explainable Artificial Intelligence (XAI) | References: 59 | Citations: 121

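One-Line Summary

This paper empirically studies how different types of explanation complexity in decision sets affect human interpretability across tasks and domains, and identifies cognitive chunks as the main driver of usability.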

ABSTRACT

Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions. However, exactly what kinds of explanation are truly human-interpretable remains poorly understood. This work advances our understanding of what makes explanations interpretable under three specific tasks that users may perform with machine learning systems: simulation of the response, verification of a suggested response, and determining whether the correctness of a suggested response changes under a change to the inputs. Through carefully controlled human-subject experiments, we identify regularizers that can be used to optimize for the interpretability of machine learning systems. Our results show that the type of complexity matters: cognitive chunks (newly defined concepts) affect performance more than variable repetitions, and these trends are consistent across tasks and domains. This suggests that there may exist some common design principles for explanation systems.
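The three tasks named in the abstract can be made concrete with a small sketch. The decision set below, its feature names, and the function names are all hypothetical and invented for illustration; they are not taken from the paper's materials. The sketch only shows what a participant is asked to work out in each task.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A rule pairs a condition on the input features with a recommended output.
Rule = Tuple[Callable[[Dict[str, bool]], bool], str]

# Hypothetical toy decision set (invented for illustration only).
RULES: List[Rule] = [
    (lambda x: x["coughing"] and x["fever"], "medication_A"),
    (lambda x: x["coughing"] and not x["fever"], "medication_B"),
]


def simulate(x: Dict[str, bool]) -> Optional[str]:
    """Simulation: determine the output the explanation yields for input x."""
    for condition, output in RULES:
        if condition(x):
            return output
    return None  # no rule applies


def verify(x: Dict[str, bool], suggested: str) -> bool:
    """Verification: check whether a suggested output agrees with the explanation."""
    return simulate(x) == suggested


def counterfactual(x: Dict[str, bool], suggested: str, feature: str) -> bool:
    """Counterfactual: does the correctness of the suggested output change
    when one input feature is flipped?"""
    x_changed = dict(x)
    x_changed[feature] = not x_changed[feature]
    return verify(x, suggested) != verify(x_changed, suggested)
```

For an input with `coughing` and `fever` both true, `simulate` returns `"medication_A"`, and flipping `fever` changes whether that suggestion is correct, while flipping a feature that no rule mentions does not.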

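Motivation and Objectives

Investigate what makes explanations human-interpretable in common machine learning tasks.
Quantify how properties of an explanation (size, cognitive chunks, repetition) affect usability.
Compare interpretability across two domains (recipe recommendation and clinical decision-making) and three tasks (simulation, verification, counterfactual).
Identify regularizers that improve the interpretability of decision-set explanations.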

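Proposed Method

Construct controlled, hand-crafted decision-set explanations that mimic machine learning output.
Manipulate three dimensions of explanation variation (size, cognitive chunks, repeated terms).
Evaluate across two domains (recipe and clinical) and three tasks (simulation, verification, counterfactual).
Measure performance with three metrics (accuracy, response time, subjective satisfaction).
Recruit 150 participants per experiment from MTurk, applying an eligibility criterion based on practice questions.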
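As a rough illustration of the manipulated dimensions, a decision-set explanation can be represented as a list of rules plus a table of defined concepts, and each dimension can be counted directly. This is a minimal sketch under stated assumptions: the class name, the field layout, and the exact counting rules (for example, counting a repetition as every occurrence of a condition term after its first) are choices made here for illustration, not the paper's definitions.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DecisionSetExplanation:
    """Hypothetical container for a hand-crafted decision-set explanation."""

    # Each rule is (condition terms, output terms).
    rules: List[Tuple[List[str], List[str]]]
    # Cognitive chunks: newly defined concepts mapped to the terms they stand for.
    chunks: Dict[str, List[str]] = field(default_factory=dict)


def complexity(expl: DecisionSetExplanation) -> Dict[str, int]:
    """Count the complexity dimensions under the assumptions stated above."""
    term_counts = Counter(
        term for conditions, _ in expl.rules for term in conditions
    )
    return {
        # Size: number of rule lines and total number of output terms.
        "lines": len(expl.rules),
        "output_terms": sum(len(outputs) for _, outputs in expl.rules),
        # Cognitive chunks: number of newly defined concepts.
        "cognitive_chunks": len(expl.chunks),
        # Repetition: occurrences of a condition term beyond its first use.
        "repeated_terms": sum(count - 1 for count in term_counts.values()),
    }
```

Holding two of these counts fixed while varying the third mirrors the controlled design described above, where one dimension is manipulated at a time.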

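Experimental Results

Research Questions

RQ1: Which properties of decision-set explanations most affect human usability across tasks and domains?
RQ2: Do cognitive chunks, line count/term length, and repetition affect response time, accuracy, and satisfaction differently?
RQ3: Do design principles for interpretable explanations hold across domains and tasks?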

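Key Findings

Greater explanation complexity generally increases response time, regardless of task or domain.
Cognitive chunks (newly defined concepts) affect performance more than mere repetition of terms.
Explicitly defined cognitive chunks tend to increase response time more than implicitly embedded chunks, suggesting a scanning/processing cost.
The effect of explanation size (number of lines and output terms) on response time varies by domain; these effects are more pronounced in the recipe domain.
Repeated terms showed a smaller and less consistent effect on response time than introducing new cognitive chunks.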
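One way to read these findings as a regularizer is a weighted complexity penalty added to a model's prediction loss, with cognitive chunks weighted more heavily than repeated terms. The sketch below is an assumption-laden illustration, not the paper's formulation: the weight values, the `strength` parameter, and the function names are all hypothetical, and only the ordering of the weights (chunks above repetitions) is motivated by the findings above.

```python
from typing import Mapping

# Hypothetical weights; only their ordering reflects the reported trend
# that cognitive chunks matter more than repeated terms.
DEFAULT_WEIGHTS: Mapping[str, float] = {
    "cognitive_chunks": 1.0,
    "lines": 0.5,
    "output_terms": 0.5,
    "repeated_terms": 0.1,
}


def interpretability_penalty(
    counts: Mapping[str, int],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Weighted sum of complexity counts; missing counts are treated as zero."""
    return sum(weights[name] * counts.get(name, 0) for name in weights)


def regularized_objective(
    prediction_loss: float,
    counts: Mapping[str, int],
    strength: float = 0.1,
) -> float:
    """Prediction loss plus a scaled interpretability penalty."""
    return prediction_loss + strength * interpretability_penalty(counts)
```

With these illustrative weights, adding one new chunk raises the penalty more than adding one repeated term, so an optimizer using this objective would prefer explanations that reuse existing terms over ones that introduce new concepts.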

This review was generated by AI and checked by a human editor.