QUICK REVIEW

[Paper Review] Should I Follow AI-based Advice? Measuring Appropriate Reliance in Human-AI Decision-Making

Max Schemmer, Patrick Hemmer|arXiv (Cornell University)|Apr 14, 2022

Ethics and Social Impacts of AI | Cited by 31

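One-line summary

Proposes a two-dimensional measurement concept for appropriate reliance (AR) on AI advice and illustrates it in a sequential-task study using deceptive hotel reviews and XAI explanations.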

ABSTRACT

Many important decisions in daily life are made with the help of advisors, e.g., decisions about medical treatments or financial investments. Whereas in the past, advice has often been received from human experts, friends, or family, advisors based on artificial intelligence (AI) have become more and more present nowadays. Typically, the advice generated by AI is judged by a human and either deemed reliable or rejected. However, recent work has shown that AI advice is not always beneficial, as humans have shown to be unable to ignore incorrect AI advice, essentially representing an over-reliance on AI. Therefore, the aspired goal should be to enable humans not to rely on AI advice blindly but rather to distinguish its quality and act upon it to make better decisions. Specifically, that means that humans should rely on the AI in the presence of correct advice and self-rely when confronted with incorrect advice, i.e., establish appropriate reliance (AR) on AI advice on a case-by-case basis. Current research lacks a metric for AR. This prevents a rigorous evaluation of factors impacting AR and hinders further development of human-AI decision-making. Therefore, based on the literature, we derive a measurement concept of AR. We propose to view AR as a two-dimensional construct that measures the ability to discriminate advice quality and behave accordingly. In this article, we derive the measurement concept, illustrate its application and outline potential future research.

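Motivation and objectives

Defines appropriate reliance (AR) on AI advice as the ability to discriminate between correct and incorrect AI advice and to act on that discrimination.
Proposes a two-dimensional measurement of AR using RAIR (relative positive AI reliance) and RSR (relative positive self-reliance).
Illustrates the measurement concept in a behavioral experiment on a hotel review classification task with AI advice and explanations (XAI).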

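Proposed method

Derives AR as a two-dimensional construct grounded in the automation and organizational psychology literature.
Defines RAIR and RSR as ratio-based metrics that capture discrimination and adaptive behavior.
Adopts a sequential decision-making setup: initial human decision, AI advice, then the human decision after advice.
Uses a deceptive hotel review dataset, with a support vector machine at 86% accuracy as the AI predictor.
Applies LIME-based explanations in the XAI treatment and examines their effect on AR.
Evaluates AR by comparing RAIR and RSR against a random baseline, and analyzes treatment effects.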
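
The two ratios can be sketched in a few lines of Python. This review does not spell out the formulas, so the sketch assumes the following reading of the metrics: RAIR is the share of cases in which the human was initially wrong, the AI advice was correct, and the human ended up correct; RSR is the share of cases in which the human was initially correct, the AI advice was incorrect, and the human stayed correct. The `Trial` record, the label strings, and the toy data are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Trial:
    """One sequential decision (hypothetical record layout)."""
    truth: str    # ground-truth label of the review
    initial: str  # human decision before seeing the AI advice
    advice: str   # label recommended by the AI
    final: str    # human decision after seeing the AI advice


def _ratio(hits: int, total: int) -> Optional[float]:
    # Return None when no trial qualifies, instead of dividing by zero.
    return hits / total if total else None


def rair(trials: Sequence[Trial]) -> Optional[float]:
    """Assumed RAIR: among trials with a wrong initial decision and correct
    AI advice, the fraction that end with a correct final decision."""
    eligible = [t for t in trials
                if t.initial != t.truth and t.advice == t.truth]
    return _ratio(sum(t.final == t.truth for t in eligible), len(eligible))


def rsr(trials: Sequence[Trial]) -> Optional[float]:
    """Assumed RSR: among trials with a correct initial decision and incorrect
    AI advice, the fraction that end with a correct final decision."""
    eligible = [t for t in trials
                if t.initial == t.truth and t.advice != t.truth]
    return _ratio(sum(t.final == t.truth for t in eligible), len(eligible))


# Made-up toy trials, not data from the study.
TOY = [
    Trial("deceptive", "genuine", "deceptive", "deceptive"),  # followed correct advice
    Trial("genuine", "deceptive", "genuine", "deceptive"),    # ignored correct advice
    Trial("genuine", "genuine", "deceptive", "genuine"),      # resisted wrong advice
    Trial("deceptive", "deceptive", "genuine", "genuine"),    # followed wrong advice
    Trial("genuine", "genuine", "genuine", "genuine"),        # counts toward neither ratio
    Trial("deceptive", "deceptive", "genuine", "deceptive"),  # resisted wrong advice
]

if __name__ == "__main__":
    print("RAIR:", rair(TOY))
    print("RSR:", rsr(TOY))
```

Under these assumed definitions, trials where the initial decision and the AI advice are both correct or both wrong fall into neither denominator, which is why the sequential setup with a recorded initial decision is needed to compute the ratios.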

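Experimental results

Research questions

RQ1: How can AR on AI advice be measured with a rigorous two-dimensional approach?
RQ2: How do explanations (XAI) affect people's ability to discriminate AI advice and adjust their decisions?
RQ3: Do the two dimensions of AR, AI reliance (positive AI-reliance) and self-reliance (positive self-reliance), respond differently in the presence of correct versus incorrect AI advice?
RQ4: Can the proposed AR framework identify under-reliance or over-reliance in human-AI decision-making?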

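Key findings

In the AI condition, participants showed a relative positive self-reliance (RSR) of 0.72 (±0.03) and a relative positive AI reliance (RAIR) of 0.30 (±0.03).
In the XAI condition, RAIR rose to 0.39 (±0.03), while RSR remained at 0.72 (±0.03).
The increase in RAIR with XAI was statistically significant (t = -1.95, p = 0.05).
This suggests that explanations may reduce under-reliance without inducing over-reliance, and that the effect of XAI on the AR metrics is nuanced.
The study demonstrates the usefulness of the two-dimensional AR measurement for analyzing how design choices (such as XAI) affect discrimination and subsequent reliance behavior.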


This review was generated by AI and checked by a human editor.