QUICK REVIEW

[Paper Review] Large Language Models Cannot Explain Themselves

Advait Sarkar|arXiv (Cornell University)|May 7, 2024

Topic Modeling | Citations: 6

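One-line summary

This paper argues that language models cannot provide mechanistic explanations of their outputs, introduces the term "exoplanations" to distinguish such text from true explanations, and proposes design guardrails and co-auditing strategies that promote critical thinking.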

ABSTRACT

Large language models can be prompted to produce text. They can also be prompted to produce "explanations" of their output. But these are not really explanations, because they do not accurately reflect the mechanical process underlying the prediction. The illusion that they reflect the reasoning process can result in significant harms. These "explanations" can be valuable, but for promoting critical thinking rather than for understanding the model. I propose a recontextualisation of these "explanations", using the term "exoplanations" to draw attention to their exogenous nature. I discuss some implications for design and technology, such as the inclusion of appropriate guardrails and responses when models are prompted to generate explanations.

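Motivation and objectives

Motivate the distinction between mechanistic explanations and exoplanations of LLM outputs.
Highlight the societal harms that exoplanations can cause and the need to recontextualise them within AI explainability.
Propose design implications, including guardrails and co-audit tools, for improving decision support and critical thinking.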

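Proposed approach

Define mechanistic explanations and exoplanations, and explain why E-type outputs (model-generated "explanations") cannot reflect the underlying mechanism.
Argue that exoplanations are produced by the same prediction process as the original output O and therefore have no grounding in the model's internals.
Discuss the societal and safety harms of exoplanations, including the potential for decisions based on misleading information.
Propose practical design interventions, such as disclaimers, guardrails, and co-audit approaches, to mitigate these risks.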
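
The guardrail idea above can be illustrated with a minimal sketch. It is not the paper's implementation: `toy_generate`, `EXPLANATION_CUES`, `DISCLAIMER`, and `respond` are hypothetical names invented for this example, and the keyword check is a deliberately crude stand-in for detecting explanation requests. The point it tries to show is that the output O and the "explanation" E come from the same text-in, text-out predictor, so the only difference the wrapper can add is a label.

```python
from dataclasses import dataclass

# Hypothetical cues for spotting a request for an explanation. Keyword
# matching is an assumption made for brevity, not a recommended detector.
EXPLANATION_CUES = ("why did you", "explain your", "how did you arrive", "justify your")

# Hypothetical disclaimer text; the wording is illustrative only.
DISCLAIMER = (
    "Note: the text below is an exoplanation. It is generated by the same "
    "prediction process as the original answer and is not a readout of the "
    "model's internal mechanism."
)


def toy_generate(prompt: str) -> str:
    """Stand-in for a language model: a single text-in, text-out predictor."""
    return f"<generated continuation of: {prompt!r}>"


def is_explanation_request(prompt: str) -> bool:
    """Return True if the prompt contains any of the hypothetical cues."""
    lowered = prompt.lower()
    return any(cue in lowered for cue in EXPLANATION_CUES)


@dataclass
class Response:
    text: str
    is_exoplanation: bool


def respond(prompt: str, generate=toy_generate) -> Response:
    """Send every prompt through the same generator and label E-type outputs."""
    text = generate(prompt)  # O and E both come from this one call path
    if is_explanation_request(prompt):
        return Response(text=f"{DISCLAIMER}\n\n{text}", is_exoplanation=True)
    return Response(text=text, is_exoplanation=False)


if __name__ == "__main__":
    o = respond("Classify this review as positive or negative: 'Great phone.'")
    e = respond("Why did you classify the review that way?")
    print(o.text)
    print(e.text)
```

In this sketch the disclaimer is prepended rather than appended, on the assumption that a reader should see the caveat before the exoplanation itself; a real interface could equally use a visual label or a co-audit prompt.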

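Results

Research questions

RQ1: In the context of language models, what is the difference between mechanistic explanations and exoplanations?
RQ2: Why can exoplanations mislead users, and what societal risks do they pose?
RQ3: Which design strategies mitigate the harms of exoplanations while preserving their value as support for critical thinking?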

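Key findings

Exoplanations are not a grounded reflection of the model's generation process and can misrepresent the true reasons behind a prediction.
Exoplanations can create false confidence, weaken critical thinking, and erode trust in AI systems.
Guardrails, disclaimers, and co-audit tools help users evaluate outputs without over-relying on exoplanations.
When properly contextualised, exoplanations can still be useful for prompting user reflection and supporting critical thinking.
The paper argues for a socially constructed view of explainability that prioritises decision support over mechanistic fidelity.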

This review was generated by AI and checked by a human editor.