QUICK REVIEW

[Paper Review] On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial

Francesco Salvi, Manoel Horta Ribeiro|arXiv (Cornell University)|Mar 21, 2024

Misinformation and Its Impacts|Citations: 14

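ONE-LINE SUMMARY

This study compares AI-driven persuasion with human debate, with and without personalization. It concludes that GPT-4 with access to participants' personal information raises agreement more than human opponents do, whereas non-personalized GPT-4 yields only a small, statistically non-significant gain in persuasiveness.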

ABSTRACT

The development and popularization of large language models (LLMs) have raised concerns that they will be used to create tailor-made, convincing arguments to push false or misleading narratives online. Early work has found that language models can generate content perceived as at least on par and often more persuasive than human-written messages. However, there is still limited knowledge about LLMs' persuasive capabilities in direct conversations with human counterparts and how personalization can improve their performance. In this pre-registered study, we analyze the effect of AI-driven persuasion in a controlled, harmless setting. We create a web-based platform where participants engage in short, multiple-round debates with a live opponent. Each participant is randomly assigned to one of four treatment conditions, corresponding to a two-by-two factorial design: (1) Games are either played between two humans or between a human and an LLM; (2) Personalization might or might not be enabled, granting one of the two players access to basic sociodemographic information about their opponent. We found that participants who debated GPT-4 with access to their personal information had 81.7% (p < 0.01; N=820 unique participants) higher odds of increased agreement with their opponents compared to participants who debated humans. Without personalization, GPT-4 still outperforms humans, but the effect is lower and statistically non-significant (p=0.31). Overall, our results suggest that concerns around personalization are meaningful and have important implications for the governance of social media and the design of new online environments.

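Motivation and objectives

Evaluate the persuasiveness of LLMs in direct conversation with humans in structured debates.
Assess how personalization based on personal data affects LLM persuasiveness.
Compare AI-driven persuasion with human persuasion across multiple debate topics and settings.
Pre-register and implement a controlled, reproducible experimental framework for online debates.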

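Proposed method

A web-based, multi-round debate platform that randomly assigns each participant to one of four treatment conditions in a 2x2 design.
The treatments are Human-Human and Human-AI debates, each with a personalized variant in which one player receives basic sociodemographic information about the opponent.
The outcome is the change in agreement with the debate proposition before and after the debate, recoded to align with the opponent's side.
A partial proportional odds model is used to analyze ordinal post-debate agreement while allowing for non-proportional effects of pre-debate agreement.
Topics are selected through a structured, multi-stage annotation process that ensures the propositions are debatable and widely understandable.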
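
The random assignment and the outcome recoding described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the condition labels, the function names `assign_condition` and `align_with_opponent`, and the 1-to-5 agreement scale are all introduced here for the example.

```python
import random

# Assumed labels for the four cells of the 2x2 design:
# (opponent type, whether personalization is enabled).
CONDITIONS = [
    ("human", False),
    ("human", True),
    ("ai", False),
    ("ai", True),
]

# Assumed bounds of the ordinal agreement scale (hypothetical 1-to-5 scale).
SCALE_MIN, SCALE_MAX = 1, 5


def assign_condition(rng: random.Random) -> tuple:
    """Draw one of the four treatment conditions uniformly at random."""
    return rng.choice(CONDITIONS)


def align_with_opponent(agreement: int, opponent_side: str) -> int:
    """Recode agreement with the proposition as agreement with the opponent.

    If the opponent argues "pro", the score is kept as is. If the opponent
    argues "con", the scale is mirrored so that a higher value always means
    closer agreement with the opponent's side.
    """
    if not SCALE_MIN <= agreement <= SCALE_MAX:
        raise ValueError("agreement is outside the assumed scale")
    if opponent_side == "pro":
        return agreement
    if opponent_side == "con":
        return SCALE_MIN + SCALE_MAX - agreement
    raise ValueError("opponent_side must be 'pro' or 'con'")


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the draw is repeatable
    print(assign_condition(rng))
    print(align_with_opponent(4, "con"))
```

Mirroring the scale puts debates against "pro" and "con" opponents on a common footing, so a single ordinal model can be fitted to all of them.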

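Results

Research questions

RQ1: How persuasive is GPT-4 relative to humans in a conversational debate setting?
RQ2: Does personalization based on information about the opponent increase AI-driven persuasiveness compared with the non-personalized condition?
RQ3: How do AI persuasion and human persuasion compare with and without personalization?
RQ4: Do demographic factors affect susceptibility to AI or human persuasion in online debates?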

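Key findings

Personalized GPT-4 raises the odds of higher post-debate agreement with the opponent by 81.7% relative to debates against humans (p < 0.01).
Without personalization, GPT-4 still outperforms humans, but the difference is not statistically significant (p = 0.31).
Personalized Human-AI debates with GPT-4 show a positive, significant persuasive effect relative to non-personalized Human-AI debates (p = 0.04).
Personalized human opponents show a tendency toward radicalizing opinions, but the effect is not significant (p = 0.38).
Overall, AI microtargeting that uses personal data may significantly outperform both non-personalized AI and human microtargeting in online conversations.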
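
To make the "81.7% higher odds" figure concrete, the sketch below converts an odds ratio of 1.817 into a change in probability. The baseline probability of 0.30 is a hypothetical value chosen for illustration and is not reported in the source. The helper `apply_odds_ratio` is also an assumption of this example. It treats the outcome as a single binary event, which simplifies the ordinal model used in the study.

```python
# Odds ratio implied by "81.7% higher odds" in the reported finding.
REPORTED_ODDS_RATIO = 1.817


def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by scaling the baseline odds.

    odds = p / (1 - p); the scaled odds are mapped back to a probability
    with p = odds / (1 + odds).
    """
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    scaled_odds = baseline_odds * odds_ratio
    return scaled_odds / (1.0 + scaled_odds)


if __name__ == "__main__":
    hypothetical_baseline = 0.30  # illustrative only, not from the study
    shifted = apply_odds_ratio(hypothetical_baseline, REPORTED_ODDS_RATIO)
    print(f"baseline {hypothetical_baseline:.2f} -> {shifted:.3f}")
```

Because odds are not probabilities, an 81.7% increase in odds does not mean an 81.7% increase in the chance of agreement. The size of the probability change depends on the baseline.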

This review was generated by AI and checked by a human editor.