QUICK REVIEW

[Paper Review] Can Large Language Model Agents Simulate Human Trust Behavior?

Chengxing Xie, Canyu Chen | arXiv (Cornell University) | Feb 7, 2024

Topic Modeling | Cited by 6

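One-Sentence Summary

This paper shows that large language model agents exhibit trust behavior in Trust Games and can align closely with human trust behavior, most notably GPT-4 agents, which suggests they could be used to simulate human trust while also revealing biases and susceptibility to manipulation.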

ABSTRACT

Large Language Model (LLM) agents have been increasingly adopted as simulation tools to model humans in social science and role-playing applications. However, one fundamental question remains: can LLM agents really simulate human behavior? In this paper, we focus on one critical and elemental behavior in human interactions, trust, and investigate whether LLM agents can simulate human trust behavior. We first find that LLM agents generally exhibit trust behavior, referred to as agent trust, under the framework of Trust Games, which are widely recognized in behavioral economics. Then, we discover that GPT-4 agents manifest high behavioral alignment with humans in terms of trust behavior, indicating the feasibility of simulating human trust behavior with LLM agents. In addition, we probe the biases of agent trust and differences in agent trust towards other LLM agents and humans. We also explore the intrinsic properties of agent trust under conditions including external manipulations and advanced reasoning strategies. Our study provides new insights into the behaviors of LLM agents and the fundamental analogy between LLMs and humans beyond value alignment. We further illustrate broader implications of our discoveries for applications where trust is paramount.

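Research Motivation and Objectives

Investigate, within a behavioral economics framework, whether LLM agents exhibit trust behavior in Trust Games.
Assess the behavioral alignment between agent (LLM) trust and human trust, both across key factors and in how trust changes over time.
Identify intrinsic properties of agent trust, including demographic biases and differences in trust toward humans versus agents.
Explore how reasoning strategies and external prompts affect agent trust, and what this means for human-agent collaboration.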

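Proposed Method

Model LLM agents as Belief-Desire-Intention (BDI) agents to expose the reasoning behind their decisions in Trust Games.
Use a diverse set of prompts and 53 generated personas to simulate human-like variability among agents.
Evaluate Trust Game outcomes by the initial amount sent (trust) and by the consistency of that amount with the BDI reasoning output (rationality).
Compare agent trust with human benchmarks from behavioral economics to define behavioral alignment.
Manipulate the scenario (demographics, trustee identity, explicit instructions, and zero-shot Chain-of-Thought) to study the intrinsic properties of agent trust.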
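The Trust Game mechanics behind the "initial amount sent" measure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the endowment of 10, the multiplier of 3, and every name in the code (`play_trust_game`, `TrustGameOutcome`, `shows_trust`) are assumptions chosen to match the common textbook form of the game, since this summary does not state the parameters.

```python
from dataclasses import dataclass

# Assumed parameters of a textbook Trust Game; this summary does not state them.
ENDOWMENT = 10.0   # money the trustor starts with (assumption)
MULTIPLIER = 3.0   # factor applied to the amount sent (assumption)


@dataclass
class TrustGameOutcome:
    sent: float            # amount the trustor sends; read as the trust measure
    returned: float        # amount the trustee sends back
    trustor_payoff: float
    trustee_payoff: float


def play_trust_game(sent: float, return_fraction: float) -> TrustGameOutcome:
    """Compute payoffs for one round from the trustor's and trustee's choices."""
    if not 0.0 <= sent <= ENDOWMENT:
        raise ValueError("sent must be between 0 and the endowment")
    if not 0.0 <= return_fraction <= 1.0:
        raise ValueError("return_fraction must be between 0 and 1")
    received = sent * MULTIPLIER            # trustee receives the multiplied amount
    returned = received * return_fraction   # trustee decides how much to give back
    return TrustGameOutcome(
        sent=sent,
        returned=returned,
        trustor_payoff=ENDOWMENT - sent + returned,
        trustee_payoff=received - returned,
    )


def shows_trust(outcome: TrustGameOutcome) -> bool:
    """A positive initial transfer is interpreted as trust behavior."""
    return outcome.sent > 0
```

In this sketch, a trustor who sends 5 to a trustee who returns half of what they receive ends with 12.5, while the trustee keeps 7.5. Sending nothing guarantees the trustor the full endowment, which is why a positive transfer is read as trust.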

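Experimental Results

Research Questions

RQ1: Do LLM agents exhibit trust behavior in Trust Games (measured by positive transfers with consistent BDI reasoning)?
RQ2: How well does agent trust align with human trust in terms of reciprocity anticipation, risk perception, and altruistic preference?
RQ3: What intrinsic properties of agent trust emerge under demographic variation, different trustee identities (agent vs. human), explicit manipulation, and different reasoning strategies?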

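Key Findings

LLM agents generally exhibit trust behavior: they send positive amounts in Trust Games, and their BDI outputs are consistent with their final decisions.
Agent trust shows high behavioral alignment with human trust in reciprocity anticipation, risk perception, and altruistic preference, especially for advanced models such as GPT-4.
Trust dynamics in repeated interactions show that GPT-4 reproduces human-like patterns more consistently than GPT-3.5, suggesting that cognitive capacity affects alignment.
Agent trust exhibits demographic biases (for example, some models send more to female trustees) and a tendency to trust humans more than agents.
Under explicit manipulation, agent trust is generally easier to undermine than to enhance; zero-shot Chain-of-Thought reasoning can influence trust decisions, although the effect varies by model.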
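Comparisons like those in the findings above (how often agents send a positive amount, and whether they send more to one kind of trustee than another) reduce to simple summaries of the amounts sent. The sketch below is illustrative only: the function names are hypothetical, and the sample numbers are made up for the example and are not results from the paper.

```python
from statistics import mean


def summarize_transfers(amounts):
    """Return the share of positive transfers and the mean amount sent."""
    if not amounts:
        raise ValueError("amounts must not be empty")
    positive_share = sum(1 for a in amounts if a > 0) / len(amounts)
    return {"positive_share": positive_share, "mean_sent": mean(amounts)}


def mean_gap(condition_a, condition_b):
    """Difference in mean amount sent between two trustee conditions."""
    return mean(condition_a) - mean(condition_b)


# Made-up amounts for illustration only; NOT data from the paper.
sent_to_human = [5, 6, 4, 5]
sent_to_agent = [4, 5, 3, 4]

print(summarize_transfers(sent_to_human))
print(mean_gap(sent_to_human, sent_to_agent))
```

A positive gap in this sketch would correspond to sending more under the first condition. Whether such a gap is meaningful would require a proper statistical test, which the sketch leaves out.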

This review was generated by AI and reviewed by a human editor.