QUICK REVIEW

[Paper Review] Opinion Consensus Formation Among Networked Large Language Models

Iris Yazici, Mert Kayaalp|arXiv (Cornell University)|Jan 29, 2026

Opinion Dynamics and Social Influence | Cited by 0

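One-Sentence Summary

The paper studies how large language model (LLM) agents reach consensus over a range of graph topologies under the DeGroot framework. Consensus does emerge, but the final opinion deviates from the DeGroot prediction, while the convergence rate tracks the second-largest eigenvalue of the graph's combination matrix.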

ABSTRACT

Can classical consensus models predict the group behavior of large language models (LLMs)? We examine multi-round interactions among LLM agents through the DeGroot framework, where agents exchange text-based messages over diverse communication graphs. To track opinion evolution, we map each message to an opinion score via sentiment analysis. We find that agents typically reach consensus and the disagreement between the agents decays exponentially. However, the limiting opinion departs from DeGroot's network-centrality-weighted forecast. The consensus between LLM agents turns out to be largely insensitive to initial conditions and instead depends strongly on the discussion subject and inherent biases. Nevertheless, transient dynamics align with classical graph theory and the convergence rate of opinions is closely related to the second-largest eigenvalue of the graph's combination matrix. Together, these findings can be useful for LLM-driven social-network simulations and the design of resource-efficient multi-agent LLM applications.

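Motivation and Objectives

Assess whether DeGroot consensus can predict the group behavior of LLM agents in multi-round interactions.
Study how network topology and prompts affect the convergence of LLM opinions.
Explore the topic-dependent inherent biases in the final opinions of LLM agents.
Quantify the convergence rate and relate it to the eigenvalues of the interaction graph.
Provide an open dataset of large-scale LLM network experiments for follow-up research.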

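Proposed Method

Model LLM agents as DeGroot participants that exchange text messages over a directed, weighted graph.
Enforce the network weights and agent personas through system prompts; keep the combination matrix A fixed during the experiments.
Use a separate sentiment-analysis LLM to map each message to an opinion score normalized to [0,1].
Study topology effects with Erdős–Rényi graphs (connection probability p), a fully connected graph, and a ring topology.
Initialize agents with debatable stances (for, neutral, against) and vary the topic to study bias and convergence.
Release the dataset openly on Hugging Face, comprising 764 experiments and more than 1.2 million LLM responses.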
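
For reference, the classical DeGroot baseline that the LLM agents are compared against can be sketched numerically. This is a minimal illustration, not the authors' code: the 3-agent matrix `A`, the initial opinions `x0`, and the helper names are hypothetical, and the sketch assumes a row-stochastic convention for the combination matrix (the paper may use a different convention).

```python
import numpy as np

def degroot_step(A, x):
    """One DeGroot update: each agent replaces its opinion with a
    weighted average of its neighbors' opinions (rows of A sum to 1)."""
    return A @ x

def centrality_weights(A):
    """Left eigenvector of A for eigenvalue 1, normalized to sum to 1.
    Under DeGroot, the consensus value is this vector dotted with x0."""
    eigvals, eigvecs = np.linalg.eig(A.T)
    k = np.argmin(np.abs(eigvals - 1.0))  # pick the eigenvalue closest to 1
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

# Hypothetical row-stochastic combination matrix for a 3-agent line graph.
A = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
# Hypothetical initial opinion scores in [0, 1] (for, neutral, against).
x0 = np.array([1.0, 0.5, 0.0])

x = x0.copy()
disagreement = []  # standard deviation of opinions at each iteration
for _ in range(50):
    disagreement.append(x.std())
    x = degroot_step(A, x)

pi = centrality_weights(A)
predicted = pi @ x0  # DeGroot's centrality-weighted forecast

print("final opinions:", x)
print("DeGroot forecast:", predicted)
```

In the pure DeGroot model the simulated opinions converge to `predicted`. The paper's finding is that LLM agents also converge, but to a value that departs from this forecast.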

Figure 1: The average standard deviation of agents’ opinions with respect to the number of iterations across $50$ bitcoin-related experiments. The shaded region indicates the standard error of the mean (SEM) across $50$ simulations.

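Experimental Results

Research Questions

RQ1 Do multi-round LLM interactions over a network converge to consensus in line with DeGroot-style updates?
RQ2 How do graph topology and interaction weights affect the convergence rate and the final opinions of LLM agents?
RQ3 Does the final opinion match DeGroot's network-centrality-weighted prediction, or is it shaped by the topic and by biases from pretraining and alignment?
RQ4 What is the empirical relationship between the convergence rate and the second-largest eigenvalue of the combination matrix?
RQ5 How does enforcing weights through system prompts affect consensus formation?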

Key Findings

Experiment Type	Average STD ± SEM
Weighted Experiments	0.083 ± 0.004
Weightless Experiments	0.165 ± 0.008
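
As a rough guide to reading the "Average STD ± SEM" column, the following sketch computes the mean across runs of the per-run opinion spread, together with its standard error of the mean. The input values are made up for illustration, and the exact estimator used in the paper (for example population versus sample standard deviation across agents) is an assumption here.

```python
import numpy as np

def mean_std_with_sem(opinions):
    """opinions: array of shape (n_runs, n_agents) holding opinion scores.
    Returns (mean of per-run std across agents, SEM of that mean)."""
    per_run_std = opinions.std(axis=1)  # spread of opinions within each run
    mean = per_run_std.mean()
    # SEM = sample standard deviation across runs / sqrt(number of runs)
    sem = per_run_std.std(ddof=1) / np.sqrt(len(per_run_std))
    return mean, sem

# Hypothetical opinion scores: 3 runs, 2 agents each.
opinions = np.array([
    [0.00, 1.00],
    [0.50, 0.50],
    [0.25, 0.75],
])
mean, sem = mean_std_with_sem(opinions)
print(f"{mean:.3f} ± {sem:.3f}")
```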

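Agents typically converge to consensus, and their disagreement decays exponentially.
The final consensus often differs from the centrality-weighted average of initial opinions that DeGroot predicts.
The final opinion shows topic-dependent bias, shaped by pretraining and alignment.
The convergence rate is consistent with classical graph theory and correlates with the magnitude of the second-largest eigenvalue of the combination matrix.
Enforcing weights through system prompts raises the likelihood of reaching consensus (the table above reports an average STD of 0.083 ± 0.004 with weights versus 0.165 ± 0.008 without).
The measured disagreement half-life agrees with the theoretical relation $t_{1/2} = \ln(2) / (-\ln|\lambda_2|)$, where $\lambda_2$ is the second-largest eigenvalue of the combination matrix.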
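
The half-life relation in the last finding follows from disagreement shrinking roughly like $|\lambda_2|^t$: setting $|\lambda_2|^{t} = 1/2$ gives $t_{1/2} = \ln(2) / (-\ln|\lambda_2|)$. A minimal sketch of the calculation, using a hypothetical row-stochastic matrix `A` and a hypothetical helper name:

```python
import numpy as np

def disagreement_half_life(A):
    """Theoretical half-life of disagreement, ln(2) / (-ln|lambda_2|),
    where lambda_2 is the eigenvalue of A with second-largest magnitude."""
    magnitudes = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
    lam2 = magnitudes[1]
    return np.log(2) / (-np.log(lam2))

# Hypothetical 3-agent combination matrix; its eigenvalues are 1, 0.5 and 0.
A = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
half_life = disagreement_half_life(A)
print("half-life in iterations:", half_life)
```

A value of $|\lambda_2|$ closer to 1 (for example in sparse or ring-like graphs) yields a longer half-life, meaning more interaction rounds before the agents agree.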

Figure 2: Left : Initial opinion distributions, Right : Final opinion distributions. Each row belongs to a set of experiments with a different topic and initial opinion distribution. For example, the first row denotes a set of experiments where the initial opinion distribution is highly skewed towar

This review was generated by AI and checked by a human editor.