[Paper Explained] Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a Large Language Model Based on Group-Level Demographic Information
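The paper proposes random silicon sampling, a method that simulates the opinions of a human sub-population by giving a large language model group-level demographic information, and evaluates how well it reproduces actual U.S. public opinion distributions across demographic groups and topics.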
Large language models exhibit societal biases associated with demographic information, including race, gender, and others. Endowing such language models with personalities based on demographic data can enable generating opinions that align with those of humans. Building on this idea, we propose "random silicon sampling," a method to emulate the opinions of the human population sub-group. Our study analyzed 1) a language model that generates the survey responses that correspond with a human group based solely on its demographic distribution and 2) the applicability of our methodology across various demographic subgroups and thematic questions. Through random silicon sampling and using only group-level demographic information, we discovered that language models can generate response distributions that are remarkably similar to the actual U.S. public opinion polls. Moreover, we found that the replicability of language models varies depending on the demographic group and topic of the question, and this can be attributed to inherent societal biases in the models. Our findings demonstrate the feasibility of mirroring a group's opinion using only demographic distribution and elucidate the effect of social biases in language models on such simulations.
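Research Motivation and Objectives
- Start from the observation that LLMs exhibit societal biases tied to demographic information such as race and gender, and that personas built on such data can elicit human-like opinions.
- Propose random silicon sampling to generate survey responses that match a population's demographic distribution.
- Evaluate how well the method applies across multiple demographic subgroups and question topics.
- Examine how the societal biases inherent in LLMs affect the replicability of simulated opinions.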
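Proposed Method
- Use an LLM to generate survey responses that correspond to a human group, based solely on that group's demographic distribution.
- Endow the model with personas derived from demographic data to elicit opinion distributions.
- Measure how closely the generated response distributions match actual U.S. public opinion polls.
- Analyze how the method's replicability varies across demographic groups and question topics.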
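The sampling loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the demographic attributes and their proportions are made-up numbers, attributes are sampled independently from their marginals for simplicity, the prompt wording is hypothetical, and `fake_llm` is a rule-based stand-in for a real model call.

```python
import random

# Hypothetical group-level marginals (illustrative numbers, not from the paper).
GROUP_DEMOGRAPHICS = {
    "age": {"18-29": 0.20, "30-44": 0.25, "45-64": 0.35, "65+": 0.20},
    "gender": {"male": 0.48, "female": 0.52},
    "party": {"Democrat": 0.35, "Republican": 0.30, "Independent": 0.35},
}


def sample_profile(marginals, rng):
    """Draw one synthetic respondent, sampling each attribute independently.

    Independence between attributes is a simplifying assumption of this sketch.
    """
    return {
        attr: rng.choices(list(dist.keys()), weights=list(dist.values()), k=1)[0]
        for attr, dist in marginals.items()
    }


def build_prompt(profile, question, options):
    """Turn a sampled profile into a persona-conditioned survey prompt (hypothetical wording)."""
    persona = ", ".join(f"{attr}: {value}" for attr, value in profile.items())
    choices = " / ".join(options)
    return (
        f"You are a survey respondent ({persona}).\n"
        f"Question: {question}\n"
        f"Answer with one of: {choices}\n"
        "Answer:"
    )


def silicon_sample(marginals, question, options, ask_llm, n=100, seed=0):
    """Sample n synthetic respondents and return the distribution of their answers."""
    rng = random.Random(seed)
    counts = {opt: 0 for opt in options}
    for _ in range(n):
        profile = sample_profile(marginals, rng)
        answer = ask_llm(build_prompt(profile, question, options))
        if answer in counts:  # ignore answers outside the option set
            counts[answer] += 1
    total = sum(counts.values())
    return {opt: (counts[opt] / total if total else 0.0) for opt in options}


def fake_llm(prompt):
    """Stand-in for a real model call; answers by a fixed rule on the persona text."""
    return "Yes" if "party: Democrat" in prompt else "No"
```

Calling `silicon_sample(GROUP_DEMOGRAPHICS, "Do you support policy X?", ["Yes", "No"], fake_llm)` returns a dictionary of answer proportions. In a real setup, `ask_llm` would wrap an actual model and the marginals would come from the survey's group-level demographic data.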
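Experimental Results
Research Questions
- RQ1: Given only group-level demographic information, can an LLM generate survey response distributions consistent with those of a human sub-population?
- RQ2: Does the replicability of simulated opinions differ across demographic groups and topics because of model bias?
- RQ3: To what extent can random silicon sampling reproduce real public opinion distributions across population subgroups?
- RQ4: How do societal biases in LLMs affect the simulation of sub-population opinions?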
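Key Findings
- When guided by group-level demographic information, the language model generates response distributions that closely resemble actual U.S. public opinion polls.
- The replicability of simulated opinions varies with the demographic group and the topic of the question.
- Biases inherent in the language model affect the fidelity and consistency of sub-population opinion simulation.
- The method shows that mirroring a group's opinions from its demographic distribution alone is feasible, while also exposing the effects of model bias.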
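To make "closely resemble" concrete, one could compare a simulated response distribution with the polled one using a divergence measure. The sketch below uses total variation distance and KL divergence as common choices. This summary does not state which metric the paper uses, so treat the choice as an assumption; the two example distributions are invented for illustration.

```python
import math


def total_variation(p, q):
    """Total variation distance between two discrete distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    total = 0.0
    for k in set(p) | set(q):
        pk = p.get(k, 0.0)
        if pk > 0.0:  # terms with p(k) = 0 contribute nothing
            total += pk * math.log(pk / max(q.get(k, 0.0), eps))
    return total


# Invented example distributions (not results from the paper).
human_poll = {"Yes": 0.6, "No": 0.4}
simulated = {"Yes": 0.5, "No": 0.5}

tv = total_variation(human_poll, simulated)
kl = kl_divergence(human_poll, simulated)
```

Both measures are zero when the distributions coincide and grow as they diverge, so computing them per demographic group and per topic gives a simple way to see where the simulation tracks the poll and where it does not.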
This summary was generated by AI and reviewed by human editors.