QUICK REVIEW

[Paper Review] Simulating Social Media Using Large Language Models to Evaluate Alternative News Feed Algorithms

Petter Törnberg, Диляра Валеева | arXiv (Cornell University) | October 5, 2023

Computational and Text Analysis Methods | Citations: 17

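One-line summary

This paper combines Large Language Models with Agent-Based Modeling to simulate three social media platforms that use different news feed algorithms, and tests whether a bridging algorithm reduces toxicity while increasing cross-partisan engagement. Using ANES-based personas and GPT-3.5 prompts, it generates and evaluates agent interactions over one simulated day.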

ABSTRACT

Social media is often criticized for amplifying toxic discourse and discouraging constructive conversations. But designing social media platforms to promote better conversations is inherently challenging. This paper asks whether simulating social media through a combination of Large Language Models (LLM) and Agent-Based Modeling can help researchers study how different news feed algorithms shape the quality of online conversations. We create realistic personas using data from the American National Election Study to populate simulated social media platforms. Next, we prompt the agents to read and share news articles - and like or comment upon each other's messages - within three platforms that use different news feed algorithms. In the first platform, users see the most liked and commented posts from users whom they follow. In the second, they see posts from all users - even those outside their own network. The third platform employs a novel "bridging" algorithm that highlights posts that are liked by people with opposing political views. We find this bridging algorithm promotes more constructive, non-toxic, conversation across political divides than the other two models. Though further research is needed to evaluate these findings, we argue that LLMs hold considerable potential to improve simulation research on social media and many other complex social settings.

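Research motivation and goals

Investigate whether simulating social media with LLMs and ABM can reveal how news feed algorithms affect conversation quality.
Use ANES data to calibrate realistic agent personas that reflect US political demographics and media consumption.
Test three feed algorithms to evaluate their effects on cross-partisan engagement and toxicity.
Provide a preliminary evaluation framework for conversation simulation and future human validation.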

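Proposed method

Create 500 LLM-based agent personas with political beliefs, demographics, and non-political interests drawn from 2020 ANES data.
Assign each agent's posting frequency and news source consumption based on ANES, and generate 15 stories per agent from headlines and summaries dated July 1, 2020.
Run three platform simulations with different timelines: (1) only the most-liked posts from followed users, (2) high-engagement posts from all users, and (3) a bridging algorithm that prioritizes likes from opposing partisans.
Let agents write posts, like, and comment, with feed visibility evolving in response to engagement feedback to mimic a live timeline.
Measure toxicity with the Perspective API and a cross-partisan interaction index (E-I) for comments and likes.
Acknowledge the limits of LLM impersonation, the small sample, retrospective training data, and the need for human validation.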
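
The three timelines described above can be sketched as ranking functions. This is a minimal illustration, not the paper's implementation: the `Post` fields, the party labels, the feed length `k`, and especially the bridging score (a simple count of likes from users whose party differs from the author's) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    author_party: str                                   # hypothetical labels, e.g. "D" or "R"
    liker_parties: list = field(default_factory=list)   # party of each user who liked the post
    n_comments: int = 0

    @property
    def engagement(self) -> int:
        # Likes plus comments, used as a simple engagement measure.
        return len(self.liker_parties) + self.n_comments


def feed_following(posts, following, k=5):
    """Platform 1: highest-engagement posts from followed accounts only."""
    visible = [p for p in posts if p.author in following]
    return sorted(visible, key=lambda p: p.engagement, reverse=True)[:k]


def feed_global(posts, k=5):
    """Platform 2: highest-engagement posts from all users."""
    return sorted(posts, key=lambda p: p.engagement, reverse=True)[:k]


def bridging_score(post):
    """Assumed score: likes from users whose party differs from the author's."""
    return sum(1 for party in post.liker_parties if party != post.author_party)


def feed_bridging(posts, k=5):
    """Platform 3: rank by cross-party likes, breaking ties by engagement."""
    return sorted(
        posts, key=lambda p: (bridging_score(p), p.engagement), reverse=True
    )[:k]


# Invented example data.
posts = [
    Post("alice", "D", ["D", "D", "D"], n_comments=1),  # popular within own party
    Post("bob", "R", ["D", "D"]),                       # liked across the aisle
    Post("carol", "D", ["R"]),
]
top_following = feed_following(posts, following={"bob", "carol"}, k=1)
top_global = feed_global(posts, k=1)
top_bridging = feed_bridging(posts, k=2)
```

In this toy data the global feed surfaces alice's post, which is popular only within her own party, while the bridging feed puts bob's and carol's cross-party-liked posts first.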

Figure 1: Illustration of the model developed in this paper, which combines Large Language Models and Agent-Based Models to simulate the impact of bridging algorithms on social media discourse. Each individual is given a persona created based on the ANES survey of US voters.
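
The persona step mentioned in the caption can be pictured as filling a text template from one survey record. The sketch below is hypothetical throughout: the field names (`age`, `party`, `ideology`, `interests`) are not actual ANES variable names, and the wording is not the prompt used in the paper.

```python
def build_persona_prompt(record):
    """Turn one survey-style record into a second-person persona description."""
    interests = ", ".join(record["interests"])
    return (
        f"You are a {record['age']}-year-old {record['party']} "
        f"who describes their politics as {record['ideology']}. "
        f"Outside politics, you are interested in {interests}. "
        "Stay in character when you post, like, or comment."
    )


# Invented record for illustration only; not real survey data.
example = {
    "age": 42,
    "party": "Democrat",
    "ideology": "moderate",
    "interests": ["gardening", "baseball"],
}
prompt = build_persona_prompt(example)
```

The resulting string would be placed ahead of each instruction sent to the language model so that every agent answers from a consistent point of view.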

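Experimental results

Research questions

RQ1: Can an LLM-based ABM simulate social media well enough to compare alternative news feed algorithms?
RQ2: Does the bridging algorithm increase cross-partisan engagement without increasing toxicity, compared with conventional feed designs?
RQ3: How do engagement-driven timelines affect toxicity and cross-partisan conversation in the US political context?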

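Key findings

Platform 1 shows low toxicity and minimal cross-partisan interaction because of echo-chamber dynamics.
Platform 2 shows more cross-partisan interaction and higher toxicity, approaching Twitter-level toxicity over the study period.
Platform 3 (bridging) shows the most desirable outcome, with more cross-partisan interaction and the lowest toxicity of the three platforms.
The bridging timeline highlights cross-cutting topics and reduces insulting language relative to the other models.
Qualitative examples suggest that bridging can foster cross-partisan discourse on topics such as representation in the media.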
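
The cross-partisan interaction measure behind these findings is named as an E-I index but not defined in this review. Assuming the standard Krackhardt-Stern definition, (E - I) / (E + I), where E counts interactions between users of different parties and I counts interactions within the same party, it could be computed as in the sketch below. The pair representation and party labels are assumptions for the example.

```python
def ei_index(interactions):
    """E-I index for (source_party, target_party) pairs.

    Returns a value in [-1, 1]: -1 means every interaction stays within
    a party, +1 means every interaction crosses party lines.
    Returns None when there are no interactions.
    """
    pairs = list(interactions)
    external = sum(1 for src, dst in pairs if src != dst)
    internal = len(pairs) - external
    if not pairs:
        return None
    return (external - internal) / len(pairs)


# Invented example data: each pair is (party of liker, party of post author).
echo_chamber = [("D", "D"), ("D", "D"), ("R", "R"), ("D", "R")]
balanced = [("D", "R"), ("D", "D"), ("R", "R"), ("R", "D")]

echo_score = ei_index(echo_chamber)   # (1 - 3) / 4 = -0.5
balanced_score = ei_index(balanced)   # (2 - 2) / 4 = 0.0
```

Under this definition, a platform dominated by within-party likes and comments scores close to -1, and a higher score indicates more cross-partisan interaction.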

Figure 2: Excerpt of the generated timeline from Platform 1.


This review was generated by AI and reviewed by a human editor.