QUICK REVIEW

[Paper Review] Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings

Dorottya Demszky, Nikhil Garg|arXiv (Cornell University)|Apr 2, 2019

Social Media and Politics | References: 53 | Cited by: 46

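One-Sentence Summary

The paper proposes an NLP framework for analyzing four linguistic dimensions of polarization on social media (topic choice, framing, affect, and illocutionary force) and applies it to 4.4 million tweets about 21 mass shootings.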

ABSTRACT

We provide an NLP framework to uncover four linguistic dimensions of political polarization in social media: topic choice, framing, affect and illocutionary force. We quantify these aspects with existing lexical methods, and propose clustering of tweet embeddings as a means to identify salient topics for analysis across events; human evaluations show that our approach generates more cohesive topics than traditional LDA-based models. We apply our methods to study 4.4M tweets on 21 mass shootings. We provide evidence that the discussion of these events is highly polarized politically and that this polarization is primarily driven by partisan differences in framing rather than topic choice. We identify framing devices, such as grounding and the contrasting use of the terms "terrorist" and "crazy", that contribute to polarization. Results pertaining to topic choice, affect and illocutionary force suggest that Republicans focus more on the shooter and event-specific facts (news) while Democrats focus more on the victims and call for policy changes. Our work contributes to a deeper understanding of the way group divisions manifest in language and to computational methods for studying them.

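Research Motivation and Objectives

Understand how language expresses political polarization on social media
Develop a comprehensive analysis framework covering multiple linguistic dimensions (topic, framing, affect, illocutionary force)
Use Twitter data to quantify polarization around mass shootings and how it evolves within and across events
Identify how the shooter's race interacts with framing and topic choice in polarized discourse
Provide a dataset and methods that enable reproducible empirical research and further study of linguistic polarization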

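Proposed Methods

Define a vocabulary and token-based features for each event, and compute leave-out partisanship to measure linguistic polarization
Develop an embedding-based tweet clustering method that induces cohesive, event-independent topics, and compare it against MALLET and the Biterm Topic Model (BTM)
Apply the sentence embeddings of Arora et al. (2017) on top of GloVe token representations, and cluster tweets with k-means (cosine distance)
Use the leave-out estimator on topic-labeled data to compute partisan divergence within and between topics
Decompose polarization into between-topic and within-topic components to assess whether it is driven by topic choice or by framing within topics
Analyze the effect of the shooter's race on framing and topic preferences, using partisan log odds ratios and contextual grounding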
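
The leave-out partisanship measure named in the first step can be sketched in a few lines. The sketch below follows the general form of a leave-out estimator: each user's token frequencies are scored against party-level token frequencies computed without that user. It is an illustration, not the authors' implementation. The function name `leave_out_partisanship`, the input format (one token list per user), and the neutral value 0.5 for tokens seen nowhere else are assumptions made here. The sketch also assumes each party has at least two users and every user has at least one token.

```python
from collections import Counter


def leave_out_partisanship(dem_users, rep_users):
    """Estimate partisanship from per-user token lists (hypothetical format).

    Each argument is a list of token lists, one list per user.
    Returns a value in [0, 1]; 0.5 means the parties are indistinguishable.
    """

    def party_totals(users):
        # Aggregate token counts over all users of one party.
        totals = Counter()
        for tokens in users:
            totals.update(tokens)
        return totals

    dem_total = party_totals(dem_users)
    rep_total = party_totals(rep_users)

    def party_score(users, own_total, other_total):
        own_n = sum(own_total.values())
        other_n = sum(other_total.values())
        scores = []
        for tokens in users:
            counts = Counter(tokens)
            m = len(tokens)
            # Own-party token mass with this user's tokens left out.
            leave_out_n = own_n - m
            score = 0.0
            for token, c in counts.items():
                own_freq = (own_total[token] - c) / leave_out_n
                other_freq = other_total[token] / other_n
                denom = own_freq + other_freq
                # Assumption: a token seen nowhere else is treated as neutral.
                posterior = own_freq / denom if denom > 0 else 0.5
                score += (c / m) * posterior
            scores.append(score)
        return sum(scores) / len(scores)

    dem_score = party_score(dem_users, dem_total, rep_total)
    rep_score = party_score(rep_users, rep_total, dem_total)
    return 0.5 * (dem_score + rep_score)
```

Under these assumptions, two parties that use exactly the same tokens at the same rates score 0.5, and two parties with completely disjoint vocabularies score 1.0.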

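Experimental Results

Research Questions

RQ1 How polarized along partisan lines is the Twitter discussion of mass shootings?
RQ2 To what extent does polarization stem from topic choice rather than from framing within topics?
RQ3 How does the shooter's race affect framing, topic preferences, and affective expression?
RQ4 Which specific framing devices and illocutionary forces characterize partisan discourse in this domain?
RQ5 Do affect and modality (illocutionary force) contribute to partisan polarization around these events?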

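Key Findings

Tweets about mass shootings are highly polarized, with leave-out partisanship values ranging from roughly .517 to .547 across events (a value of .5 would indicate no polarization)
Within-topic polarization increases over time, while between-topic polarization remains stable
Republicans focus more on the shooter's identity and on news; Democrats focus more on victims and policy change
The use of framing devices such as grounding, and the contrasting use of terms such as "terrorist" and "crazy", depends on the shooter's race: Democrats are more likely to label white shooters as terrorists, while Republicans show a similar tendency for non-white shooters
Affect analysis shows that Democrats express more positive sentiment, sadness, and trust, while Republicans express more fear and disgust, especially when the shooter is a person of color
Modals (should, must, have to, need to) are used mainly to call for action, and Democrats are more likely to use them across topics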
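
The modal finding can be illustrated with a simple count. The sketch below computes, for each party, the share of tweets that contain at least one of the four modals listed above. The function name `modal_share`, the regular expression, and the example tweets are hypothetical, and the paper's own analysis may normalize or aggregate differently.

```python
import re

# The four modals named in the finding above. Matching is case-insensitive
# and restricted to whole words, so "mustard" does not count as "must".
MODAL_RE = re.compile(r"\b(?:should|must|have to|need to)\b", re.IGNORECASE)


def modal_share(tweets):
    """Return the fraction of tweets containing at least one modal."""
    if not tweets:
        return 0.0
    hits = sum(1 for text in tweets if MODAL_RE.search(text))
    return hits / len(tweets)


# Hypothetical example tweets, invented for illustration only.
dem_tweets = ["We must act now", "Congress should pass a law", "So sad today"]
rep_tweets = ["The shooter was arrested", "We need to pray for them"]

print(modal_share(dem_tweets))
print(modal_share(rep_tweets))
```

A per-tweet share is used here because it is easy to compare across groups of different sizes; a per-token rate or a log odds ratio would be reasonable alternatives.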


This review was generated by AI and reviewed by human editors.