QUICK REVIEW

[Paper Summary] Unraveling Social Perceptions & Behaviors towards Migrants on Twitter

Aparup Khatua, Wolfgang Nejdl|arXiv (Cornell University)|Dec 4, 2021

Hate Speech and Cyberbullying Detection | Cited by 2

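One-Sentence Summary

This paper proposes a fine-grained NLP framework that distinguishes perceptions of migrants (sympathy/antipathy) from behaviors towards them (solidarity/animosity) on Twitter, using a BERT + CNN model that achieves an F1-score of 0.76. By separating the perceptual and behavioral dimensions, the study refines hate speech detection, offering a tool that goes beyond binary classification to identify negative sentiment towards migrants.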

ABSTRACT

We draw insights from the social psychology literature to identify two facets of Twitter deliberations about migrants, i.e., perceptions about migrants and behaviors towards migrants. Our theoretical anchoring helped us in identifying two prevailing perceptions (i.e., sympathy and antipathy) and two dominant behaviors (i.e., solidarity and animosity) of social media users towards migrants. We have employed unsupervised and supervised approaches to identify these perceptions and behaviors. In the domain of applied NLP, our study offers a nuanced understanding of migrant-related Twitter deliberations. Our proposed transformer-based model, i.e., BERT + CNN, has reported an F1-score of 0.76 and outperformed other models. Additionally, we argue that tweets conveying antipathy or animosity can be broadly considered hate speech towards migrants, but they are not the same. Thus, our approach has fine-tuned the binary hate speech detection task by highlighting the granular differences between perceptual and behavioral aspects of hate speeches.

Motivation and Objectives

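To address a gap in the existing literature by analyzing perceptions of and behaviors towards migrants on social media together, rather than focusing on either in isolation.
To conceptualize and identify, drawing on social psychology theory, distinct forms of social media expression towards migrants: sympathy, antipathy, solidarity, and animosity.
To improve hate speech detection by distinguishing the perceptual (attitudinal) and behavioral (action-oriented) components of negative sentiment towards migrants.
To develop and evaluate a robust NLP model capable of fine-grained classification of migrant-related Twitter content.
To give policymakers an AI-based framework for detecting harmful perceptions and supporting interventions that counter negative stereotypes and reduce xenophobia.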

Proposed Method

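Adopts a dual theoretical framework from social psychology to define four categories: sympathy, antipathy, solidarity, and animosity towards migrants.
Applies unsupervised and supervised NLP techniques to 800,000 preprocessed migrant-related tweets posted between May and September 2020.
Applies a zero-shot classification model for initial detection of perceptions and behaviors, without fine-tuning.
Trains and compares several deep learning architectures: Bi-LSTM, CNN, and transformer-based models (BERT and RoBERTa), along with fastText word embeddings.
Proposes a hybrid BERT + CNN architecture that combines BERT's contextual embeddings with the CNN's local feature extraction to improve classification performance.
Optimizes models using the weighted F1-score as the evaluation metric, with manually annotated labels for supervised training and testing.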
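
The hybrid idea in the list above (contextual token embeddings followed by convolutional local feature extraction) can be illustrated without any deep learning library. The sketch below is not the authors' implementation. It assumes a TextCNN-style head (one-dimensional convolution, ReLU, then max-over-time pooling), and the function names `conv1d_relu`, `max_over_time`, and `cnn_features`, the hand-written filters, and the toy vectors standing in for BERT output are all hypothetical.

```python
from typing import List

Vector = List[float]


def conv1d_relu(tokens: List[Vector], kernel: List[Vector], bias: float = 0.0) -> List[float]:
    """Slide one filter of width len(kernel) over the token sequence.

    Each output is ReLU(bias + sum of elementwise products inside the window),
    so a filter responds to a local pattern spanning a few adjacent tokens.
    """
    width = len(kernel)
    outputs = []
    for start in range(len(tokens) - width + 1):
        total = bias
        for offset in range(width):
            for t, k in zip(tokens[start + offset], kernel[offset]):
                total += t * k
        outputs.append(max(0.0, total))
    return outputs


def max_over_time(feature_map: List[float]) -> float:
    """Keep the strongest response of a filter, wherever it occurred."""
    return max(feature_map) if feature_map else 0.0


def cnn_features(tokens: List[Vector], kernels: List[List[Vector]]) -> List[float]:
    """One pooled feature per filter; this vector would feed a classifier layer."""
    return [max_over_time(conv1d_relu(tokens, kernel)) for kernel in kernels]


# Toy stand-in for contextual embeddings: 4 tokens with hidden size 2.
# (Hypothetical numbers; real BERT-base token vectors have 768 dimensions.)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Two hand-written width-2 filters; in a trained model these are learned.
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]

features = cnn_features(tokens, kernels)
print(features)
```

In a trained model the pooled features would typically pass through a dense layer with a softmax over the four classes, and several filter widths would be used side by side to capture patterns of different lengths.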

Experimental Results

Research Questions

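RQ1: How do social media users express perceptions of migrants (sympathy vs. antipathy) and behaviors towards them (solidarity vs. animosity) on Twitter?
RQ2: To what extent can NLP models distinguish the perceptual and behavioral dimensions of negative sentiment towards migrants in tweets?
RQ3: How does the proposed BERT + CNN model compare with other state-of-the-art models in classifying fine-grained perceptions of and behaviors towards migrants?
RQ4: Can tweets expressing antipathy and tweets expressing animosity be meaningfully distinguished, and do they represent different forms of hate speech?
RQ5: What are the policy implications of detecting and intervening on negative perceptions before they escalate into harmful behaviors?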
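
The four categories that the research questions revolve around form a small two-by-two scheme: facet (perception or behavior) crossed with valence (positive or negative). The sketch below encodes that scheme to show what a binary hate speech label discards. The names `Facet`, `Label`, `facet_of`, and `is_broadly_hate_speech` are hypothetical and chosen for illustration; only the category names and their grouping come from the paper.

```python
from enum import Enum


class Facet(Enum):
    PERCEPTION = "perception"  # attitudinal: what the author thinks of migrants
    BEHAVIOR = "behavior"      # action-oriented: how the author acts towards them


class Label(Enum):
    SYMPATHY = "sympathy"
    ANTIPATHY = "antipathy"
    SOLIDARITY = "solidarity"
    ANIMOSITY = "animosity"


# Grouping as described in the paper: two perceptions, two behaviors.
_FACET = {
    Label.SYMPATHY: Facet.PERCEPTION,
    Label.ANTIPATHY: Facet.PERCEPTION,
    Label.SOLIDARITY: Facet.BEHAVIOR,
    Label.ANIMOSITY: Facet.BEHAVIOR,
}

# The two negative categories, which the abstract says can be broadly
# considered hate speech while remaining distinct from each other.
_NEGATIVE = {Label.ANTIPATHY, Label.ANIMOSITY}


def facet_of(label: Label) -> Facet:
    """Return whether a label describes a perception or a behavior."""
    return _FACET[label]


def is_broadly_hate_speech(label: Label) -> bool:
    """Collapse the fine-grained label to the binary hate speech view."""
    return label in _NEGATIVE


for label in Label:
    print(label.value, facet_of(label).value, is_broadly_hate_speech(label))
```

Antipathy and animosity map to the same binary flag but to different facets, which is the information a binary hate speech detector cannot express.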

Key Findings

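The proposed BERT + CNN model performed best, with a weighted F1-score of 0.76, outperforming the other models (including Bi-LSTM, CNN, BERT, and RoBERTa).
The perceptual and behavioral dimensions of negative sentiment towards migrants are distinct: antipathy (a negative perception) and animosity (a negative behavior) are not equivalent, although both may be flagged as hate speech.
Supportive statements about migrants can still contain offensive language, indicating that sentiment orientation and linguistic toxicity do not always align.
Many tweets expressing antipathy or animosity contain offensive language, but not all offensive language targets migrants; context is critical.
The study finds that the negative perception that migrants reduce job opportunities for locals is widespread but often at odds with the facts, underscoring the need for corrective interventions.
The framework enables policymakers to detect harmful perceptions early and to influence social behavior through 'warning flags' (e.g., labeling contested statements) or 'boost mechanisms' (e.g., amplifying solidarity statements).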
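
The weighted F1-score reported in the findings above averages the per-class F1 values, weighting each class by its share of the true labels. The sketch below computes it from scratch under that standard definition. The function name `weighted_f1` and the six example labels are hypothetical and are not data from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1 = (2 * tp / denom) if denom else 0.0
        score += (n / total) * f1
    return score


# Illustrative labels only (not taken from the paper's dataset).
y_true = ["sympathy", "sympathy", "antipathy", "solidarity", "animosity", "animosity"]
y_pred = ["sympathy", "antipathy", "antipathy", "solidarity", "animosity", "solidarity"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support keeps a rare class from dominating the average, at the cost of letting strong results on frequent classes mask weak results on rare ones.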

This summary was generated by AI and reviewed by human editors.
