QUICK REVIEW

[Paper Review] Quantifying and Attributing Polarization to Annotator Groups

Dimitris Tsirmpas, John Pavlopoulos|arXiv (Cornell University)|Jan 16, 2026

Hate Speech and Cyberbullying Detection | Cited by 0

One-Sentence Summary

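This paper introduces the apunim metric, which quantifies annotator polarization and attributes it to annotator subgroups. It covers both single-label and multi-label tasks and comes with a statistical significance test and an open-source library.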

ABSTRACT

Current annotation agreement metrics are not well-suited for inter-group analysis, are sensitive to group size imbalances and restricted to single-annotation settings. These restrictions render them insufficient for many subjective tasks such as toxicity and hate-speech detection. For this reason, we introduce a quantifiable metric, paired with a statistical significance test, that attributes polarization to various annotator groups. Our metric enables direct comparisons between heavily imbalanced sociodemographic and ideological subgroups across different datasets and tasks, while also enabling analysis on multi-label settings. We apply this metric to three datasets on hate speech, and one on toxicity detection, discovering that: (1) Polarization is strongly and persistently attributed to annotator race, especially on the hate speech task. (2) Religious annotators do not fundamentally disagree with each other, but do with other annotators, a trend that is gradually diminished and then reversed with irreligious annotators. (3) Less educated annotators are more subjective, while educated ones tend to broadly agree more between themselves. Overall, our results reflect current findings around annotation patterns for various subgroups. Finally, we estimate the minimum number of annotators needed to obtain robust results, and provide an open-source Python library that implements our metric.

Motivation and Objectives

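Motivate the need to go beyond agreement metrics when analyzing cross-group annotation patterns in subjective tasks.
Define a formal framework for attributing polarization to annotator subgroups.
Propose the apunim metric and validate it with a statistical significance test.
Demonstrate the method on toxicity and hate speech datasets and provide reproducible tooling.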

Proposed Method

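Group annotators by personal characteristics, and denote the annotations of an item c as A(c), each paired with its annotator's group label.
Measure polarization for each item and each subgroup with the normalized Distance From Unimodality (nDFU).
Compute the a priori polarization P_apr^d as the mean nDFU over random stratified partitions that match the group sizes.
Define the observed polarization P_obs^d(θ) as the mean nDFU of A(c|θ) over the items of dataset d in the filtered set S_d.
Define apunim(θ) = (P_obs^d(θ) − P_apr^d) / (1 − P_apr^d) to quantify the strength of attribution.
Estimate p-values with a permutation-like resampling algorithm (Algorithm 1) that uses Student's t-test to compare the observed apunim against random partitions.
Reduce noise by filtering polarized items with a threshold α and requiring each item to have multiple annotator groups.
Provide an open-source Python library implementing the metric and its significance test, along with reproducible code.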
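
The steps above can be illustrated with a minimal pure-Python sketch. It is not the authors' library and does not reproduce its API. The function names (ndfu, group_polarization, apunim), the data layout (each item is a list of (label, group) pairs with integer labels), and the choice to build random partitions by shuffling group labels within each item are all assumptions made for illustration. The nDFU routine follows the usual definition as understood here: the largest violation of unimodality in the label histogram, divided by the peak count. The α filter and the significance test are left out.

```python
import random
from statistics import mean


def ndfu(labels, num_classes):
    """Normalized distance from unimodality for integer labels in 0..num_classes-1."""
    hist = [0] * num_classes
    for label in labels:
        hist[label] += 1
    peak = max(hist)
    if peak == 0:
        return 0.0
    mode = hist.index(peak)
    worst = 0
    # Moving right from the mode, counts should never increase.
    for i in range(mode, num_classes - 1):
        worst = max(worst, hist[i + 1] - hist[i])
    # Moving left from the mode, counts should never increase either.
    for i in range(mode, 0, -1):
        worst = max(worst, hist[i - 1] - hist[i])
    return worst / peak


def group_polarization(items, group, num_classes):
    """Mean nDFU of one group's annotations over the items where it appears."""
    scores = []
    for annotations in items:
        labels = [label for label, g in annotations if g == group]
        if labels:
            scores.append(ndfu(labels, num_classes))
    return mean(scores) if scores else 0.0


def apunim(items, group, num_classes, n_partitions=200, seed=0):
    """(P_obs - P_apr) / (1 - P_apr), with P_apr estimated from random partitions."""
    rng = random.Random(seed)
    p_obs = group_polarization(items, group, num_classes)
    random_scores = []
    for _ in range(n_partitions):
        shuffled_items = []
        for annotations in items:
            # Assumption: shuffling group labels within an item keeps group sizes fixed.
            groups = [g for _, g in annotations]
            rng.shuffle(groups)
            shuffled_items.append(
                [(label, g) for (label, _), g in zip(annotations, groups)]
            )
        random_scores.append(group_polarization(shuffled_items, group, num_classes))
    p_apr = mean(random_scores)
    if p_apr >= 1.0:
        return 0.0  # Degenerate case: random partitions are already fully polarized.
    return (p_obs - p_apr) / (1.0 - p_apr)


# Toy data: group "a" is split between the extremes, group "b" agrees on the middle.
toy_item = [(0, "a"), (0, "a"), (4, "a"), (4, "a"),
            (2, "b"), (2, "b"), (2, "b"), (2, "b")]
toy_items = [list(toy_item) for _ in range(5)]
score_a = apunim(toy_items, "a", num_classes=5)
score_b = apunim(toy_items, "b", num_classes=5)
```

In this toy data, group "a" is maximally polarized on every item, so its score reaches the upper bound of 1, while group "b" is less polarized than a random partition of the same size and receives a negative score.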

Experimental Results

Research Questions

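RQ1: Can polarization in an annotation task be attributed to specific annotator subgroups beyond random chance?
RQ2: How does apunim quantify the degree and direction of the polarization attributed to different sociodemographic or ideological groups?
RQ3: Compared with item-by-item analysis, does aggregating across many items stabilize polarization attribution?
RQ4: What is the minimum number of annotators or samples needed for robust polarization estimates?
RQ5: How do ordinal sociodemographic attributes affect polarization patterns across datasets?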

Key Findings

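On the hate speech task, annotator race/ethnicity has significant explanatory power for polarization, and the effect appears across multiple datasets.
Religious annotators generally disagree with other annotators but much less among themselves; this trend gradually diminishes and then reverses for irreligious annotators, and it varies across groups and datasets.
Less educated annotators show greater subjectivity, whereas more educated annotators agree more readily with one another.
For some dimensions, the contributions across subgroups sum to zero, indicating no systematic effect; other dimensions affect overall dataset polarization asymmetrically.
On DICES-350, DICES-990, and Sap, polarization is attributed most strongly to race/ethnicity; Kumar shows systematic asymmetries, such as a positive contribution associated with targeted individuals and a negative contribution associated with transgender annotators.
An open-source library and reproduction code are released so that apunim can be applied in practice and the results reproduced.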

This review was generated by AI and checked by a human editor.