QUICK REVIEW

[Paper Review] The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation

Sheza Munir, Benjamin Mah|arXiv (Cornell University)|Feb 11, 2026

Ethics and Social Impacts of AI | Cited by 0

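One-Sentence Summary

This paper analyzes how data annotation practices manufacture a single "truth" and argues for pluralistic, epistemically just annotation infrastructures.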

ABSTRACT

In machine learning, "ground truth" refers to the assumed correct labels used to train and evaluate models. However, the foundational "ground truth" paradigm rests on a positivistic fallacy that treats human disagreement as technical noise rather than a vital sociotechnical signal. This systematic literature review analyzes research published between 2020 and 2025 across seven premier venues: ACL, AIES, CHI, CSCW, EAAMO, FAccT, and NeurIPS, investigating the mechanisms in data annotation practices that facilitate this "consensus trap". Our identification phase captured 30,897 records, which were refined via a tiered keyword filtration schema to a high-recall corpus of 3,042 records for manual screening, resulting in a final included corpus of 346 papers for qualitative synthesis. Our reflexive thematic analysis reveals that systemic failures in positional legibility, combined with the recent architectural shift toward human-as-verifier models, specifically the reliance on model-mediated annotations, introduce deep-seated anchoring bias and effectively remove human voices from the loop. We further demonstrate how geographic hegemony imposes Western norms as universal benchmarks, often enforced by the performative alignment of precarious data workers who prioritize requester compliance over honest subjectivity to avoid economic penalties. Critiquing the "noisy sensor" fallacy, where statistical models misdiagnose cultural pluralism as random error, we argue for reclaiming disagreement as a high-fidelity signal essential for building culturally competent models. To address these systemic tensions, we propose a roadmap for pluralistic annotation infrastructures that shift the objective from discovering a singular "right" answer to mapping the diversity of human experience.
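The screening funnel reported in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the three record counts stated above; the stage names and the `retention` helper are labels chosen here for illustration, not terms from the paper.

```python
# Record counts as reported in the abstract.
IDENTIFIED = 30_897  # records captured in the identification phase
SCREENED = 3_042     # records left after tiered keyword filtration
INCLUDED = 346       # papers in the final corpus for qualitative synthesis


def retention(kept: int, total: int) -> float:
    """Return the share of `total` that survives a stage, as a percentage."""
    return 100.0 * kept / total


# (stage label, records kept, records entering the stage)
stages = [
    ("keyword filtration", SCREENED, IDENTIFIED),
    ("manual screening", INCLUDED, SCREENED),
    ("overall", INCLUDED, IDENTIFIED),
]

for name, kept, total in stages:
    print(f"{name}: {kept}/{total} = {retention(kept, total):.1f}%")
# keyword filtration: 3042/30897 = 9.8%
# manual screening: 346/3042 = 11.4%
# overall: 346/30897 = 1.1%
```

Roughly one record in ten passed the keyword filter, and about one in nine of those survived manual screening, so the final corpus is close to 1.1% of the records initially identified.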

Research Motivation and Objectives

Assess infrastructural barriers to realizing justice in data annotation across key ML/HCI venues.
Identify pre- and post-annotation decisions that erase subjective experiences and enforce a Western-centric ground truth.
Map how annotator selection, labor, and aggregation practices shape downstream models and inequities.
Propose a roadmap for pluralistic annotation infrastructures that center epistemic justice.

Proposed Method

Conduct a structured literature review of 346 papers from 2020–2025 across ACL, AIES, CHI, CSCW, EAAMO, FAccT, and NeurIPS.
Apply PICOC criteria to select studies involving human or machine annotators and annotation processes.
Perform reflexive thematic synthesis to identify sociotechnical tensions and trends in annotation practices.
Develop a taxonomy of pre- and post-annotation decisions that erode or preserve subjective knowledge.
Synthesize a multi-stakeholder roadmap for shifting from extractive data labor to situated knowledge stewardship.

Experimental Results

Research Questions

RQ1: How is annotator suitability conceptualized and implemented, and to what extent do methods account for cultural expertise or lived experience?
RQ2: How are consensus and label aggregation handled, and do methods distinguish between noise and epistemic plurality?

Key Findings

Ground truth is a socio-technical artifact manufactured by architectural and governance choices in annotation pipelines.
Annotator positionality, labor dynamics, and Western-centric data practices systematically erase subjective voices and diverse perspectives.
The shift to human-as-verifier models creates anchoring bias and reduces human input to sporadic validation rather than substantive dissent.
Model-mediated annotation and synthetic data loops risk homogenizing perspectives and entrenching normative biases.
Geographic hegemony and infrastructure filters operationalize Western norms as universal ground truths, marginalizing Global South contexts.
Pluralistic aggregation, rationale-aware approaches, and deliberative annotation can preserve disagreement as a high-fidelity signal and support epistemic justice.
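To make the contrast between consensus aggregation and disagreement-preserving aggregation concrete, here is a minimal sketch. It is not a method taken from the paper, which is a literature review; the function names, the label strings, and the five example annotations are all hypothetical. It compares a majority vote, which collapses annotations into one label, with a per-item label distribution that keeps the minority view, and uses Shannon entropy as one simple measure of how much disagreement the vote discards.

```python
from collections import Counter
from math import log2


def majority_vote(labels):
    """Collapse all annotations into the single most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def label_distribution(labels):
    """Keep the share of annotators behind each label (a 'soft' label)."""
    counts = Counter(labels)
    n = len(labels)
    return {label: count / n for label, count in counts.items()}


def entropy(dist):
    """Shannon entropy in bits; 0.0 means every annotator agreed."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


# Hypothetical annotations of one item by five annotators.
annotations = [
    "offensive", "offensive", "offensive",
    "not_offensive", "not_offensive",
]

hard_label = majority_vote(annotations)
soft_label = label_distribution(annotations)

print(hard_label)                     # offensive
print(soft_label)                     # {'offensive': 0.6, 'not_offensive': 0.4}
print(round(entropy(soft_label), 3))  # 0.971
```

The majority vote reports only "offensive" and drops the 40% of annotators who disagreed. Entropy alone cannot tell whether that disagreement is random error or a systematic difference between annotator groups; telling the two apart would need annotator-level information such as background or stated rationale, which this sketch does not model.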

This review was generated by AI and checked by a human editor.