QUICK REVIEW

[Paper Review] Systemic Biases in Sign Language AI Research: A Deaf-Led Call to Reevaluate Research Agendas

Aashaka Desai, Maartje De Meulder|arXiv (Cornell University)|Mar 5, 2024

Hearing Impairment and Communication | Cited by 8

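One-Sentence Summary

This paper systematically reviews 101 sign language AI papers, identifies systemic biases in the field, and argues that Deaf leadership should redirect the research agenda.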

ABSTRACT

Growing research in sign language recognition, generation, and translation AI has been accompanied by calls for ethical development of such technologies. While these works are crucial to helping individual researchers do better, there is a notable lack of discussion of systemic biases or analysis of rhetoric that shape the research questions and methods in the field, especially as it remains dominated by hearing non-signing researchers. Therefore, we conduct a systematic review of 101 recent papers in sign language AI. Our analysis identifies significant biases in the current state of sign language AI research, including an overfocus on addressing perceived communication barriers, a lack of use of representative datasets, use of annotations lacking linguistic foundations, and development of methods that build on flawed models. We take the position that the field lacks meaningful input from Deaf stakeholders, and is instead driven by what decisions are the most convenient or perceived as important to hearing researchers. We end with a call to action: the field must make space for Deaf researchers to lead the conversation in sign language AI.

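Motivation and Objectives

Critically examine how research questions and methods in sign language AI are shaped by hearing, non-signing researchers.
Identify systemic biases in dataset, annotation, and modeling decisions that are misaligned with the needs of Deaf stakeholders.
Assess the extent of Deaf participation and leadership in current sign language AI research.
Propose concrete changes that strengthen Deaf leadership and inclusive research practice.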

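Proposed Method

Conduct a hybrid literature review of 101 sign language AI papers from 2021–2023 (both arXiv preprints and peer-reviewed papers) to identify biases.
Build a corpus with inclusion criteria that select models taking sign language as input (excluding human-factors and generation-focused work).
Develop a codebook that tracks datasets, inputs, labels, priors, and motivation statements, with each paper coded by two annotators and disagreements adjudicated by a third.
Reflect on the authors' positionality (all authors are DHH, that is, Deaf or hard of hearing) and lived experience to put the biases in context.
Critically synthesize the findings on dataset, annotation, and modeling decisions to assess how well they align with the interests of Deaf stakeholders.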
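
The double-coding step described above can be illustrated with a minimal sketch. This is not the authors' tooling: the `PaperCode` record, the `reconcile` and `agreement_rate` helpers, and all field values are hypothetical, chosen only to mirror the five codebook categories named in the summary (datasets, inputs, labels, priors, motivation statements). It assumes each annotator assigns one categorical code per category and that the third annotator's code is used wherever the first two disagree.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class PaperCode:
    """One annotator's codes for one paper (hypothetical field names)."""
    dataset: str
    input_type: str
    label_type: str
    prior: str
    motivation: str


def reconcile(a: PaperCode, b: PaperCode,
              adjudicator: Optional[PaperCode] = None) -> Dict[str, Optional[str]]:
    """Merge two annotators' codes field by field.

    Where the two agree, keep the shared code. Where they disagree,
    take the adjudicator's code, or None if no adjudicator is given yet.
    """
    final: Dict[str, Optional[str]] = {}
    for f in fields(PaperCode):
        code_a, code_b = getattr(a, f.name), getattr(b, f.name)
        if code_a == code_b:
            final[f.name] = code_a
        elif adjudicator is not None:
            final[f.name] = getattr(adjudicator, f.name)
        else:
            final[f.name] = None  # unresolved disagreement
    return final


def agreement_rate(a: PaperCode, b: PaperCode) -> float:
    """Fraction of codebook fields on which two annotators agree."""
    names = [f.name for f in fields(PaperCode)]
    return sum(getattr(a, n) == getattr(b, n) for n in names) / len(names)


# Illustrative placeholder values, not data from the reviewed papers.
first = PaperCode("dataset_A", "rgb_video", "gloss", "image_pretraining", "communication_barrier")
second = PaperCode("dataset_A", "pose", "gloss", "image_pretraining", "communication_barrier")
third = PaperCode("dataset_A", "rgb_video", "gloss", "image_pretraining", "communication_barrier")

resolved = reconcile(first, second, third)
```

Here the two annotators disagree only on `input_type`, so `resolved` takes that one field from the third annotator and keeps the shared codes everywhere else. A real review would likely also record free-text notes and report a chance-corrected agreement statistic rather than raw agreement.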

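Results

Research Questions

RQ1: What systemic biases exist in current sign language AI research?
RQ2: How do datasets, annotation schemes, and modeling choices align or fail to align with the needs of Deaf stakeholders and the realities of sign languages?
RQ3: To what extent is Deaf leadership involved in shaping research agendas and methods?
RQ4: Which concrete actions can move the field toward Deaf-led, ethically grounded sign language AI research?

Key Findings

64 papers are framed around solving communication barriers for Deaf people, often oversimplifying Deaf experiences and prioritizing translation over the status of sign languages as languages in their own right.
The papers use 43 different datasets, raising concerns that contributor information is often undisclosed and that interpreter-only datasets may misrepresent Deaf signers.
Annotation relies heavily on glosses, which can be partial and carry cultural and linguistic bias; end-to-end translation based on subtitles introduces further bias.
Many modeling decisions rely on pretraining with non-sign-language data (e.g., ImageNet, Kinetics) or on pose-based inputs, without quantitative analysis of bias across sign languages or spoken languages.
The field lacks intentional Deaf leadership and stakeholder involvement, which produces systemic biases and risks marginalizing Deaf communities.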


This review was generated by AI and checked by a human editor.