QUICK REVIEW

[Paper Review] Auditing Disability Representation in Vision-Language Models

Srikant Panda, Sourabh Singh Yadav | arXiv (Cornell University) | Jan 24, 2026

Multimodal Machine Learning Applications | Cited by 0

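One-Sentence Summary

This paper proposes a controlled framework that pairs neutral prompts with disability-contextualized prompts to audit 15 vision-language models across 9 disability categories; it finds that disability context degrades interpretive fidelity, but that prompting and preference fine-tuning can mitigate the effect.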

ABSTRACT

Vision-language models (VLMs) are increasingly deployed in socially sensitive applications, yet their behavior with respect to disability remains underexplored. We study disability aware descriptions for person centric images, where models often transition from evidence grounded factual description to interpretation shift including introduction of unsupported inferences beyond observable visual evidence. To systematically analyze this phenomenon, we introduce a benchmark based on paired Neutral Prompts (NP) and Disability-Contextualised Prompts (DP) and evaluate 15 state-of-the-art open- and closed-source VLMs under a zero-shot setting across 9 disability categories. Our evaluation framework treats interpretive fidelity as core objective and combines standard text-based metrics capturing affective degradation through shifts in sentiment, social regard and response length with an LLM-as-judge protocol, validated by annotators with lived experience of disability. We find that introducing disability context consistently degrades interpretive fidelity, inducing interpretation shifts characterised by speculative inference, narrative elaboration, affective degradation and deficit oriented framing. These effects are further amplified along race and gender dimension. Finally, we demonstrate targeted prompting and preference fine-tuning effectively improves interpretive fidelity and reduces substantially interpretation shifts.

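Research Motivation and Objectives

Provide a normative evaluation of disability representation in VLMs, grounded in disability-rights and journalistic standards.
Develop an expert-validated, zero-shot paired-prompt framework for measuring interpretive fidelity.
Benchmark 15 open- and closed-source VLMs across 9 disability categories.
Offer practical mitigation strategies that reduce interpretation shift without sacrificing model utility.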

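Proposed Method

Define disability bias as the difference between the NP and DP responses to the same image.
Use the PAIRS synthetic image dataset to build controlled NP/DP prompt pairs across 9 disability categories.
Use an LLM as judge to assess higher-order biases such as speculative inference, stereotyping, and framing.
Quantify linguistic degradation with metrics such as VADER sentiment, social regard, and verbosity.
Validate the findings with statistical tests (ANOVA, p<0.05) and annotator-LLM agreement.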
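
The paired comparison and the significance check listed above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the whitespace token count used as the verbosity measure, and the toy numbers are all assumptions made for illustration. A real pipeline would add a sentiment scorer such as VADER and a social-regard classifier alongside verbosity.

```python
from statistics import mean
from typing import Sequence


def verbosity(text: str) -> int:
    """Response length in whitespace-separated tokens (an assumed proxy)."""
    return len(text.split())


def relative_shift(np_score: float, dp_score: float) -> float:
    """Percent change from the Neutral Prompt score to the Disability Prompt score."""
    if np_score == 0:
        raise ValueError("NP score is zero; relative shift is undefined")
    return 100.0 * (dp_score - np_score) / abs(np_score)


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic of a one-way ANOVA, e.g. over per-category shift scores.

    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical NP and DP responses to the same image.
np_response = "A person sits at a desk with a laptop."
dp_response = ("A person who may struggle daily sits bravely at a desk, "
               "determined to overcome challenges with a laptop.")

shift = relative_shift(verbosity(np_response), verbosity(dp_response))
print(f"Verbosity shift: {shift:.1f}%")
```

The F statistic alone does not give the p-value; in practice a library routine such as `scipy.stats.f_oneway` returns both, and the result would be compared against the p<0.05 threshold mentioned above.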

Figure 1: Evaluation pipeline for auditing interpretive disability bias in Vision-Language models (VLMs) using paired Neutral and Disability Contextualized prompts and their corresponding responses.

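Experimental Results

Research Questions

RQ1: Do disability-contextualized prompts cause VLM outputs to shift interpretively relative to neutral prompts?
RQ2: How does the shift vary across disability categories, and does it intersect with race and gender?
RQ3: What are the main forms of bias (interpretation, stereotyping, framing) observed under disability context?
RQ4: Can prompting strategies and preference-based fine-tuning mitigate these biases without sacrificing output quality?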

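Key Findings

Disability context consistently degrades interpretive fidelity, accompanied by speculative inference, narrative inflation, and heightened emotional dramatization.
Verbosity and the cognitive dimension show the largest degradation in most models, in some cases exceeding 70–90%.
Bias effects are amplified along the race and gender axes: interpretation shift is stronger for white men, while descriptions of Black women are more constrained.
Targeted prompting substantially reduces bias in most models, especially in interpretation and framing.
Direct Preference Optimization (DPO) yields larger and more stable bias reductions than prompting alone, significantly improving interpretive fidelity.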
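
For readers unfamiliar with DPO, the sketch below computes the standard DPO objective for a single preference pair. This summary does not describe the paper's training data or hyperparameters, so the pairing (an evidence-grounded description as "chosen", a speculative one as "rejected"), the value of `beta`, and the log-probabilities are all assumptions for illustration.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under a frozen reference model.
    """
    # How much more the policy favours each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities: the policy already prefers the
# evidence-grounded description slightly more than the reference does.
loss = dpo_loss(policy_chosen_logp=-4.0, policy_rejected_logp=-8.0,
                ref_chosen_logp=-5.0, ref_rejected_logp=-7.0)
print(f"DPO loss: {loss:.4f}")
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2; the loss falls below that value as the policy shifts probability toward the chosen response relative to the reference.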

Figure 2: Example image from PAIRS dataset (Category: Occupation, Subcategory: Desk)

This review was generated by AI and reviewed by a human editor.