QUICK REVIEW

[Paper Review] Why Is My Classifier Discriminatory?

Irene A. Chen, Fredrik Johansson|arXiv (Cornell University)|May 30, 2018

Artificial Intelligence in Healthcare|Cited by 151

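One-Sentence Summary

This paper analyzes discrimination in predictive models by decomposing the fairness gap into bias, variance, and noise, and argues that data collection can often reduce discrimination without sacrificing accuracy.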

ABSTRACT

Recent attempts to achieve fairness in predictive models focus on the balance between fairness and accuracy. In sensitive applications such as healthcare or criminal justice, this trade-off is often undesirable as any increase in prediction error could have devastating consequences. In this work, we argue that the fairness of predictions should be evaluated in context of the data, and that unfairness induced by inadequate samples sizes or unmeasured predictive variables should be addressed through data collection, rather than by constraining the model. We decompose cost-based metrics of discrimination into bias, variance, and noise, and propose actions aimed at estimating and reducing each term. Finally, we perform case-studies on prediction of income, mortality, and review ratings, confirming the value of this analysis. We find that data collection is often a means to reduce discrimination without sacrificing accuracy.

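Research Motivation and Objectives

Encourage evaluating fairness in the context of the data, rather than only constraining the model.
Propose a bias-variance-noise decomposition of discrimination under cost-based fairness metrics.
Provide procedures for estimating and reducing each component of discrimination.
Show how additional data collection and targeted variable collection can reduce discrimination in real-world tasks.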
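
To make the decomposition named above concrete, here is a sketch for the squared-loss case only, obtained by applying the standard bias-variance-noise identity separately within each protected group. The notation is an assumption chosen for this sketch and may differ from the paper's: A is the protected attribute, D a random training set drawn independently of the test point, and the barred quantities are group-conditional averages. The zero-one, FPR, and FNR variants need a different treatment and are not reproduced here.

```latex
% Notation assumed for this sketch (not taken from the review text):
% A \in \{0,1\} protected group, D random training set, \hat{Y}_D prediction,
% y^*(x,a) = E[Y | x, a] (Bayes-optimal), \bar{y}(x,a) = E_D[\hat{Y}_D | x, a].
\begin{align*}
\bar{\gamma}_a &= \mathbb{E}_D\,\mathbb{E}\big[(Y - \hat{Y}_D)^2 \mid A = a\big]
               = \bar{N}_a + \bar{B}_a + \bar{V}_a, \\
\bar{N}_a &= \mathbb{E}\big[(Y - y^*(X,a))^2 \mid A = a\big], \\
\bar{B}_a &= \mathbb{E}\big[(y^*(X,a) - \bar{y}(X,a))^2 \mid A = a\big], \\
\bar{V}_a &= \mathbb{E}\big[\mathbb{E}_D[(\hat{Y}_D - \bar{y}(X,a))^2] \mid A = a\big], \\
\bar{\Gamma} &= \lvert \bar{\gamma}_0 - \bar{\gamma}_1 \rvert
             = \big\lvert (\bar{N}_0 - \bar{N}_1) + (\bar{B}_0 - \bar{B}_1)
               + (\bar{V}_0 - \bar{V}_1) \big\rvert .
\end{align*}
```

Read this way, a gap driven by the noise terms cannot be closed by a better model or more rows of the same features, which is why the review points to collecting additional predictive variables in that case.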

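Proposed Method

Use a bias-variance-noise decomposition to separate the sources of discrimination under cost-based fairness metrics (FPR, FNR, or zero-one loss).
Define the expected discrimination over random training sets and provide techniques for estimating it.
Propose additional data collection, subsampling of the training set, and clustering to identify subgroups for which new features are needed.
Apply learning-curve modeling to predict how discrimination changes as the training set grows.
Validate the approach with case studies on income prediction, ICU mortality prediction, and book rating prediction.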
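
The cost-based notion of discrimination used in the steps above can be illustrated with a minimal Python sketch: compute a cost (FPR, FNR, or zero-one loss) separately for each of two groups and take the absolute difference. The function names `group_costs` and `discrimination_gap` are hypothetical helpers introduced here, the group coding (0 and 1) is an assumption, and the labels are toy values invented for illustration, not data from the paper.

```python
from typing import Dict, Sequence


def group_costs(y_true: Sequence[int], y_pred: Sequence[int],
                group: Sequence[int], a: int) -> Dict[str, float]:
    """FPR, FNR, and zero-one loss computed only on examples with group == a."""
    fp = fn = pos = neg = 0
    for y, p, g in zip(y_true, y_pred, group):
        if g != a:
            continue
        if y == 1:
            pos += 1
            fn += int(p == 0)  # positive example predicted negative
        else:
            neg += 1
            fp += int(p == 1)  # negative example predicted positive
    nan = float("nan")
    return {
        "fpr": fp / neg if neg else nan,
        "fnr": fn / pos if pos else nan,
        "zero_one": (fp + fn) / (pos + neg) if (pos + neg) else nan,
    }


def discrimination_gap(y_true: Sequence[int], y_pred: Sequence[int],
                       group: Sequence[int], metric: str = "zero_one") -> float:
    """Absolute difference in the chosen cost between group 0 and group 1."""
    c0 = group_costs(y_true, y_pred, group, 0)[metric]
    c1 = group_costs(y_true, y_pred, group, 1)[metric]
    return abs(c0 - c1)


# Toy labels, invented purely for illustration (not data from the paper).
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 1]
group = [0, 0, 0, 0, 1, 1, 1, 1]

for metric in ("fpr", "fnr", "zero_one"):
    print(metric, discrimination_gap(y_true, y_pred, group, metric))
```

This measures the gap for one fitted model on one test set. Estimating the expected gap over random training sets, as the method describes, would mean repeating the computation over models trained on resampled data and averaging the per-group costs before taking the difference.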

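Experimental Results

Research Questions

RQ1: What are the different sources of discrimination (bias, variance, noise) in predictive models?
RQ2: How can these sources be estimated and distinguished in practice for cost-based fairness?
RQ3: Can additional data collection or targeted feature collection reduce discrimination without sacrificing accuracy?
RQ4: How do discrimination and its drivers vary across real-world tasks (income, mortality, review ratings)?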

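Key Findings

Discrimination can be decomposed into bias, variance, and noise: group differences in bias or variance point to model or data problems, while group differences in noise point to missing predictive variables.
Increasing the training data lowers both the false positive rate and the false negative rate, which reduces the level of discrimination in income prediction.
In several tasks, noise estimates differ across groups, indicating unequal predictability (noise) whose effect on discrimination goes beyond what model choice or dataset size can explain.
Clustering can identify subgroups with large error gaps, guiding targeted data collection to reduce discrimination.
In ICU mortality prediction, some racial and ethnic groups show significantly different error rates, and topic modeling reveals subgroups with large disparities.
The book rating experiments suggest that targeted sampling of the less-represented gender may remove part of the discrimination in mean squared error.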
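
The link between training-set size and the error gap can be explored with a learning-curve fit. The sketch below assumes an inverse power-law form, err(n) ≈ alpha * n^(-beta) + delta, fits it per group, and extrapolates the gap to a larger n. The functional form, the grid-search fitting routine, and the function names are assumptions made for this example and may differ from the paper's procedure. The error values are synthetic, generated from known parameters so the fit can be checked, and are not results from the paper.

```python
from typing import List, Optional, Sequence, Tuple

Params = Tuple[float, float, float]  # (alpha, beta, delta)


def fit_power_law(sizes: Sequence[float], errors: Sequence[float],
                  betas: Optional[List[float]] = None) -> Params:
    """Fit err(n) = alpha * n**(-beta) + delta.

    Grid-search beta; for each beta, alpha and delta come from ordinary
    least squares of err against x = n**(-beta).
    """
    if betas is None:
        betas = [0.05 * k for k in range(1, 41)]  # 0.05, 0.10, ..., 2.00
    best = None
    my = sum(errors) / len(errors)
    for beta in betas:
        xs = [n ** (-beta) for n in sizes]
        mx = sum(xs) / len(xs)
        sxx = sum((x - mx) ** 2 for x in xs)
        if sxx == 0:
            continue  # all sizes identical: slope is not identifiable
        alpha = sum((x - mx) * (y - my) for x, y in zip(xs, errors)) / sxx
        delta = my - alpha * mx
        sse = sum((alpha * x + delta - y) ** 2 for x, y in zip(xs, errors))
        if best is None or sse < best[0]:
            best = (sse, alpha, beta, delta)
    if best is None:
        raise ValueError("need at least two distinct training-set sizes")
    return best[1], best[2], best[3]


def predict(params: Params, n: float) -> float:
    """Evaluate the fitted curve at training-set size n."""
    alpha, beta, delta = params
    return alpha * n ** (-beta) + delta


def predicted_gap(params0: Params, params1: Params, n: float) -> float:
    """Extrapolated absolute error gap between two groups at size n."""
    return abs(predict(params0, n) - predict(params1, n))


# Synthetic learning curves with known parameters (illustration only).
sizes = [100, 200, 400, 800, 1600, 3200]
errors_g0 = [2.0 * n ** -0.5 + 0.10 for n in sizes]
errors_g1 = [3.0 * n ** -0.5 + 0.15 for n in sizes]

p0 = fit_power_law(sizes, errors_g0)
p1 = fit_power_law(sizes, errors_g1)
print("group 0 params:", p0)
print("group 1 params:", p1)
print("extrapolated gap at n=10000:", predicted_gap(p0, p1, 10000))
```

In this form, the fitted delta terms are the part of each group's error that more data of the same kind would not remove. A gap that persists between the two delta values suggests that collecting new features, rather than more rows, is the more promising action.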

This review was generated by AI and reviewed by a human editor.