QUICK REVIEW

[Paper Review] Residual Unfairness in Fair Machine Learning from Prejudiced Data

Nathan Kallus, Angela Zhou|arXiv (Cornell University)|Jun 7, 2018
Ethics and Social Impacts of AI | References: 14 | Cited by: 44
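One-Sentence Summary

This paper shows that applying fairness adjustments to censored, prejudiced training data can still leave residual unfairness in the target population, and it proposes sample reweighting to estimate and correct for this bias, illustrated with Stop, Question, and Frisk (SQF) data.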

ABSTRACT

Recent work in fairness in machine learning has proposed adjusting for fairness by equalizing accuracy metrics across groups and has also studied how datasets affected by historical prejudices may lead to unfair decision policies. We connect these lines of work and study the residual unfairness that arises when a fairness-adjusted predictor is not actually fair on the target population due to systematic censoring of training data by existing biased policies. This scenario is particularly common in the same applications where fairness is a concern. We characterize theoretically the impact of such censoring on standard fairness metrics for binary classifiers and provide criteria for when residual unfairness may or may not appear. We prove that, under certain conditions, fairness-adjusted classifiers will in fact induce residual unfairness that perpetuates the same injustices, against the same groups, that biased the data to begin with, thus showing that even state-of-the-art fair machine learning can have a "bias in, bias out" property. When certain benchmark data is available, we show how sample reweighting can estimate and adjust fairness metrics while accounting for censoring. We use this to study the case of Stop, Question, and Frisk (SQF) and demonstrate that attempting to adjust for fairness perpetuates the same injustices that the policy is infamous for.

Motivation and Objectives

  • Formalize how systematic data censoring due to historical bias affects fairness of learned policies.
  • Characterize conditions under which residual unfairness arises after fairness adjustments.
  • Propose sample reweighting to estimate target-population fairness metrics under censoring.
  • Apply the framework to real-world data (SQF) to illustrate bias propagation and adjustment.
  • Provide guidance for evaluating and correcting fairness under MAR (missing at random) censoring.

Proposed Method

  • Define a logging/censoring mechanism Z and a target population T to study fairness on the true population.
  • Use equality of opportunity and equalized odds as fairness criteria and derive score-based post-processing thresholds per group.
  • Introduce residual inequity of opportunity as a measure of fairness loss when training-data fairness does not translate to the target population.
  • Characterize residual unfairness via propositions on score distributions and first-order stochastic dominance (sufficient conditions for bias propagation).
  • Propose sample reweighting with propensity ratios p̃(x, a) to estimate target-population metrics from censored data under the missing-at-random (MAR) assumption.
  • Demonstrate the approach with a Stop, Question, and Frisk (SQF) case study to reveal how censoring biases affect fairness adjustments.
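
The threshold-based post-processing and the residual inequity of opportunity described above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the helper names `tpr` and `threshold_for_tpr`, the toy scores, and the choice of a 0.5 target true positive rate are hypothetical and are not taken from the paper. The sketch fits one threshold per group on censored training data so that true positive rates match there, then checks the same thresholds on a target population in which group "a" has additional low-scoring positives that the censoring removed.

```python
import numpy as np

def tpr(scores, y, threshold):
    """True positive rate: share of y == 1 cases with score >= threshold."""
    scores = np.asarray(scores, dtype=float)
    positives = np.asarray(y) == 1
    return float(np.mean(scores[positives] >= threshold))

def threshold_for_tpr(scores, y, target_tpr):
    """Highest observed positive-class score whose TPR is at least target_tpr."""
    scores = np.asarray(scores, dtype=float)
    pos_desc = np.sort(scores[np.asarray(y) == 1])[::-1]  # descending order
    k = max(int(np.ceil(target_tpr * len(pos_desc))), 1)  # positives to accept
    return float(pos_desc[k - 1])

# Hypothetical toy data: (scores, labels) per group.
# "train" plays the role of the censored sample, "target" the full population.
train = {
    "a": ([0.9, 0.8, 0.7, 0.6, 0.5, 0.3], [1, 1, 1, 1, 0, 0]),
    "b": ([0.9, 0.8, 0.7, 0.6, 0.5, 0.3], [1, 1, 1, 1, 0, 0]),
}
target = {
    # Group "a" has extra low-scoring positives that never reached the training data.
    "a": ([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.5, 0.3], [1] * 8 + [0, 0]),
    "b": train["b"],
}

# Fit per-group thresholds on the censored data to equalize TPR at 0.5.
thresholds = {g: threshold_for_tpr(s, y, 0.5) for g, (s, y) in train.items()}
train_tpr = {g: tpr(*train[g], thresholds[g]) for g in train}
target_tpr = {g: tpr(*target[g], thresholds[g]) for g in target}
residual_gap = target_tpr["b"] - target_tpr["a"]

print(train_tpr)     # {'a': 0.5, 'b': 0.5}
print(target_tpr)    # {'a': 0.25, 'b': 0.5}
print(residual_gap)  # 0.25
```

In this toy setup the two groups look equally treated on the training data, yet a gap of 0.25 in true positive rate remains on the target population, which is the kind of discrepancy the residual inequity of opportunity is meant to capture.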

Experimental Results

Research Questions

  • RQ1: Under what conditions does censoring of training data induce residual unfairness after applying fairness adjustments?
  • RQ2: How can we quantify and detect residual inequity of opportunity between protected groups on the target population?
  • RQ3: Can sample reweighting recover target-population fairness metrics when data are MAR and the logged population (Z) differs from the target population (T)?
  • RQ4: How does a real-world biased policy (SQF) illustrate "bias in, bias out" in fair ML?
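
The quantities behind these questions can be written out in a short sketch. The notation is assumed for illustration: Ŷ is the classifier's decision, Y the true outcome, A ∈ {a, b} the protected attribute, Z = 1 marks inclusion in the training data, and T = 1 marks membership in the target population. The symbol ε and its sign convention are illustrative choices and may differ from the paper's.

```latex
% Equality of opportunity as enforced on the censored training data (Z = 1):
\[
P(\hat{Y}=1 \mid Y=1,\, A=a,\, Z=1) \;=\; P(\hat{Y}=1 \mid Y=1,\, A=b,\, Z=1)
\]
% Residual inequity of opportunity, measured on the target population (T = 1):
\[
\epsilon \;=\; P(\hat{Y}=1 \mid Y=1,\, A=a,\, T=1) \;-\; P(\hat{Y}=1 \mid Y=1,\, A=b,\, T=1)
\]
```

Under this reading, residual unfairness is the case where the first equality holds on the training data while ε is nonzero on the target population.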

Key Findings

  • Fairness adjustments based on censored data may yield residual unfairness that disfavors the same protected group that faced prejudice in the training data.
  • Propositions show that disparate benefit of the doubt in the logging policy can translate to nonzero inequity of opportunity on the target population.
  • The strong and weak disparate benefit of the doubt conditions imply that most nontrivial equal-opportunity classifiers derived from censored data will be unfair on the target population.
  • MAR-based sample reweighting provides a method to estimate target-population true positive/false positive rates from censored data.
  • SQF analysis demonstrates that fairness corrections can perpetuate injustices when training and target populations differ geographically and demographically.
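
The reweighted estimation of true positive and false positive rates mentioned in the findings can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `weighted_rates` and `rates_by_group` are hypothetical, the toy labels and weights are made up, and the weights are assumed to be already-estimated propensity ratios p̃(x, a) for each training example (how those ratios are estimated is outside this sketch).

```python
import numpy as np

def weighted_rates(y_true, y_pred, weights):
    """Weighted (TPR, FPR) for one group, given per-example weights."""
    y_true = np.asarray(y_true) == 1
    y_pred = np.asarray(y_pred) == 1
    w = np.asarray(weights, dtype=float)
    tpr = w[y_true & y_pred].sum() / w[y_true].sum()
    fpr = w[~y_true & y_pred].sum() / w[~y_true].sum()
    return float(tpr), float(fpr)

def rates_by_group(y_true, y_pred, group, weights):
    """Apply weighted_rates separately within each protected group."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    group, weights = np.asarray(group), np.asarray(weights, dtype=float)
    out = {}
    for g in np.unique(group).tolist():
        mask = group == g
        out[g] = weighted_rates(y_true[mask], y_pred[mask], weights[mask])
    return out

# Hypothetical censored sample: four examples from each of two groups.
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = ["a"] * 4 + ["b"] * 4
# Assumed propensity ratios: larger values mark examples that are
# under-represented in the training data relative to the target population.
p_tilde = [1, 3, 1, 1, 2, 1, 1, 1]

naive = rates_by_group(y_true, y_pred, group, np.ones(len(y_true)))
reweighted = rates_by_group(y_true, y_pred, group, p_tilde)

print(naive["a"], reweighted["a"])  # (0.5, 0.5) (0.25, 0.5)
print(reweighted["b"])              # (0.75, 0.0)
```

With unit weights the function reduces to the ordinary training-sample rates, so comparing the two calls shows how much the censoring correction moves each group's estimate.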


This review was generated by AI and reviewed by human editors.