QUICK REVIEW

[Paper Review] Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions

Hao Wang, Berk Ustun|arXiv (Cornell University)|Jan 29, 2019

Ethics and Social Impacts of AI | Cited by 26

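One-Sentence Summary

This paper proposes a method that mitigates disparate impact in black-box machine learning classifiers by learning a counterfactual input distribution for the disadvantaged group, using a descent procedure based on influence functions to minimize a fairness metric without retraining. The learned distribution is used to build a preprocessor that improves fairness with minimal loss of accuracy and substantially reduces disparity on real-world datasets.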

ABSTRACT

When the performance of a machine learning model varies over groups defined by sensitive attributes (e.g., gender or ethnicity), the performance disparity can be expressed in terms of the probability distributions of the input and output variables over each group. In this paper, we exploit this fact to reduce the disparate impact of a fixed classification model over a population of interest. Given a black-box classifier, we aim to eliminate the performance gap by perturbing the distribution of input variables for the disadvantaged group. We refer to the perturbed distribution as a counterfactual distribution, and characterize its properties for common fairness criteria. We introduce a descent algorithm to learn a counterfactual distribution from data. We then discuss how the estimated distribution can be used to build a data preprocessor that can reduce disparate impact without training a new model. We validate our approach through experiments on real-world datasets, showing that it can repair different forms of disparity without a significant drop in accuracy.

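Motivation and Goals

Address disparate impact in deployed machine learning models that cannot be retrained, whether because only black-box access is available or because of data privacy constraints.
Develop a way to repair fairness by modifying only the input distribution of the disadvantaged group, leaving the model itself unchanged.
Provide a theoretically grounded, data-driven way to learn a counterfactual distribution that minimizes the fairness disparity.
Build a preprocessor that improves outcomes for the disadvantaged group while preserving model accuracy.
Support the ethical and lawful use of sensitive attributes in fairness-critical applications such as healthcare and credit.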

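Proposed Method

The method defines a counterfactual distribution as a perturbation of the disadvantaged group's input distribution that minimizes a fairness metric (e.g., FPR, DA) under a fixed classifier.
Finding the counterfactual distribution is formulated as a descent procedure over the probability simplex, with gradients computed from influence functions.
Closed-form estimators of the influence functions are derived for the main fairness criteria, so they can be computed efficiently from empirical data.
Gradients are estimated from the empirical distribution of the training data, and the counterfactual distribution is refined iteratively.
A preprocessor is built from the estimated counterfactual distribution to transform the disadvantaged group's inputs before inference.
The method is validated by running the descent procedure on fairness metrics (e.g., FPR, DA) in synthetic experiments and on real-world datasets.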
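The descent step described above can be sketched on a toy problem. This is a minimal illustration under strong simplifying assumptions, not the paper's algorithm: the input space is discrete, the classifier is a fixed 0/1 lookup table, the fairness metric is the squared gap between the disadvantaged group's FPR and a target FPR, and the influence of each input value on the FPR is taken to be the centred prediction `h - FPR`. The names `fpr` and `descend`, the arrays `h` and `q0`, and the target value 0.05 are all hypothetical.

```python
import numpy as np

def fpr(q, h):
    """FPR of a fixed classifier h under distribution q over true-negative inputs."""
    return float(np.dot(q, h))

def descend(q0, h, target_fpr, step=1.0, iters=500, tol=1e-6):
    """Move q along the probability simplex to shrink (FPR(q) - target_fpr)^2."""
    h = np.asarray(h, dtype=float)
    q = np.asarray(q0, dtype=float).copy()
    for _ in range(iters):
        current = fpr(q, h)
        gap = current - target_fpr
        if abs(gap) < tol:
            break
        # Assumed influence of each input value on the FPR; it has zero
        # mean under q, so the update below keeps q summing to one.
        psi = h - current
        # Multiplicative perturbation in the descent direction.
        q = q * (1.0 - step * gap * psi)
    return q

# Toy setup: four input values, the black-box classifier flags the last two.
h = np.array([0.0, 0.0, 1.0, 1.0])
q0 = np.array([0.4, 0.3, 0.2, 0.1])   # disadvantaged group, true negatives
q_cf = descend(q0, h, target_fpr=0.05)

print("FPR before:", fpr(q0, h))
print("FPR after: ", round(fpr(q_cf, h), 4))
print("counterfactual distribution:", np.round(q_cf, 4))
```

The classifier is never modified; only the distribution `q` changes. Turning `q_cf` into a preprocessor that maps individual inputs is a separate step that this sketch does not cover.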

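Experimental Results

Research Questions

RQ1: Can a counterfactual input distribution be learned that reduces disparate impact in a black-box classifier without retraining?
RQ2: How can influence functions be adapted to the space of probability distributions to compute gradients for fairness optimization?
RQ3: Can preprocessing based on a counterfactual distribution improve fairness while preserving model accuracy?
RQ4: How does the estimated counterfactual distribution converge statistically under sampling uncertainty?
RQ5: Does the method outperform proxy-variable removal when joint proxies are present?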

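Key Findings

In synthetic experiments, the descent procedure reduced the false positive rate (FPR) from 29.1% to 4.1%, a substantial fairness improvement.
In the joint-proxy scenario, removing only one proxy variable (X₁) raised the disparity from 14.0% to 24.8%, which shows the limits of removing variables one at a time.
In the joint-proxy experiment, applying the proposed preprocessor reduced the disparity metric DA₀ from 14.0% to 0.0%, confirming that the method works.
On real-world datasets, the method repaired fairness while maintaining high model accuracy.
Empirical convergence bounds show that the estimated influence functions and fairness metrics are consistent under sampling, with error scaling as O(1/√n).
The counterfactual distribution is not unique, but the method converges to a solution that minimizes the chosen fairness metric.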
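To give a feel for what an O(1/√n) error rate means in practice, the sketch below uses the textbook standard error of an empirical rate as a stand-in. It is not the paper's bound; the function name `rate_standard_error` and the rate 0.2 are hypothetical and chosen only for illustration.

```python
import math

def rate_standard_error(p, n):
    """Standard error of an empirical rate p estimated from n i.i.d. samples."""
    return math.sqrt(p * (1.0 - p) / n)

# Quadrupling the sample size halves the error, the signature of 1/sqrt(n).
for n in (100, 400, 1600):
    print(n, round(rate_standard_error(0.2, n), 4))
```

Under this kind of scaling, halving the estimation error takes roughly four times as many samples from the disadvantaged group.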


This review was generated by AI and checked by a human editor.