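[Paper Explained] Survey on Causal-based Machine Learning Fairness Notions
This article reviews causal-based fairness notions, discusses how causal quantities can be identified and estimated from observational data, and provides a guideline for selecting a suitable notion, together with a ranking of the notions along Pearl's causation ladder.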
Addressing the problem of fairness is crucial to safely use machine learning algorithms to support decisions with a critical impact on people's lives such as job hiring, child maltreatment, disease diagnosis, loan granting, etc. Several notions of fairness have been defined and examined in the past decade, such as statistical parity and equalized odds. The most recent fairness notions, however, are causal-based and reflect the now widely accepted idea that using causality is necessary to appropriately address the problem of fairness. This paper examines an exhaustive list of causal-based fairness notions and studies their applicability in real-world scenarios. As the majority of causal-based fairness notions are defined in terms of non-observable quantities (e.g., interventions and counterfactuals), their deployment in practice requires computing or estimating those quantities using observational data. This paper offers a comprehensive report of the different approaches to infer causal quantities from observational data, including identifiability (Pearl's SCM framework) and estimation (potential outcome framework). The main contributions of this survey paper are (1) a guideline to help select a suitable fairness notion given a specific real-world scenario, and (2) a ranking of the fairness notions according to Pearl's causation ladder indicating how difficult it is to deploy each notion in practice.
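Research Motivation and Objectives
- Explain why causality is needed to address fairness problems that purely observational (statistical) notions cannot capture.
- Provide an exhaustive survey of causal-based fairness notions (19 are analyzed).
- Explain how causal quantities can be inferred from observational data through the identifiability and estimation frameworks.
- Provide a guideline for selecting a suitable fairness notion in real-world scenarios.
- Rank the fairness notions according to Pearl's causation ladder to indicate how difficult each is to deploy.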
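Proposed Method
- Describe the two causal frameworks (SCM and potential outcomes) and their equivalence.
- Explain the identifiability criteria for causal quantities in Pearl's SCM framework.
- Discuss estimation techniques in the potential outcome framework (e.g., matching, reweighting).
- Introduce and illustrate several causal-based fairness notions (e.g., total effect, counterfactual fairness, interventional fairness).
- Provide a decision diagram that guides the choice of notion based on the real-world scenario.
- Position the fairness notions on Pearl's causation ladder to assess how difficult they are to deploy in practice.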
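To make the identifiability step concrete, here is a minimal sketch of back-door adjustment for the total effect of a sensitive attribute on an outcome. It is not taken from the paper: the joint counts, the single observed confounder `z`, and all function names are hypothetical, and the sketch assumes that `z` blocks every back-door path between `a` and `y`.

```python
from collections import Counter

# Hypothetical joint counts over (a, z, y): sensitive attribute a,
# observed confounder z, binary outcome y. Invented for illustration.
counts = Counter({
    (1, 0, 1): 10, (1, 0, 0): 10,
    (0, 0, 1): 40, (0, 0, 0): 40,
    (1, 1, 1): 20, (1, 1, 0): 60,
    (0, 1, 1): 5,  (0, 1, 0): 15,
})


def prob_y1(counts, a, z=None):
    """Estimate P(y=1 | a) or, if z is given, P(y=1 | a, z)."""
    rows = [(k, n) for k, n in counts.items()
            if k[0] == a and (z is None or k[1] == z)]
    total = sum(n for _, n in rows)
    positive = sum(n for k, n in rows if k[2] == 1)
    return positive / total


def statistical_disparity(counts):
    """Purely observational gap: P(y=1 | a=1) - P(y=1 | a=0)."""
    return prob_y1(counts, 1) - prob_y1(counts, 0)


def backdoor_total_effect(counts):
    """Total effect via back-door adjustment over z:
    sum_z P(z) * [P(y=1 | a=1, z) - P(y=1 | a=0, z)].
    Valid only under the assumption that z satisfies the back-door criterion.
    """
    n = sum(counts.values())
    effect = 0.0
    for z in sorted({k[1] for k in counts}):
        p_z = sum(c for k, c in counts.items() if k[1] == z) / n
        effect += p_z * (prob_y1(counts, 1, z) - prob_y1(counts, 0, z))
    return effect


print(statistical_disparity(counts))
print(backdoor_total_effect(counts))
```

In this toy table the observational gap is about -0.15 while the adjusted total effect is 0, which illustrates how a statistical notion and a causal notion can disagree when a confounder is present. If the confounder were unobserved, this adjustment would not be available.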
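Experimental Results
Research Questions
- RQ1: Which causal-based fairness notions exist, and how do they differ in their treatment of causation versus correlation?
- RQ2: How can causal quantities (interventions, counterfactuals) be identified and estimated from observational data for fairness assessment?
- RQ3: Which fairness notion is most suitable for a given real-world scenario and data structure?
- RQ4: How does the ranking along Pearl's causation ladder reflect the practicality of deploying each notion?
- RQ5: What are the trade-offs between the SCM and potential outcome frameworks in fairness analysis?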
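Key Findings
- The survey analyzes 19 causal-based fairness notions.
- Causal-based notions rely on non-observable quantities such as interventions and counterfactuals, which may not be identifiable from observational data.
- Identifiability depends on the causal graph and on whether unobserved confounding is present.
- A guideline (decision diagram) helps select a suitable fairness notion in real-world scenarios.
- The notions are ranked along Pearl's causation ladder according to how deployable and identifiable they are.
- The potential outcome framework supports unit-level causal reasoning and estimation, while the SCM framework facilitates path-specific causal analysis and discovery.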
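As an illustration of why counterfactual notions sit on the top rung of the ladder, the following sketch computes a unit-level counterfactual with the usual three steps (abduction, action, prediction). It assumes a fully known linear SCM, `X = ALPHA * A + U_X`, which is a strong assumption rarely met in practice. The coefficient, both predictors, and all names are hypothetical and do not come from the paper.

```python
ALPHA = 2.0  # hypothetical causal effect of sensitive attribute A on feature X


def abduct_noise(a, x):
    """Abduction: recover the individual's noise term U_X from observed (a, x)."""
    return x - ALPHA * a


def counterfactual_x(a, x, a_cf):
    """Action and prediction: set A to a_cf, keep U_X fixed, recompute X."""
    u_x = abduct_noise(a, x)
    return ALPHA * a_cf + u_x


def predictor_on_x(x):
    """Hypothetical predictor that uses X, a descendant of A."""
    return 0.5 * x


def predictor_on_noise(a, x):
    """Hypothetical predictor that uses only the abducted noise U_X."""
    return 0.5 * abduct_noise(a, x)


def counterfactual_gap(a, x, a_cf):
    """Change in each predictor's output when A is counterfactually set to a_cf."""
    x_cf = counterfactual_x(a, x, a_cf)
    gap_x = predictor_on_x(x_cf) - predictor_on_x(x)
    gap_u = predictor_on_noise(a_cf, x_cf) - predictor_on_noise(a, x)
    return gap_x, gap_u


print(counterfactual_gap(1, 5.0, 0))
```

For the individual with `a = 1` and `x = 5.0`, the predictor built on `X` changes by -1.0 under the counterfactual, while the predictor built only on the noise term does not change. The computation needs the structural equations themselves, not just the graph, which is why such notions are harder to deploy than interventional ones.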
This explanation was generated by AI and reviewed by human editors.