QUICK REVIEW

[Paper Review] An Empirical Analysis of Fairness Notions under Differential Privacy

Anderson Santana de Oliveira, Caelin Kaplan|arXiv (Cornell University)|Feb 6, 2023

Privacy-Preserving Technologies in Data | Cited by 8

One-Sentence Summary

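This paper empirically studies how selecting a model architecture optimized for DP-SGD affects fairness notions (demographic parity, equalized odds, predictive parity) on standard datasets from the ML fairness literature. It finds that, compared with the non-private baseline, group disparities under DP more often decrease or are negligibly affected, while utility is largely preserved.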

ABSTRACT

Recent works have shown that selecting an optimal model architecture suited to the differential privacy setting is necessary to achieve the best possible utility for a given privacy budget using differentially private stochastic gradient descent (DP-SGD) (Tramer and Boneh 2020; Cheng et al. 2022). In light of these findings, we empirically analyse how different fairness notions, belonging to distinct classes of statistical fairness criteria (independence, separation and sufficiency), are impacted when one selects a model architecture suitable for DP-SGD, optimized for utility. Using standard datasets from ML fairness literature, we show using a rigorous experimental protocol, that by selecting the optimal model architecture for DP-SGD, the differences across groups concerning the relevant fairness metrics (demographic parity, equalized odds and predictive parity) more often decrease or are negligibly impacted, compared to the non-private baseline, for which optimal model architecture has also been selected to maximize utility. These findings challenge the understanding that differential privacy will necessarily exacerbate unfairness in deep learning models trained on biased datasets.

Motivation and Objectives

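Study how DP-SGD affects fairness notions on datasets with protected attributes.
Assess whether an optimal, DP-aware architecture mitigates or exacerbates group disparities.
Compare fairness metrics under private and non-private training while maximizing utility (ROC AUC).
Use datasets that are common in the ML fairness literature and that contain multiple demographic groups and their intersections.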

Proposed Method

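Build a configurable feed-forward network trained with DP-SGD, and run an exhaustive hyperparameter search to maximize ROC AUC.
Run 5-fold cross-validation and evaluate on a separate held-out test set.
Use DP-SGD with a total privacy budget of approximately epsilon = 27 for the Baseline+DP and Best DP Model settings.
Evaluate three fairness notions drawn from distinct classes of statistical fairness criteria: demographic parity (independence), equalized odds (separation), and predictive parity (sufficiency).
Analyze the maximum disparity across all intersectional subgroups of the protected attributes (race and sex).
Report results as mean ± standard deviation over 10 training runs per setting.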
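
The three disparity measures listed above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names (`group_rates`, `max_disparities`) are hypothetical, and the choice to define the equalized odds difference as the larger of the TPR gap and the FPR gap is an assumption borrowed from common fairness toolkits, since this review does not state the paper's exact formula. Subgroups with no predicted positives are skipped when computing the precision gap.

```python
import numpy as np

def group_rates(y_true, y_pred):
    """Selection rate, TPR, FPR and precision for one subgroup (binary labels)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)
    fp = np.sum(y_pred & ~y_true)
    pos = np.sum(y_true)
    neg = np.sum(~y_true)
    pred_pos = tp + fp
    nan = float("nan")  # undefined rate when the denominator is zero
    return {
        "selection_rate": float(np.mean(y_pred)),
        "tpr": tp / pos if pos else nan,
        "fpr": fp / neg if neg else nan,
        "precision": tp / pred_pos if pred_pos else nan,
    }

def max_disparities(y_true, y_pred, race, sex):
    """Largest gap between any two race-by-sex subgroups, per fairness notion."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    race, sex = np.asarray(race), np.asarray(sex)
    rates = []
    for r in np.unique(race):
        for s in np.unique(sex):
            mask = (race == r) & (sex == s)
            if mask.any():  # ignore empty intersections
                rates.append(group_rates(y_true[mask], y_pred[mask]))

    def gap(key):
        vals = np.array([g[key] for g in rates], dtype=float)
        return float(np.nanmax(vals) - np.nanmin(vals))

    return {
        "demographic_parity_diff": gap("selection_rate"),          # independence
        "equalized_odds_diff": max(gap("tpr"), gap("fpr")),        # separation (assumed definition)
        "precision_diff": gap("precision"),                        # sufficiency
    }

# Toy data: four intersectional subgroups with two samples each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
race = ["a", "a", "a", "a", "b", "b", "b", "b"]
sex = ["m", "m", "f", "f", "m", "m", "f", "f"]
result = max_disparities(y_true, y_pred, race, sex)
print(result)
```

In a setup like the one described, such a function would be applied to thresholded predictions on the held-out test set for each of the 10 training runs, and the resulting gaps averaged.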

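Experimental Results

Research Questions

RQ1: Compared with the non-private baseline, does selecting an architecture optimized for DP-SGD reduce fairness disparities?

Key Findings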

Dataset	Overall AUC (Baseline)	Overall AUC (Baseline+DP)	Overall AUC (Best DP Model)	AUC difference (Baseline vs Best DP)	Demographic parity difference (Baseline)	Demographic parity difference (Best DP Model)	Equalized odds difference (Baseline)	Equalized odds difference (Best DP Model)	Precision difference (Baseline)	Precision difference (Best DP Model)
ACS Emp.	0.8837 ± 0.0011	0.8110 ± 0.0062	0.8702 ± 0.0013	0.3401 ± 0.0875	0.4383 ± 0.0805	0.3154 ± 0.0359	0.5884 ± 0.1360	0.2874 ± 0.0534	0.5534 ± 0.0998	0.2968 ± 0.0674
ACS Inc.	0.8878 ± 0.0011	0.8155 ± 0.0045	0.8820 ± 0.0008	0.2546 ± 0.0569	0.4223 ± 0.0613	0.2556 ± 0.0490	0.4360 ± 0.0780	0.3756 ± 0.0019	0.3550 ± 0.0518	0.4032 ± 0.0271
LSAC	0.8343 ± 0.0029	0.7755 ± 0.0125	0.7962 ± 0.0077	0.0435 ± 0.0056	0.3064 ± 0.0653	0.1687 ± 0.0151	0.2548 ± 0.0862	0.1975 ± 0.0722	0.1688 ± 0.0485	0.2197 ± 0.0082
Adult	0.9056 ± 0.0011	0.8476 ± 0.0073	0.9005 ± 0.0009	0.1264 ± 0.0249	0.2750 ± 0.0155	0.2375 ± 0.0207	0.7845 ± 0.0492	0.8000 ± 0.0000	0.9400 ± 0.0966	0.8000 ± 0.0000
Compas	0.6895 ± 0.0041	0.5349 ± 0.0359	0.6863 ± 0.0030	0.1162 ± 0.0273	0.5101 ± 0.0209	0.3694 ± 0.0230	0.5592 ± 0.0476	0.3726 ± 0.0375	0.3347 ± 0.0749	0.3168 ± 0.0467
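
As a quick reading aid for the table above, the following sketch tallies how many of the 15 dataset-metric pairs show a lower mean disparity for the Best DP Model than for the Baseline. The numbers are copied from the table's mean values; the dictionary layout and the short keys (`dp`, `eo`, `prec`) are illustrative assumptions. The tally compares means only and ignores the reported standard deviations, so it says nothing about statistical significance.

```python
# (baseline_mean, best_dp_mean) disparities copied from the table above.
table = {
    "ACS Emp.": {"dp": (0.4383, 0.3154), "eo": (0.5884, 0.2874), "prec": (0.5534, 0.2968)},
    "ACS Inc.": {"dp": (0.4223, 0.2556), "eo": (0.4360, 0.3756), "prec": (0.3550, 0.4032)},
    "LSAC":     {"dp": (0.3064, 0.1687), "eo": (0.2548, 0.1975), "prec": (0.1688, 0.2197)},
    "Adult":    {"dp": (0.2750, 0.2375), "eo": (0.7845, 0.8000), "prec": (0.9400, 0.8000)},
    "Compas":   {"dp": (0.5101, 0.3694), "eo": (0.5592, 0.3726), "prec": (0.3347, 0.3168)},
}

decreased, increased = [], []
for dataset, metrics in table.items():
    for metric, (baseline, best_dp) in metrics.items():
        change = best_dp - baseline  # negative means the disparity shrank under DP
        (decreased if change < 0 else increased).append((dataset, metric, round(change, 4)))

total = len(decreased) + len(increased)
print(f"{len(decreased)} of {total} mean disparities are lower for the Best DP Model")
for dataset, metric, change in increased:
    print(f"higher under DP: {dataset} / {metric} ({change:+.4f})")
```

On these means, the disparity is lower under DP in 12 of the 15 cases; the three exceptions are the precision difference on ACS Inc. and LSAC and the equalized odds difference on Adult.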

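The best DP architecture generally reduces disparities relative to the non-private baseline across several fairness notions.
DP-SGD can mitigate group disparities on multiple datasets while keeping utility (ROC AUC) at or near the baseline level.
Across the datasets (ACS Employment, ACS Income, LSAC, Adult, COMPAS), DP-trained models often show fairness disparities that are lower than or similar to those of their non-private counterparts.
Baseline+DP (DP applied without architecture optimization) typically degrades both utility and fairness metrics, whereas the Best DP Model (DP with architecture search) improves fairness disparities.
The reduction in disparities is most pronounced on ACS Employment and ACS Income; Adult shows only a limited reduction but does not worsen under DP; the pattern on COMPAS is more sensitive and variable.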


This review was generated by AI and reviewed by human editors.