QUICK REVIEW

[Paper Review] Robust Optimization for Fairness with Noisy Protected Groups

Serena Wang, Wenshuo Guo|arXiv (Cornell University)|Feb 21, 2020

Ethics and Social Impacts of AI | 52 references | 41 citations

TL;DR

The paper analyzes how to enforce group-based fairness when protected-group labels are noisy, and presents two robust optimization approaches, distributionally robust optimization (DRO) and soft group assignments, that are designed to guarantee fairness on the true groups while minimizing training loss. In empirical case studies, these methods satisfy true-group fairness more reliably than the naive approach, especially as noise increases, at the cost of higher error rates.

ABSTRACT

Many existing fairness criteria for machine learning involve equalizing some metric across protected groups such as race or gender. However, practitioners trying to audit or enforce such group-based criteria can easily face the problem of noisy or biased protected group information. First, we study the consequences of naively relying on noisy protected group labels: we provide an upper bound on the fairness violations on the true groups G when the fairness criteria are satisfied on noisy groups $\hat{G}$. Second, we introduce two new approaches using robust optimization that, unlike the naive approach of only relying on $\hat{G}$, are guaranteed to satisfy fairness criteria on the true protected groups G while minimizing a training objective. We provide theoretical guarantees that one such approach converges to an optimal feasible solution. Using two case studies, we show empirically that the robust approaches achieve better true group fairness guarantees than the naive approach.

Motivation & Objective

Motivate and formalize the problem of fairness with noisy protected groups in binary classification.
Provide theoretical bounds showing the limitations of the naive approach, which relies on the noisy groups only.
Develop two robust optimization frameworks (DRO and soft group assignments) that guarantee fairness on true groups while optimizing loss.
Provide convergence guarantees for at least one approach and practical algorithms for implementation.
Empirically compare naive and robust methods on UCI datasets under varying noise levels.
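
The formalization and the bound mentioned in the objectives above can be written compactly. The block below is a sketch rather than the paper's exact statement: the symbols f, h, and γ_j are assumed names, p_j and p̂_j denote the distributions of (X, Y) conditional on G = j and Ĝ = j, and the last line uses only the standard fact that two expectations of a bounded function differ by at most its range times the total-variation distance.

```latex
% Fairness-constrained training on the true groups (f = training objective):
\min_{\theta} \; f(\theta)
\quad \text{s.t.} \quad
g_j(\theta) = \mathbb{E}\big[\, h(\theta, X, Y) \mid G = j \,\big] \le 0
\quad \text{for every group } j .

% Noisy counterpart, the only constraint that can be evaluated from the data:
\hat{g}_j(\theta) = \mathbb{E}\big[\, h(\theta, X, Y) \mid \hat{G} = j \,\big] .

% Gap between the two, for bounded h:
g_j(\theta) - \hat{g}_j(\theta)
= \mathbb{E}_{p_j}[h] - \mathbb{E}_{\hat{p}_j}[h]
\le \big( \sup h - \inf h \big) \, \mathrm{TV}(p_j, \hat{p}_j) .
```

Under these assumptions, if ĝ_j(θ) ≤ 0, the range of h is at most 1, and TV(p_j, p̂_j) ≤ γ_j, then g_j(θ) ≤ γ_j: the violation on the true group is bounded by the TV distance, which is the kind of guarantee attributed to Theorem 1.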

Proposed method

Formulate fairness-constrained training in which the constraints are defined on the true groups G but only the noisy groups Ĝ are observed during training.
Show that enforcing fairness on Ĝ implies bounded violations on G if the total-variation (TV) distance between the group-conditional distributions under G and Ĝ is bounded (Theorem 1).
Introduce Distributionally Robust Optimization (DRO) to bound the worst-case fairness violation over a TV ball around the Ĝ-conditional distributions, ensuring g_j(θ) ≤ 0 for all true groups.
Propose Soft Group Assignments (robust fairness with soft labels) using a noise model P(G=j|Ĝ=k) estimated from auxiliary data, casting the constraints as max_{w ∈ W(θ)} g_j(θ, w) ≤ 0 and solving via Lagrangian methods.
Provide both an ideal algorithm with convergence guarantees and a practical gradient-based algorithm for the soft-assignments approach.
Discuss practical considerations, including loss function, linear programming subproblems, and convergence properties.
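
To make the DRO step concrete, here is a minimal Python sketch of the inner worst-case computation over a TV ball, restricted to a finite sample. It is an illustration under stated assumptions, not the paper's algorithm: the function name `worst_case_mean_tv` is hypothetical, the nominal distribution is taken to be uniform over the sample, and the adversary may only reweight the observed points. On a finite support this maximization is a small linear program whose solution is greedy: move up to `gamma` probability mass from the lowest-valued points onto the highest-valued one.

```python
import numpy as np

def worst_case_mean_tv(h, gamma):
    """Maximize sum_i p_i * h_i over distributions p on the sample points
    with TV(p, uniform) <= gamma, where TV(p, q) = 0.5 * sum_i |p_i - q_i|.

    h     : per-example values of the constraint function (hypothetical input)
    gamma : TV radius, e.g. an assumed bound on the group-label noise
    """
    h = np.asarray(h, dtype=float)
    n = h.size
    p = np.full(n, 1.0 / n)            # nominal (empirical) distribution
    top = int(np.argmax(h))            # point that receives the moved mass
    remaining = min(gamma, 1.0 - 1.0 / n)
    for i in np.argsort(h):            # visit points from lowest to highest h
        if remaining <= 0:
            break
        if i == top:
            continue
        take = min(p[i], remaining)    # cannot remove more mass than is there
        p[i] -= take
        p[top] += take
        remaining -= take
    return float(p @ h), p

# Example: four points, TV radius 0.25
value, p_adv = worst_case_mean_tv([0.0, 0.0, 1.0, 1.0], gamma=0.25)
print(value, p_adv)
```

A robust constraint would then require this worst-case value, rather than the plain sample mean, to be at most zero. The larger `gamma` is, the more the worst case exceeds the mean, which is one way to see why the DRO approach is described as conservative.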

Experimental results

Research questions

RQ1: How do fairness constraints learned on noisy protected groups Ĝ relate to true fairness constraints on G?
RQ2: Can we guarantee fairness with respect to the true groups G when protected-group labels are noisy?
RQ3: How can knowledge of the noise model between G and Ĝ be leveraged to enforce true-group fairness more tightly than naive approaches?
RQ4: What are the trade-offs in accuracy (error rate) when using robust fairness methods under varying noise levels?
RQ5: Do DRO and soft group assignment approaches converge to feasible, optimal solutions under realistic settings?

Key findings

Enforcing fairness on noisy groups Ĝ yields bounded fairness violations on true groups G if the total-variation distance between p_j and p̂_j is bounded (Theorem 1).
A naive approach that uses only Ĝ can lead to fairness violations on G that grow with noise; robust methods mitigate this risk.
Two robust approaches guarantee fairness on the true groups G while optimizing training loss: (i) distributionally robust optimization (DRO) and (ii) soft group assignments using a noise model P(G=j|Ĝ).
DRO provides a conservative but principled bound via TV-distance-based uncertainty sets around Ĝ-distributions; soft group assignments offer a less conservative, more model-aware alternative.
Empirically, on the Adult (equality of opportunity) and Credit (equalized odds) datasets, the robust methods satisfy true-group fairness on average across noise levels, while incurring higher test error rates than the naive approach; soft assignments (SA) often yield lower error than DRO.
DRO tends to be more conservative (higher error rates) than soft assignments, especially as noise increases.
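
As a rough illustration of how a noise model P(G=j|Ĝ=k) can be used, the Python sketch below reweights examples by their probability of belonging to each true group. It is a simplification under stated assumptions and not the paper's method: the function name `soft_group_means` and the toy numbers are hypothetical, and the weights here depend only on the noisy label, which amounts to assuming that G is independent of the example given Ĝ. The approach summarized above instead takes a maximum over a whole set of admissible weights w; this fixed weighting is just one candidate weighting.

```python
import numpy as np

def soft_group_means(h, noisy_group, noise_model):
    """Estimate E[h | G = j] for each true group j from noisy labels.

    h           : per-example values, shape (n,)
    noisy_group : noisy group index of each example, shape (n,)
    noise_model : matrix with noise_model[k, j] = P(G = j | G_hat = k)
    """
    h = np.asarray(h, dtype=float)
    noisy_group = np.asarray(noisy_group, dtype=int)
    noise_model = np.asarray(noise_model, dtype=float)
    # Row i holds the soft membership of example i in every true group.
    weights = noise_model[noisy_group]                # shape (n, m)
    return (weights * h[:, None]).sum(axis=0) / weights.sum(axis=0)

# Toy example: two groups, four examples (made-up numbers)
h = [1.0, 0.0, 1.0, 1.0]
noisy_group = [0, 0, 1, 1]
noise_model = [[0.8, 0.2],    # P(G = . | G_hat = 0)
               [0.1, 0.9]]    # P(G = . | G_hat = 1)
print(soft_group_means(h, noisy_group, noise_model))
```

With an identity noise model the function reduces to the ordinary per-group means on Ĝ, so the naive approach is the special case of zero label noise.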


This review was created by AI and reviewed by human editors.