[Paper Review] Prediction-Based Decisions and Fairness: A Catalogue of Choices, Assumptions, and Definitions
A survey that catalogs the choices, assumptions, and definitions of fairness in prediction-based decision systems, clarifying how data, models, and societal goals interact.
A recent flurry of research activity has attempted to quantitatively define "fairness" for decisions based on statistical and machine learning (ML) predictions. The rapid growth of this new field has led to wildly inconsistent terminology and notation, presenting a serious challenge for cataloguing and comparing definitions. This paper attempts to bring much-needed order. First, we explicate the various choices and assumptions made (often implicitly) to justify the use of prediction-based decisions. Next, we show how such choices and assumptions can raise concerns about fairness and we present a notationally consistent catalogue of fairness definitions from the ML literature. In doing so, we offer a concise reference for thinking through the choices, assumptions, and fairness considerations of prediction-based decision systems.
Motivation & Objective
- Explain the social and technical choices that underpin prediction-based decision systems.
- Systematize assumptions and decisions that affect fairness in ML, including data, models, and evaluation.
- Provide a notationally consistent catalogue of fairness definitions from the ML literature.
- Highlight gaps between mathematical formalism and broader social goals in fairness research.
Proposed method
- Present a structured taxonomy of the policy design choices that influence fairness (over-arching goals, population, and decision space).
- Decompose data bias into statistical bias (sampling and measurement) and societal bias, and discuss their impact on fairness.
- Summarize predictive modeling choices (data, model class, covariates) and evaluation assumptions that affect fairness.
- Review and organize a catalogue of fairness definitions from the literature, including confusion-matrix-based, score-based, and sufficiency-type criteria.
- Discuss causal frameworks for fairness and identify tensions and impossibilities among definitions.
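As a rough illustration of the confusion-matrix-based criteria in the catalogue, the sketch below computes per-group selection rate, false positive rate (FPR), false negative rate (FNR), and positive predictive value (PPV) for binary outcomes and decisions. The function name `group_rates`, the group labels, and the toy data are assumptions made for this example and do not come from the paper.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group confusion-matrix rates for binary outcomes (y_true) and decisions (y_pred)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for y, d, g in zip(y_true, y_pred, groups):
        if d == 1:
            key = "tp" if y == 1 else "fp"
        else:
            key = "fn" if y == 1 else "tn"
        counts[g][key] += 1

    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]        # actual positives in the group
        neg = c["fp"] + c["tn"]        # actual negatives in the group
        pred_pos = c["tp"] + c["fp"]   # positive decisions in the group
        rates[g] = {
            "selection_rate": pred_pos / (pos + neg),
            "fpr": c["fp"] / neg if neg else None,
            "fnr": c["fn"] / pos if pos else None,
            "ppv": c["tp"] / pred_pos if pred_pos else None,
        }
    return rates


# Hypothetical toy data: two groups of four individuals each.
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
for g in sorted(rates):
    print(g, rates[g])
```

In this toy data both groups receive positive decisions at the same rate, yet their FPR, FNR, and PPV all differ. Equal selection rates therefore do not imply error-rate balance (equal FPR and FNR) or predictive parity (equal PPV), which is one reason a consistent catalogue of definitions is useful.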
Experimental results
Research questions
- RQ1: What choices and assumptions in prediction-based decision systems most influence fairness outcomes?
- RQ2: How do statistical and societal biases in data interact with model choices to affect fairness across groups?
- RQ3: What are the major fairness definitions in ML, and how do they relate to decision contexts and evaluation assumptions?
- RQ4: To what extent do single-threshold and score-based fairness notions align with real-world decision objectives?
Key findings
- Fairness in ML requires careful alignment of social goals, population, and decision space with modeling and evaluation choices.
- Data bias consists of statistical bias (sampling/measurement) and societal bias, each with distinct implications for fairness.
- Multiple, sometimes conflicting, fairness definitions exist (e.g., error-rate balance, predictive parity, calibration) and no single definition is universally applicable.
- Single-threshold fairness (applying the same score cutoff to every group) follows from utility maximization under certain assumptions, but whether it is appropriate depends on the quality of the scores and the chosen utilities.
- Evaluation assumptions (no interference, symmetric harms, batch decisions) critically shape perceived fairness and outcomes.
- Causal and counterfactual analyses offer additional perspectives on fairness beyond purely statistical definitions.
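One way to see why error-rate balance and predictive parity can conflict is a standard identity that follows directly from the confusion matrix. The notation here is an assumption for this sketch and may not match the paper's own: $p$ is a group's base rate of the positive outcome, and FPR, FNR, and PPV are that group's false positive rate, false negative rate, and positive predictive value.

```latex
% Within one group of size N:
%   TP = p (1 - FNR) N                 (true positives)
%   FP = TP (1 - PPV) / PPV            (from PPV = TP / (TP + FP))
%   FPR = FP / ((1 - p) N)
% Substituting gives:
\[
  \mathrm{FPR}
  \;=\;
  \frac{p}{1 - p}
  \cdot
  \frac{1 - \mathrm{PPV}}{\mathrm{PPV}}
  \cdot
  \bigl(1 - \mathrm{FNR}\bigr)
\]
```

If two groups have different base rates $p$ and the predictor is imperfect, then holding PPV and FNR equal across groups forces their FPRs apart. Under those conditions predictive parity and full error-rate balance cannot hold at the same time.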
This review was created by AI and reviewed by human editors.