[Paper Review] Algorithmic recourse under imperfect causal knowledge: a probabilistic approach
The paper proposes two probabilistic methods to compute algorithmic recourse under imperfect causal knowledge: (i) Gaussian process-based counterfactuals for individualised recourse and (ii) subpopulation-based recourse via conditional average treatment effects using CVAEs, with a gradient-based optimization to select interventions.
Recent work has discussed the limitations of counterfactual explanations to recommend actions for algorithmic recourse, and argued for the need of taking causal relationships between features into consideration. Unfortunately, in practice, the true underlying structural causal model is generally unknown. In this work, we first show that it is impossible to guarantee recourse without access to the true structural equations. To address this limitation, we propose two probabilistic approaches to select optimal actions that achieve recourse with high probability given limited causal knowledge (e.g., only the causal graph). The first captures uncertainty over structural equations under additive Gaussian noise, and uses Bayesian model averaging to estimate the counterfactual distribution. The second removes any assumptions on the structural equations by instead computing the average effect of recourse actions on individuals similar to the person who seeks recourse, leading to a novel subpopulation-based interventional notion of recourse. We then derive a gradient-based procedure for selecting optimal recourse actions, and empirically show that the proposed approaches lead to more reliable recommendations under imperfect causal knowledge than non-probabilistic baselines.
Motivation & Objective
- Motivate the need for causal-aware algorithmic recourse that accounts for imperfect causal knowledge.
- Show that guarantees rely on knowing the true structural equations, which is generally infeasible.
- Develop two probabilistic approaches to select optimal recourse actions with high probability under limited causal knowledge.
- Provide a gradient-based optimization method to find cost-efficient actions that achieve recourse under uncertainty.
- Demonstrate empirically that probabilistic methods outperform non-probabilistic baselines in reliability and cost.
Proposed method
- Model uncertainty in structural equations via Gaussian process-based structural causal models (GP-SCM) with additive Gaussian noise to obtain a distribution over counterfactuals.
- Compute counterfactual distributions for descendants of interventions using ancestral sampling and GP noise posteriors.
- Formulate a probabilistic recourse optimization problem that minimizes action cost subject to a lower confidence bound on the classifier output (its expected value minus a multiple of its standard deviation) clearing the decision threshold.
- Introduce a subpopulation-based notion of recourse via conditional average treatment effects (CATE), using conditional variational autoencoders (CVAE) to estimate interventional distributions.
- Show that the interventional distributions P(X_d(I) | do(X_I=theta), X_nd(I)=x_nd) are identifiable from observational data under causal sufficiency, and model them with CVAEs.
- Solve the optimization problems with a gradient-based Lagrangian approach, differentiating through Monte Carlo estimates of the expectations.
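The last two steps of the method can be illustrated with a minimal sketch. It is not the paper's implementation: the two-variable SCM, the Gaussian samples of the slope `beta` (standing in for a GP posterior over the structural equation), the logistic classifier, the squared-shift cost, and all step sizes are assumptions chosen for illustration. The sketch estimates the mean and standard deviation of the classifier output by Monte Carlo, forms the lower confidence bound, and runs gradient descent on the action and gradient ascent on the Lagrange multiplier.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy setup (not from the paper): causal graph X1 -> X2 with
# structural equation X2 = beta * X1 + U2 and an uncertain slope beta.
rng = np.random.default_rng(0)
beta = rng.normal(1.0, 0.3, size=2000)  # fixed samples standing in for a posterior
w, b = np.array([1.0, 1.0]), 0.0        # assumed classifier h(x) = sigmoid(w.x + b)
x_f = np.array([-1.0, -1.0])            # factual individual, currently rejected
gamma = 1.0                             # width of the lower confidence bound

def lcb_and_grad(delta):
    """LCB of h under do(X1 = x1 + delta) and its derivative w.r.t. delta."""
    # Abduction gives u2 = x2 - beta * x1 per sample, so the counterfactual
    # value is x2_cf = beta * (x1 + delta) + u2 = x2 + beta * delta.
    x2_cf = x_f[1] + beta * delta
    h = sigmoid(w[0] * (x_f[0] + delta) + w[1] * x2_cf + b)
    g = h * (1.0 - h) * (w[0] + w[1] * beta)  # dh_i / d delta for each sample
    mean, std = h.mean(), h.std() + 1e-12     # epsilon avoids division by zero
    dmean = g.mean()
    dstd = ((h - mean) * g).mean() / std
    return mean - gamma * std, dmean - gamma * dstd

def solve(steps=3000, lr_delta=0.05, lr_lam=0.5):
    """Primal-dual updates on L(delta, lam) = delta**2 + lam * (0.5 - LCB(delta))."""
    delta, lam = 0.0, 0.0
    for _ in range(steps):
        lcb, dlcb = lcb_and_grad(delta)
        delta -= lr_delta * (2.0 * delta - lam * dlcb)  # descent on the action
        lam = max(0.0, lam + lr_lam * (0.5 - lcb))      # projected ascent on lam
    return delta, lam

delta_star, lam_star = solve()
print("shift on X1:", delta_star, "LCB:", lcb_and_grad(delta_star)[0])
```

Reusing the same `beta` samples at every step keeps the Monte Carlo estimate a deterministic, differentiable function of the action, which is what allows gradients to pass through the expectation. In this toy case the derivative is written by hand; with a GP or CVAE model an automatic differentiation library would normally play that role.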
Experimental results
Research questions
- RQ1: Can recourse guarantees be achieved when the true structural equations are unknown?
- RQ2: How can we compute reliable recourse actions under imperfect causal knowledge using probabilistic models?
- RQ3: Do GP-SCM-based counterfactuals and subpopulation-based interventional approaches yield lower-cost and higher-validity recourse than non-probabilistic baselines?
- RQ4: Is a subpopulation-based notion of recourse (CATE-based) preferable when the form of the structural equations is misspecified?
Key findings
- Probabilistic recourse methods yield higher validity and controlled cost under imperfect causal knowledge compared to non-probabilistic baselines.
- GP-based recourse (GP-SCM) achieves robust validity (100% in some settings) with measurable LCB values, at the price of higher cost in certain scenarios.
- Subpopulation-based recourse via CATE (CVAE) demonstrates strong performance across different SCM families, often matching or approaching oracle performance.
- Across three-variable synthetic SCMs (linear, non-linear ANM, non-additive), probabilistic methods outperform point-based baselines in terms of reliability and cost.
- The gradient-based optimization efficiently finds high-quality recourse actions by differentiating through the model-based expectations.
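Why a point-based recommendation can be unreliable is easy to see in a toy case. The sketch below is an illustration under stated assumptions, not a reproduction of the paper's experiments: the two-variable SCM, the distribution of plausible slopes `beta_plausible`, the classifier, and both candidate shifts are hypothetical. It measures how often an action flips the decision across the structural equations that remain plausible.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Hypothetical SCM family: X1 -> X2 with X2 = beta * X1 + U2, true slope unknown.
beta_plausible = rng.normal(1.0, 0.3, size=10_000)  # candidate "true" slopes
x1, x2 = -1.0, -1.0                                 # factual individual

def validity(delta):
    """Fraction of plausible SCMs under which do(X1 = x1 + delta) flips the decision."""
    x2_cf = x2 + beta_plausible * delta   # counterfactual X2 under each candidate SCM
    h = sigmoid((x1 + delta) + x2_cf)     # assumed classifier h(x) = sigmoid(x1 + x2)
    return float((h >= 0.5).mean())

delta_point = 1.0     # smallest shift that suffices if beta is fixed at its mean 1.0
delta_cautious = 1.3  # a larger, costlier shift that leaves a safety margin

print("point estimate:", validity(delta_point))
print("with margin:   ", validity(delta_cautious))
```

In this toy setting the point-based shift succeeds for only about half of the plausible slopes, because it sits exactly on the decision boundary of the mean model, while the larger shift succeeds for more than 90% of them at a higher cost. This is the reliability-versus-cost trade-off that the probabilistic formulations make explicit.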
This review was created by AI and reviewed by human editors.