QUICK REVIEW

[Paper Review] Model-Agnostic Counterfactual Explanations for Consequential Decisions

Amir-Hossein Karimi, Gilles Barthe | arXiv (Cornell University) | May 27, 2019

Explainable Artificial Intelligence (XAI) | 26 references | 85 citations

TL;DR

Introduces MACE, a model-agnostic method that generates nearest, plausible, diverse counterfactual explanations for any predictor by solving a sequence of SMT satisfiability problems, achieving 100% coverage and closer counterfactuals than prior work.

ABSTRACT

Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed optimization-based methods to generate nearest counterfactual explanations. However, these methods are often restricted to a particular subset of models (e.g., decision trees or linear models) and differentiable distance functions. In contrast, we build on standard theory and tools from formal verification and propose a novel algorithm that solves a sequence of satisfiability problems, where both the distance function (objective) and predictive model (constraints) are represented as logic formulae. As shown by our experiments on real-world data, our algorithm is: i) model-agnostic ({non-}linear, {non-}differentiable, {non-}convex); ii) data-type-agnostic (heterogeneous features); iii) distance-agnostic ($\ell_0, \ell_1, \ell_\infty$, and combinations thereof); iv) able to generate plausible and diverse counterfactuals for any sample (i.e., 100% coverage); and v) at provably optimal distances.

Motivation & Objective

Motivate the need for explanations in consequential decisions (e.g., bail, loans) and the limitations of existing methods.
Propose MACE, a model-agnostic approach to generate nearest counterfactuals with guarantees.
Handle heterogeneous feature spaces and arbitrary distance measures while ensuring plausibility and diversity.
Provide a scalable workflow that yields counterfactuals at provably optimal distances under SMT formulations.
Demonstrate practical effectiveness on real-world datasets with full coverage guarantees.

Proposed method

Represent the nearest counterfactual problem as a sequence of satisfiability problems by encoding the model, distance, plausibility, and diversity constraints as logic formulas.
Use characteristic formulas of models (constructed from program representations) to capture f(x)=y for different model classes (e.g., decision trees, MLPs).
Express distance functions and plausibility/diversity constraints as SMT-friendly programs, enabling model- and distance-agnostic optimization.
Apply a binary search over a distance threshold delta to approximate the nearest counterfactual with controllable accuracy, leveraging an SMT oracle for feasibility checks.
Allow diversity by adding constraints that enforce a minimum distance between multiple counterfactuals.
Support heterogeneous data types (numerical, categorical, ordinal) with a distance metric that normalizes per-feature changes and combines L0, L1, and L∞ norms.
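
The search loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: a brute-force enumeration over a small integer grid stands in for the SMT solver, and the names `distance`, `oracle`, `nearest_counterfactual`, `approve`, `avoid`, and `min_gap` are all hypothetical. The per-feature normalization (dividing the L0 and L1 terms by the number of features) is likewise an assumption made so that distances stay in [0, 1].

```python
from itertools import product

def distance(x, x_ref, ranges, weights=(0.0, 1.0, 0.0)):
    """Weighted mix of L0, L1 and L-infinity over range-normalized feature changes."""
    deltas = [abs(a - b) / r for a, b, r in zip(x, x_ref, ranges)]
    alpha, beta, gamma = weights
    n = len(deltas)
    return (alpha * sum(d > 0 for d in deltas) / n
            + beta * sum(deltas) / n
            + gamma * max(deltas))

def oracle(model, x_ref, domains, ranges, delta, avoid=(), min_gap=0.0):
    """Brute-force stand-in for one SMT query (assumption: tiny discrete domains).

    Returns any point whose prediction differs from model(x_ref), that lies
    within distance delta of x_ref, and that stays at least min_gap away from
    every point in avoid. Returning None plays the role of UNSAT.
    """
    for point in product(*domains):
        if (model(point) != model(x_ref)
                and distance(point, x_ref, ranges) <= delta
                and all(distance(point, q, ranges) >= min_gap for q in avoid)):
            return point
    return None

def nearest_counterfactual(model, x_ref, domains, ranges,
                           eps=1e-3, avoid=(), min_gap=0.0):
    """Binary search on the distance threshold, accurate to within eps."""
    lo, hi = 0.0, 1.0  # distances lie in [0, 1] when the weights sum to 1
    best = oracle(model, x_ref, domains, ranges, hi, avoid, min_gap)
    if best is None:
        return None  # no counterfactual exists in the search space
    hi = distance(best, x_ref, ranges)
    while hi - lo > eps:
        mid = (lo + hi) / 2
        witness = oracle(model, x_ref, domains, ranges, mid, avoid, min_gap)
        if witness is None:
            lo = mid  # UNSAT: nothing this close, raise the lower bound
        else:
            # SAT: keep the witness and tighten the upper bound to its distance
            best, hi = witness, distance(witness, x_ref, ranges)
    return best

# Toy classifier (hypothetical): approve when 2 * x[0] + x[1] >= 10.
def approve(x):
    return 2 * x[0] + x[1] >= 10

domains = [range(11), range(11)]  # two integer features, each in 0..10
ranges = [10, 10]                 # feature ranges used for normalization
x_ref = (2, 3)                    # a rejected individual

first = nearest_counterfactual(approve, x_ref, domains, ranges)
# A diverse alternative: keep a minimum distance from the first answer.
second = nearest_counterfactual(approve, x_ref, domains, ranges,
                                avoid=[first], min_gap=0.08)
```

Each oracle call answers "does a counterfactual exist within this distance?", so the loop halves the gap between the largest infeasible threshold and the smallest feasible one until it drops below `eps`. Swapping the brute-force `oracle` for a real SMT query would leave the search loop unchanged.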

Experimental results

Research questions

RQ1: How can counterfactual explanations be generated in a model-agnostic way for consequential decisions?
RQ2: Can the nearest counterfactual under general distance measures be found with guarantees of optimality using SMT solvers?
RQ3: How do plausibility and diversity constraints affect the quality and feasibility of counterfactuals?
RQ4: How does MACE perform on real-world, heterogeneous datasets compared to existing approaches in terms of coverage and distance?
RQ5: What are the trade-offs between actionability constraints (plausibility) and the resulting counterfactual distance?

Key findings

MACE achieves 100% coverage by design for the evaluated instances.
MACE yields significantly closer counterfactuals than prior approaches across datasets (Adult, Credit, COMPAS).
Average distance reductions when compared to baselines reach up to 70.2% (Adult), 75.4% (Credit), and 21.1% (COMPAS).
MACE supports heterogeneous feature spaces and arbitrary distance combinations (L0, L1, L∞) with optimal-distance guarantees.
Introducing plausibility constraints (e.g., keeping immutable features such as age fixed) increases distance, especially under L1 and L∞ norms, but the resulting explanations remain feasible and policy-compliant.
Adding diversity constraints lets MACE generate multiple distinct counterfactuals for the same individual.


This review was created by AI and reviewed by human editors.