[Paper Review] The Clever Hans Mirage: A Comprehensive Survey on Spurious Correlations in Machine Learning
This survey formally defines spurious correlations in ML, presents a taxonomy of methods for mitigating them, and discusses datasets, metrics, and future challenges.
Back in the early 20th century, a horse named Hans appeared to perform arithmetic and other intellectual tasks during exhibitions in Germany, while it actually relied solely on involuntary cues in the body language from the human trainer. Modern machine learning models are no different. These models are known to be sensitive to spurious correlations between non-essential features of the inputs (e.g., background, texture, and secondary objects) and the corresponding labels. Such features and their correlations with the labels are known as "spurious" because they tend to change with shifts in real-world data distributions, which can negatively impact the model's generalization and robustness. In this paper, we provide a comprehensive survey of this emerging issue, along with a fine-grained taxonomy of existing state-of-the-art methods for addressing spurious correlations in machine learning models. Additionally, we summarize existing datasets, benchmarks, and metrics to facilitate future research. The paper concludes with a discussion of the broader impacts, the recent advancements, and future challenges in the era of generative AI, aiming to provide valuable insights for researchers in the related domains of the machine learning community.
Motivation & Objective
- Provide a formal definition of spurious correlations in ML.
- Offer a comprehensive taxonomy of state-of-the-art mitigation methods.
- Summarize datasets, benchmarks, and evaluation metrics for spurious correlations.
- Discuss challenges, future directions, and the role of foundation models in this area.
Proposed method
- Introduce a formal definition of spurious correlations with group labels (y, a) and group set G = Y × A.
- Classify mitigation methods into data manipulation, representation learning, learning strategy, and other methods.
- Survey data augmentation, concept/pseudo-label discovery, causal intervention, invariant learning, feature disentanglement, and contrastive learning.
- Discuss optimization-based methods, ensemble learning, identification-then-mitigation, fine-tuning strategies, and adversarial training.
- Provide an overview of datasets and metrics, emphasizing worst-group accuracy as a robustness measure.
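As a rough illustration of the group formulation and the worst-group accuracy metric mentioned above, the following minimal Python sketch computes per-group, average, and worst-group accuracy. The function names and the toy data are hypothetical and are not taken from the survey; the only assumption is that each example carries a class label y and a spurious attribute a, so that its group is the pair (y, a).

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, attrs):
    """Accuracy per group g = (y, a), with a the spurious attribute (assumed given)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, a in zip(y_true, y_pred, attrs):
        g = (y, a)
        total[g] += 1
        correct[g] += int(y == p)
    return {g: correct[g] / total[g] for g in total}


def worst_group_accuracy(y_true, y_pred, attrs):
    """Minimum accuracy over all groups that appear in the data."""
    return min(group_accuracies(y_true, y_pred, attrs).values())


def average_accuracy(y_true, y_pred):
    """Plain accuracy over all examples, ignoring groups."""
    return sum(int(y == p) for y, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: the attribute agrees with the label on 8 of 10 examples.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
attrs = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
# A "shortcut" model that predicts the spurious attribute instead of the label.
y_pred = list(attrs)

print(average_accuracy(y_true, y_pred))             # 0.8
print(worst_group_accuracy(y_true, y_pred, attrs))  # 0.0
```

In this toy case the shortcut model looks acceptable on average but fails completely on the two minority groups where the attribute and the label disagree, which is the gap that worst-group accuracy is meant to expose.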

Experimental results
Research questions
- RQ1: What is the formal definition of spurious correlations in ML and how can they be detected and characterized?
- RQ2: What taxonomy best organizes current methods to mitigate spurious correlations across data manipulation, representation learning, and learning strategies?
- RQ3: What datasets and metrics are used to evaluate robustness to spurious correlations, and what are their trade-offs?
- RQ4: What are the key challenges and future directions, including group-label dependencies and foundation models, in mitigating spurious correlations?
Key findings
- A formal definition of spurious correlations is provided, including a mapping from spurious attributes to groups.
- A comprehensive taxonomy of mitigation approaches is proposed, spanning data manipulation, representation learning, learning strategy, and other methods.
- The survey summarizes common datasets and metrics used to evaluate worst-group performance and related measures.
- It discusses the trade-offs between worst-group and average accuracy and highlights challenges such as group-label dependence and scalability.
- Foundational discussions connect spurious correlations to domain generalization, invariant learning, group robustness, and shortcut learning.
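To make the group-label dependence noted above more concrete, here is a minimal Python sketch of inverse-group-frequency sample weighting. It is one simple, generic instance of a reweighting idea and is not presented as a method the survey specifies; the function name and the toy inputs are hypothetical. Note that it cannot be computed without the spurious attribute for every training example, which is the kind of annotation requirement the findings flag as a challenge.

```python
from collections import Counter


def group_balanced_weights(labels, attrs):
    """Per-example weights so that every group (y, a) carries equal total weight.

    Assumes the spurious attribute of each example is known.
    """
    groups = list(zip(labels, attrs))
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    # weight = n / (k * |g|): weights sum to n, and each group totals n / k.
    return [n / (k * counts[g]) for g in groups]


# Hypothetical toy input: group (0, 0) has two examples, the other two groups one each.
weights = group_balanced_weights([0, 0, 0, 1], [0, 0, 1, 1])
print(weights)
```

Examples from small groups receive larger weights, which in a weighted loss tends to favor minority groups at some cost to average accuracy; this sketch only illustrates the trade-off and does not quantify it.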

This review was created by AI and reviewed by human editors.