QUICK REVIEW

[Paper Review] Learning high-dimensional DAGs with latent and selection variables

Diego Colombo, Marloes H. Maathuis|arXiv (Cornell University)|Apr 29, 2011

Bayesian Modeling and Causal Inference | 3 references | 5 citations

TL;DR

This paper proposes Adaptive Anytime FCI (AAFCI) and Really Fast Causal Inference (RFCI), two algorithms for causal discovery in high-dimensional directed acyclic graphs (DAGs) with latent and selection variables. AAFCI sets the conditioning-set cutoff of Anytime FCI to the largest conditioning set used while learning the initial skeleton in FCI. RFCI has properties similar to AAFCI's but is much faster on large sparse graphs, and the causal interpretation of tails and arrowheads in the output remains sound.

ABSTRACT

We consider the problem of learning causal information between random variables in directed acyclic graphs (DAGs) when allowing arbitrarily many latent and selection variables. The Fast Causal Inference algorithm (FCI) (Spirtes et al., 1999) has been explicitly designed to infer conditional independence and causal information in such settings. Despite its name, FCI is computationally very intensive for large graphs. Spirtes (2001) introduced a modified version of FCI, called Anytime FCI, which only performs conditional independence tests up to a pre-specified cutoff k. Anytime FCI is typically faster but less informative than FCI, but the causal interpretation of tails and arrowheads in its output is still sound. We propose an adaptation of Anytime FCI, called Adaptive Anytime FCI (AAFCI), where the cut-off k is set to the maximum size of the conditioning sets used to find the initial skeleton in FCI. Moreover, we propose a new algorithm, called Really Fast Causal Inference (RFCI), which has similar properties as AAFCI but is much faster for large sparse graphs. The complete paper is available at http://arxiv.org/abs/1104.5617.

Motivation & Objective

To address the computational inefficiency of FCI in large graphs with latent and selection variables.
To develop a faster alternative to FCI that maintains sound causal interpretations despite reduced conditioning set depth.
To improve scalability for high-dimensional sparse graphs while preserving the ability to infer conditional independence and causal structure.
To introduce a dynamic cutoff mechanism in Anytime FCI that adapts to the data-driven complexity of the initial skeleton.

Proposed method

Adaptive Anytime FCI (AAFCI) sets the cutoff k to the maximum size of conditioning sets used in the initial skeleton discovery phase of FCI.
The algorithm uses conditional independence tests up to this adaptive k to maintain accuracy while reducing computational load.
Really Fast Causal Inference (RFCI) is a separate new algorithm with properties similar to AAFCI's that is much faster on large sparse graphs.
As with Anytime FCI, the output may be less informative than FCI's, but the causal interpretation of tails and arrowheads in the output graph remains sound.
Both algorithms rely on the same underlying conditional independence testing framework as FCI but limit the depth of tests to improve runtime.
The methods are designed to scale to high-dimensional settings by reducing the number of conditional independence tests required.
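The adaptive cutoff described above can be illustrated with a minimal sketch of a PC-style skeleton search that records the largest conditioning-set size it actually tests. This is not the authors' implementation: the function name `skeleton_with_adaptive_cutoff`, the toy oracle `chain_oracle`, and the use of a perfect independence oracle instead of a statistical test are all assumptions made for illustration. The sketch covers only the initial skeleton phase, not the later FCI stages or orientation rules.

```python
from itertools import combinations


def skeleton_with_adaptive_cutoff(nodes, indep):
    """PC-style skeleton search (illustrative sketch, not the paper's code).

    nodes: list of variable names.
    indep: hypothetical oracle, indep(x, y, S) -> True if x and y are
           conditionally independent given the frozenset S.
    Returns (adjacency, sepsets, k), where k is the largest
    conditioning-set size for which any test was performed.
    """
    # Start from the complete undirected graph.
    adj = {v: set(nodes) - {v} for v in nodes}
    sepsets = {}
    k = 0
    level = 0
    while True:
        tested_any = False
        for x in nodes:
            # sorted() takes a snapshot, so removing edges below is safe.
            for y in sorted(adj[x]):
                candidates = sorted(adj[x] - {y})
                if len(candidates) < level:
                    continue  # not enough neighbours for this level
                for S in combinations(candidates, level):
                    tested_any = True
                    if indep(x, y, frozenset(S)):
                        # Remove the edge and remember the separating set.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepsets[frozenset((x, y))] = frozenset(S)
                        break
        if not tested_any:
            break  # no pair had enough neighbours: the search is done
        k = level
        level += 1
    return adj, sepsets, k


def chain_oracle(x, y, S):
    """Toy oracle for the chain A -> B -> C: only A and C are separable, by B."""
    return {x, y} == {"A", "C"} and "B" in S


adj, sepsets, k = skeleton_with_adaptive_cutoff(["A", "B", "C"], chain_oracle)
print(k)        # largest conditioning-set size tested
print(sepsets)  # separating set found for the removed edge A-C
```

In this toy run the edge between A and C is removed at conditioning-set size 1, and no pair has enough remaining neighbours for size 2, so the recorded value is k = 1. Under the summary above, a value obtained this way would then serve as the cutoff for the remaining conditional independence tests.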

Experimental results

Research questions

RQ1: Can the computational cost of FCI be reduced without sacrificing the validity of causal interpretations in the presence of latent and selection variables?
RQ2: Is it possible to dynamically set the conditioning set cutoff in Anytime FCI based on the data structure to improve efficiency and accuracy?
RQ3: How can the efficiency of causal discovery be enhanced for large sparse graphs while preserving soundness of the output graph?
RQ4: Can a faster algorithm be designed that maintains the same causal interpretation guarantees as FCI and Anytime FCI?

Key findings

RFCI achieves significantly faster runtime than FCI and AAFCI on large sparse graphs, making it practical for high-dimensional settings.
The adaptive cutoff in AAFCI improves efficiency by aligning the test depth with the actual complexity of the skeleton learning phase.
Both AAFCI and RFCI preserve the sound causal interpretation of tails and arrowheads in the output graph, ensuring validity.
The proposed algorithms keep the causal interpretation of the output sound despite the reduced test depth, although, as with Anytime FCI, the output may be less informative than FCI's.
RFCI is shown to be particularly effective in large sparse graphs, where it outperforms existing methods in speed while retaining correctness.
The cutoff in AAFCI is derived from the skeleton learning phase rather than pre-specified as in Anytime FCI, so it adapts to the structure found in the data.
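A rough count shows why limiting conditioning-set size reduces the number of tests. The symbol m is an assumption introduced here for illustration, not notation from the paper: suppose a pair of variables has m candidate conditioning variables. Testing every subset requires up to 2^m tests for that pair, whereas stopping at size k requires at most:

```latex
\[
  N(m, k) \;=\; \sum_{i=0}^{k} \binom{m}{i} \;\le\; 2^{m}
\]
```

For example, with m = 20 and k = 3 this gives 1 + 20 + 190 + 1140 = 1351 conditioning sets instead of 2^20 = 1,048,576. The count is a per-pair upper bound under these assumptions and says nothing about the runtimes reported in the paper.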

This review was created by AI and reviewed by human editors.