QUICK REVIEW

[Paper Review] Inference for multiple heterogeneous networks with a common invariant subspace

Jesús Arroyo, Avanti Athreya|PubMed|Jun 24, 2019

Functional Brain Connectivity Studies|62 references|42 citations

TL;DR

This paper introduces COSIE, a flexible model for multiple aligned graphs with a shared invariant subspace, and the MASE spectral embedding to jointly estimate common structure and graph-specific parameters, enabling scalable inference across heterogeneous networks.

ABSTRACT

The development of models and methodology for the analysis of data from multiple heterogeneous networks is of importance both in statistical network theory and across a wide spectrum of application domains. Although single-graph analysis is well-studied, multiple graph inference is largely unexplored, in part because of the challenges inherent in appropriately modeling graph differences and yet retaining sufficient model simplicity to render estimation feasible. This paper addresses exactly this gap, by introducing a new model, the common subspace independent-edge multiple random graph model, which describes a heterogeneous collection of networks with a shared latent structure on the vertices but potentially different connectivity patterns for each graph. The model encompasses many popular network representations, including the stochastic blockmodel. The model is both flexible enough to meaningfully account for important graph differences, and tractable enough to allow for accurate inference in multiple networks. In particular, a joint spectral embedding of adjacency matrices, the multiple adjacency spectral embedding, leads to simultaneous consistent estimation of underlying parameters for each graph. Under mild additional assumptions, the estimates satisfy asymptotic normality and yield improvements for graph eigenvalue estimation. In both simulated and real data, the model and the embedding can be deployed for a number of subsequent network inference tasks, including dimensionality reduction, classification, hypothesis testing, and community detection. Specifically, when the embedding is applied to a data set of connectomes constructed through diffusion magnetic resonance imaging, the result is an accurate classification of brain scans by human subject and a meaningful determination of heterogeneity across scans of different individuals.

Motivation & Objective

Motivate and develop a semiparametric model for a collection of aligned graphs that captures shared latent structure while allowing graph-specific heterogeneity.
Introduce a practical, scalable estimation procedure that leverages a common subspace across graphs to estimate both shared and graph-specific parameters.
Demonstrate theoretical properties of the estimator, including consistency and asymptotic normality under mild sparsity assumptions.
Show how the COSIE framework supports downstream tasks such as dimensionality reduction, classification, hypothesis testing, and community detection.
Validate the approach on simulated data and real connectome data to illustrate gains in inference and classification performance.

Proposed method

Define the COSIE model, in which each adjacency matrix A^(i) has expectation V R^(i) V^T, where V is a common n x d matrix with orthonormal columns (the shared invariant subspace) and R^(i) is a graph-specific d x d score matrix.
Use adjacency spectral embedding to obtain \hat{X}^(i) for each graph, then construct the multiple adjacency spectral embedding (MASE) by stacking these vertex embeddings side by side and taking the top d left singular vectors as the estimate \hat{V} of V.
Estimate R^(i) by least squares given \hat{V}, which gives \hat{R}^(i) = \hat{V}^T A^(i) \hat{V}.
Prove consistency for the common subspace V up to orthogonal transformation and establish asymptotic normality for the R^(i) under delocalization and mild sparsity conditions.
Provide a theoretical bound showing E[min_{W in O_d} ||\hat{V} - V W||_F] <= C(1/sqrt(nm) + 1/n), and discuss bias terms decaying with m.
Discuss identifiability results and how R^(i) and V are identifiable up to orthogonal transformations, with implications for downstream inference.
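The estimation steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names `ase`, `mase`, and `sample_graph` are hypothetical, the choice to scale each embedding by the square root of the absolute eigenvalues is an assumption (the review does not say whether scaled or unscaled embeddings are stacked), and the two-block simulation settings are invented for the demo.

```python
import numpy as np

def ase(A, d):
    """Adjacency spectral embedding: top-d eigenvectors (by |eigenvalue|),
    scaled by sqrt(|eigenvalue|). The scaling is an assumption of this sketch."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def mase(graphs, d):
    """Stack per-graph embeddings column-wise, take the top-d left singular
    vectors as V_hat, then set R_hat^(i) = V_hat^T A^(i) V_hat."""
    X = np.hstack([ase(A, d) for A in graphs])        # n x (m*d)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    V_hat = U[:, :d]
    R_hat = [V_hat.T @ A @ V_hat for A in graphs]
    return V_hat, R_hat

def sample_graph(P, rng):
    """Draw an undirected, loop-free graph with independent edges from P."""
    upper = np.triu(rng.random(P.shape) < P, k=1).astype(float)
    return upper + upper.T

# Hypothetical demo: two-block graphs that share communities but differ in
# connectivity (one assortative, one disassortative).
rng = np.random.default_rng(0)
n, d = 300, 2
Z = np.zeros((n, d))
Z[: n // 2, 0] = 1
Z[n // 2 :, 1] = 1
Bs = [np.array([[0.6, 0.1], [0.1, 0.5]]),
      np.array([[0.2, 0.6], [0.6, 0.2]])]
graphs = [sample_graph(Z @ B @ Z.T, rng) for B in Bs for _ in range(3)]

V_hat, R_hat = mase(graphs, d)

# Compare subspaces through their projection matrices, which removes the
# orthogonal-transformation ambiguity in V.
V_true = Z / np.sqrt(Z.sum(axis=0))
proj_err = np.linalg.norm(V_hat @ V_hat.T - V_true @ V_true.T)
print("projection distance between estimated and true subspace:", proj_err)
```

Because V is only identifiable up to an orthogonal transformation, the sketch compares projection matrices instead of V_hat and V directly; any downstream use of R_hat^(i) inherits the same rotation.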

Experimental results

Research questions

RQ1: Can a simple, flexible multiple-graph model approximate real-world heterogeneous networks while remaining tractable?
RQ2: How can shared subspace structure be leveraged to estimate graph-specific parameters across multiple graphs?
RQ3: What are the statistical properties (consistency, asymptotic normality) of the proposed estimators under mild sparsity?
RQ4: How can the COSIE–MASE framework be used for tasks like community detection, classification, and hypothesis testing on collections of graphs?
RQ5: How does the approach perform on simulated data and real connectome datasets compared to existing methods?

Key findings

MASE yields consistent estimates of the common subspace V and the graph-specific score matrices R^(i).
Under mild sparsity and delocalization, R^(i) estimates are asymptotically normal after Procrustes alignment.
The method provides finite-sample bounds showing the subspace estimation error decreases with both the number of graphs m and graph size n.
COSIE generalizes several existing models, including stochastic blockmodels, and reduces the number of parameters describing m graphs on n vertices from O(mn^2) to O(nd + md^2).
MASE enables effective downstream tasks such as dimensionality reduction, classification, hypothesis testing, and community detection across multiple graphs.
Empirical results on real connectome data demonstrate accurate subject classification and meaningful heterogeneity assessment across subjects.
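To make the stochastic blockmodel claim and the parameter count concrete, the following sketch writes a K-block SBM with membership matrix Z and block matrix B in COSIE form, using V = Z (Z^T Z)^(-1/2) and R = (Z^T Z)^(1/2) B (Z^T Z)^(1/2). The sizes and the entries of B are illustrative assumptions, not values from the paper.

```python
import numpy as np

n, K, m = 12, 3, 4                        # vertices, blocks, graphs (illustrative)
labels = np.repeat(np.arange(K), n // K)  # 4 vertices per block
Z = np.eye(K)[labels]                     # n x K one-hot membership matrix
B = np.array([[0.5, 0.1, 0.2],
              [0.1, 0.4, 0.1],
              [0.2, 0.1, 0.6]])

sizes = Z.sum(axis=0)                     # diagonal of Z^T Z (block sizes)
V = Z / np.sqrt(sizes)                    # Z (Z^T Z)^(-1/2): orthonormal columns
R = np.sqrt(sizes)[:, None] * B * np.sqrt(sizes)[None, :]

P = Z @ B @ Z.T                           # SBM edge-probability matrix
print("V has orthonormal columns:", np.allclose(V.T @ V, np.eye(K)))
print("V R V^T reproduces P:", np.allclose(V @ R @ V.T, P))

# Entry counts: one unconstrained n x n matrix per graph versus a shared V
# plus one d x d score matrix per graph (here d = K).
unconstrained = m * n**2
cosie = n * K + m * K**2
print("unconstrained entries:", unconstrained, "COSIE entries:", cosie)
```

Only R changes from graph to graph in this representation, so a collection of SBMs with common communities and different block matrices B^(i) shares a single V.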

This review was created by AI and reviewed by human editors.