[Paper Review] Optimal Transport for structured data with application on graphs
Introduces the Fused Gromov-Wasserstein (FGW) distance, which compares structured data (graphs) by jointly accounting for features and structure, and demonstrates state-of-the-art graph classification as well as graph barycenter computation. It unifies the Wasserstein and Gromov-Wasserstein distances in a single framework.
This work considers the problem of computing distances between structured objects such as undirected graphs, seen as probability distributions in a specific metric space. We consider a new transportation distance (i.e. that minimizes a total cost of transporting probability masses) that unveils the geometric nature of the structured objects space. Unlike Wasserstein or Gromov-Wasserstein metrics that focus solely and respectively on features (by considering a metric in the feature space) or structure (by seeing structure as a metric space), our new distance exploits jointly both information, and is consequently called Fused Gromov-Wasserstein (FGW). After discussing its properties and computational aspects, we show results on a graph classification task, where our method outperforms both graph kernels and deep graph convolutional networks. Exploiting further on the metric properties of FGW, interesting geometric objects such as Fréchet means or barycenters of graphs are illustrated and discussed in a clustering context.
Motivation & Objective
- Motivate and formalize the problem of comparing structured data (e.g., graphs) as probability measures over feature/structure space.
- Propose the FGW distance that fuses feature similarity and structural similarity in optimal transport.
- Develop algorithms to compute FGW (a conditional gradient (CG) solver with line search for q=2, and block coordinate descent (BCD) for barycenters) and analyze their properties.
- Demonstrate FGW on graph classification benchmarks and unsupervised graph clustering/barycenter applications.
Proposed method
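- Represent graphs as probability measures on the product space of features and structure: μ = Σ_i h_i δ_(x_i, a_i), where h_i is the mass on node i, x_i its structure representation, and a_i its feature.
- Define the set of couplings Π(h, g): joint distributions whose marginals are the node histograms h and g of the two graphs, describing how mass is transported from one graph to the other.
- Introduce the FGW cost E_q, which combines the feature transport cost between nodes of the two graphs with the discrepancy between their intra-graph structure distances, weighted by a trade-off parameter α in [0, 1].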
- Prove FGW interpolates between Wasserstein (α→0) and Gromov-Wasserstein (α→1) and is a metric (q=1) or semi-metric (q>1).
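- Provide optimization procedures: (i) a conditional gradient (CG) algorithm for q=2, where each iteration computes a gradient and solves a linear OT subproblem; (ii) a line search for the step size; (iii) block coordinate descent (BCD) for FGW barycenters, with closed-form updates for the structure matrix C and the feature matrix A.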
- Discuss scalability and potential extensions to deep learning and larger graphs.
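The FGW cost and the gradient used by the CG step can be sketched in NumPy for the q=2 case. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, M is assumed to hold squared Euclidean distances between node features, C1 and C2 are assumed to be symmetric structure matrices (for example shortest-path distances), and the weighting (1 - α) on features and α on structure is chosen to match the interpolation described above (α→0 gives the Wasserstein term, α→1 the Gromov-Wasserstein term).

```python
import numpy as np

def structure_tensor_product(C1, C2, pi):
    """Return the matrix with entries sum_kl (C1[i,k] - C2[j,l])**2 * pi[k,l].

    Expanding the square avoids building the 4-way tensor explicitly.
    """
    p = pi.sum(axis=1)  # first marginal of the coupling
    q = pi.sum(axis=0)  # second marginal of the coupling
    const = ((C1 ** 2) @ p)[:, None] + ((C2 ** 2) @ q)[None, :]
    return const - 2.0 * C1 @ pi @ C2.T

def fgw_objective(M, C1, C2, pi, alpha):
    """FGW cost of a given coupling pi for q=2 (assumed weighting convention)."""
    feature_cost = np.sum(M * pi)
    structure_cost = np.sum(structure_tensor_product(C1, C2, pi) * pi)
    return (1.0 - alpha) * feature_cost + alpha * structure_cost

def fgw_gradient(M, C1, C2, pi, alpha):
    """Gradient of fgw_objective w.r.t. pi; assumes C1 and C2 are symmetric."""
    return (1.0 - alpha) * M + 2.0 * alpha * structure_tensor_product(C1, C2, pi)

# Toy example: a 3-node path graph compared with itself.
C = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])              # shortest-path structure matrix
a = np.array([[0.0], [1.0], [2.0]])          # one scalar feature per node
M = ((a[:, None, :] - a[None, :, :]) ** 2).sum(axis=-1)
h = np.full(3, 1.0 / 3.0)                    # uniform node weights

pi_identity = np.diag(h)      # matches each node with itself
pi_product = np.outer(h, h)   # independent coupling, always feasible

cost_identity = fgw_objective(M, C, C, pi_identity, alpha=0.5)
cost_product = fgw_objective(M, C, C, pi_product, alpha=0.5)
```

For identical graphs the identity coupling has zero cost, while the product coupling has a strictly positive one. In a CG iteration, `fgw_gradient` would serve as the cost matrix of the linear OT subproblem; solving that subproblem requires an OT solver and is left out of this sketch.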
Experimental results
Research questions
- RQ1: Can a distance between structured objects (graphs) that jointly accounts for features and structure be defined via optimal transport?
- RQ2: How does FGW relate to and generalize Wasserstein and Gromov-Wasserstein distances?
- RQ3: Can FGW be efficiently computed for discrete graphs, and can it be used for effective graph classification and clustering?
- RQ4: What are the geometric objects derived from FGW (e.g., barycenters) and how can they be used in clustering and analysis?
Key findings
- FGW provides a distance between structured data that recovers the Wasserstein distance on features as α→0 and the Gromov-Wasserstein distance on structures as α→1, with α interpolating between the two.
- FGW achieves state-of-the-art or competitive accuracy on multiple graph classification benchmarks across vector-valued and discrete attributes, often outperforming kernels and some deep methods.
- FGW supports meaningful graph barycenters (Fréchet means) enabling clustering and revealing representative graphs for clusters.
- FGW is a metric for q=1 (under certain conditions) and a semi-metric for q>1, enabling geometric interpretations (e.g., geodesics, barycenters).
- The paper demonstrates FGW’s utility for both supervised (classification) and unsupervised (clustering/barycenter) tasks on graphs.
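The closed-form barycenter updates mentioned for the BCD procedure can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's code: it takes q=2 with a squared Euclidean feature cost, treats the couplings as already computed (in BCD they would be re-estimated by solving FGW between the current barycenter and each graph), and uses hypothetical function names. Each coupling pi_k has shape (N, n_k), mapping the N barycenter nodes to the n_k nodes of graph k, and each feature matrix B_k has shape (n_k, d).

```python
import numpy as np

def update_structure(p, couplings, structures, lambdas):
    """Structure update: C = sum_k lambda_k * pi_k C_k pi_k^T, divided entrywise by p p^T."""
    acc = sum(lam * pi @ C @ pi.T
              for lam, pi, C in zip(lambdas, couplings, structures))
    return acc / np.outer(p, p)

def update_features(p, couplings, features, lambdas):
    """Feature update: row i of A is the weighted mean of features mapped to node i."""
    acc = sum(lam * pi @ B
              for lam, pi, B in zip(lambdas, couplings, features))
    return acc / p[:, None]

# Sanity check: two copies of the same graph, each coupled to the
# barycenter by the identity coupling diag(p), give back that graph.
p = np.full(3, 1.0 / 3.0)                    # barycenter node weights
C = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
B = np.array([[0.0], [1.0], [2.0]])
couplings = [np.diag(p), np.diag(p)]
lambdas = [0.5, 0.5]                         # barycentric weights, summing to 1

C_bar = update_structure(p, couplings, [C, C], lambdas)
A_bar = update_features(p, couplings, [B, B], lambdas)
```

Both updates are weighted least-squares minimizers with the couplings held fixed, which is why they have closed forms. The division by `p` (or by the outer product of `p` with itself) assumes every barycenter node carries strictly positive mass.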
This review was created by AI and reviewed by human editors.