QUICK REVIEW

[Paper Review] Learning Neural PDE Solvers with Convergence Guarantees

Jun-Ting Hsieh, Shengjia Zhao|arXiv (Cornell University)|Jun 4, 2019

Model Reduction and Neural Networks|56 citations

TL;DR

The paper trains a neural network to modify updates of an existing linear PDE solver, preserving convergence guarantees while achieving faster convergence and generalization across geometries and boundary conditions.

ABSTRACT

Partial differential equations (PDEs) are widely used across the physical and computational sciences. Decades of research and engineering went into designing fast iterative solution methods. Existing solvers are general purpose, but may be sub-optimal for specific classes of problems. In contrast to existing hand-crafted solutions, we propose an approach to learn a fast iterative solver tailored to a specific domain. We achieve this goal by learning to modify the updates of an existing solver using a deep neural network. Crucially, our approach is proven to preserve strong correctness and convergence guarantees. After training on a single geometry, our model generalizes to a wide variety of geometries and boundary conditions, and achieves 2-3 times speedup compared to state-of-the-art solvers.

Motivation & Objective

Build faster, domain-specific PDE solvers by learning to modify the updates of existing linear solvers while preserving correctness.
Guarantee convergence to the true PDE solution through fixed-point preservation.
Demonstrate generalization to unseen geometries, boundary conditions, and grid sizes despite training on a single instance.

Proposed method

Represent the learned solver as a parametric update to a baseline iterator: u' = Ψ(u; G, f, b, n) + G H (Ψ(u; G, f, b, n) - u), where H is a learned linear operator implemented as a convolutional network.
Use a linear (Jacobi-like) baseline Ψ and ensure by construction that its fixed points remain fixed points of the learned iterator (Lemma 1).
Parameterize H with linear deep networks (Conv or U-Net architectures) to approximate T(I − T)^{-1}, where T is the linear part of the baseline update Ψ(u) = Tu + c, accelerating convergence (Theorem 2 and interpretation in Section 3.3).
Train on a single geometry/problem instance yet evaluate generalization across different geometries and boundary conditions (Proposition 2).
Provide two model families: Conv models (3×3 convolutions) and U-Net-based Multigrid models to capture local and multi-scale corrections.
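
The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes a Jacobi baseline for the 2D Poisson equation on a uniform grid, a binary mask G that is 1 on interior cells and 0 on boundary cells, and a single hand-picked, untrained 3×3 kernel standing in for H. The helper names (jacobi_step, reset, psi, conv3x3, phi) and the test problem are hypothetical.

```python
import numpy as np

def jacobi_step(u, f, h=1.0):
    """One Jacobi update for the discrete 2D Poisson equation (5-point stencil)."""
    p = np.pad(u, 1)  # zero padding only influences boundary cells, which are reset below
    neighbours = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    return 0.25 * (neighbours - h * h * f)

def reset(u, G, b):
    """Keep interior cells (G == 1) and overwrite boundary cells (G == 0) with b."""
    return G * u + (1 - G) * b

def psi(u, G, f, b):
    """Baseline iterator: a Jacobi step followed by the boundary reset."""
    return reset(jacobi_step(u, f), G, b)

def conv3x3(w, kernel):
    """Zero-padded, bias-free 3x3 correlation: a linear stand-in for one layer of H."""
    p = np.pad(w, 1)
    out = np.zeros_like(w)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * p[i:i + w.shape[0], j:j + w.shape[1]]
    return out

def phi(u, G, f, b, kernel):
    """Modified iterator: u' = psi(u) + G * H(psi(u) - u)."""
    base = psi(u, G, f, b)
    return base + G * conv3x3(base - u, kernel)

# Hypothetical test problem: Laplace equation (f = 0) on a 16x16 square.
# u_star = x + y is linear, so the 5-point stencil reproduces it exactly.
n = 16
ys, xs = np.mgrid[0:n, 0:n].astype(float)
u_star = xs + ys
G = np.zeros((n, n))
G[1:-1, 1:-1] = 1.0
b = (1 - G) * u_star
f = np.zeros((n, n))
kernel = 0.1 * np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])  # arbitrary, not trained

# The exact solution is a fixed point of phi for any kernel, since psi(u_star) - u_star = 0.
fixed_point_gap = np.abs(phi(u_star, G, f, b, kernel) - u_star).max()

# Run both iterators from the same start and compare the remaining error.
u_psi = reset(np.zeros((n, n)), G, b)
u_phi = u_psi.copy()
for _ in range(200):
    u_psi = psi(u_psi, G, f, b)
    u_phi = phi(u_phi, G, f, b, kernel)
err_psi = np.linalg.norm(u_psi - u_star)
err_phi = np.linalg.norm(u_phi - u_star)
print(fixed_point_gap, err_psi, err_phi)
```

Because the correction term is multiplied by Ψ(u) − u, it vanishes at any fixed point of Ψ regardless of the kernel values. Each Φ_H step costs an extra convolution on top of the Jacobi step, so a lower error per iteration in this toy run is not by itself a cost saving.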

Experimental results

Research questions

RQ1: Can a learned linear correction to a standard iterative PDE solver preserve convergence to the correct fixed point?
RQ2: Does the learned correction accelerate convergence across varying geometries, boundary conditions, and grid sizes not seen during training?
RQ3: How well do convolutional and U‑Net based linear networks approximate the optimal operator to accelerate convergence?
RQ4: What theoretical guarantees exist for the learned solver's convergence and generalization properties?

Key findings

The learned iterator Φ_H preserves the fixed point of the baseline Ψ, ensuring correctness (Lemma 1).
The spectral norm of Φ_H is a convex function of H, and the set of H for which this spectral norm is below 1 is a convex open set (Theorem 2).
Training on a single domain can yield convergence and speedups on unseen geometries and grid sizes (Proposition 2).
Empirical results show significant speedups over Jacobi and Multigrid baselines across square, L-shape, and cylinder domains, as well as on the square-domain Poisson problem with f ≠ 0.
On CPU, the Conv3 model needs 0.219–0.220 times the layers and 0.424–0.426 times the multiply-add operations of Jacobi across settings, i.e. roughly 4.5× fewer layers and 2.4× fewer operations.
U‑Net models outperform the corresponding Multigrid baselines in all tested settings, with additional GPU acceleration yielding up to ~30× speedups over CPU baselines.
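
The theoretical findings above can be checked numerically on a small dense problem. The sketch below is an illustration under simplifying assumptions, not code from the paper: the baseline is a generic linear iteration Ψ(u) = Tu + c with a randomly generated symmetric T of spectral radius 0.9, the mask G is taken to be the identity, and H is an unconstrained dense matrix rather than a convolutional network. All names are hypothetical. Under these assumptions Φ_H(u) = Mu + const with M = T + H(T − I), which is affine in H, so the spectral norm of M is convex in H; and choosing H = T(I − T)^{-1} gives M = 0, so the iteration reaches the fixed point in one step.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Hypothetical baseline: symmetric T scaled to spectral radius 0.9, so u' = T u + c converges.
A = rng.standard_normal((n, n))
S = (A + A.T) / 2
T = 0.9 * S / np.abs(np.linalg.eigvalsh(S)).max()
c = rng.standard_normal(n)
I = np.eye(n)
u_star = np.linalg.solve(I - T, c)  # unique fixed point of the baseline

def phi(u, H):
    """Phi_H(u) = Psi(u) + H (Psi(u) - u), with Psi(u) = T u + c and G = I (assumption)."""
    base = T @ u + c
    return base + H @ (base - u)

def iteration_matrix(H):
    """Linear part of Phi_H: M = T + H (T - I), affine in H."""
    return T + H @ (T - I)

def spectral_norm(M):
    """Largest singular value of M."""
    return np.linalg.norm(M, 2)

# Fixed-point preservation: u_star stays fixed for an arbitrary H.
H1 = 0.1 * rng.standard_normal((n, n))
H2 = 0.1 * rng.standard_normal((n, n))
lemma_gap = np.abs(phi(u_star, H1) - u_star).max()

# Convexity of the spectral norm in H: midpoint value vs. average of endpoint values.
mid_norm = spectral_norm(iteration_matrix(0.5 * (H1 + H2)))
avg_norm = 0.5 * (spectral_norm(iteration_matrix(H1)) + spectral_norm(iteration_matrix(H2)))

# Idealized H = T (I - T)^{-1}: the iteration matrix vanishes and one step suffices.
H_opt = T @ np.linalg.inv(I - T)
u0 = rng.standard_normal(n)
one_step_gap = np.abs(phi(u0, H_opt) - u_star).max()

print(lemma_gap, mid_norm, avg_norm, one_step_gap)
```

In practice T(I − T)^{-1} is dense and as expensive to apply as solving the system directly, which is why a cheap convolutional or U-Net approximation of it is the object being learned.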

This review was created by AI and reviewed by human editors.