QUICK REVIEW

[Paper Review] Deep Multiscale Model Learning

Yating Wang, Siu Wun Cheung | arXiv (Cornell University) | Jun 13, 2018

Model Reduction and Neural Networks | 5 references | 18 citations

TL;DR

This paper proposes Deep Multiscale Model Learning (DMML), a framework that integrates non-local multi-continuum (NLMC) multiscale model reduction with deep neural networks to construct data-conditioned coarse-grid models for nonlinear PDEs. Multiscale concepts define the physical degrees of freedom and their connectivity, and the resulting network is trained on a mix of observational and computational data. In the reported numerical tests, networks trained on mixed data predict the observed solutions far more accurately than networks trained on simulation data alone, even when observational data are limited.

ABSTRACT

The objective of this paper is to design novel multi-layer neural network architectures for multiscale simulations of flows taking into account the observed data and physical modeling concepts. Our approaches use deep learning concepts combined with local multiscale model reduction methodologies to predict flow dynamics. Using reduced-order model concepts is important for constructing robust deep learning architectures since the reduced-order models provide fewer degrees of freedom. Flow dynamics can be thought of as multi-layer networks. More precisely, the solution (e.g., pressures and saturations) at the time instant $n+1$ depends on the solution at the time instant $n$ and input parameters, such as permeability fields, forcing terms, and initial conditions. One can regard the solution as a multi-layer network, where each layer, in general, is a nonlinear forward map and the number of layers relates to the internal time steps. We will rely on rigorous model reduction concepts to define unknowns and connections for each layer. In each layer, our reduced-order models will provide a forward map, which will be modified ("trained") using available data. It is critical to use reduced-order models for this purpose, which will identify the regions of influence and the appropriate number of variables. Because of the lack of available data, the training will be supplemented with computational data as needed and the interpolation between data-rich and data-deficient models. We will also use deep learning algorithms to train the elements of the reduced model discrete system. We will present main ingredients of our approach and numerical results. Numerical results show that using deep learning and multiscale models, we can improve the forward models, which are conditioned to the available data.
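
In symbols, the layered view described in the abstract can be sketched as follows. The names $F^{n}$ (forward map of layer $n$), $I^{n}$ (input parameters such as permeability fields and forcing terms), and $N$ (number of time steps) are notation assumed here for illustration and may differ from the paper's:

$$
u^{n+1} = F^{n}\left(u^{n}, I^{n}\right), \qquad n = 0, \dots, N-1,
$$

$$
u^{N} = F^{N-1}\left(F^{N-2}\left(\cdots F^{0}\left(u^{0}, I^{0}\right) \cdots, I^{N-2}\right), I^{N-1}\right).
$$

Under this reading, each $F^{n}$ starts from the reduced-order forward map and is then adjusted by training against the available data.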

Motivation & Objective

To develop a robust framework for constructing coarse-grid models of nonlinear multiscale PDEs that incorporate observed data and physical constraints.
To address the challenge of limited observational data in multiscale simulations by fusing it with computational data through deep learning.
To improve the accuracy and generalization of upscaled models by embedding multiscale model reduction concepts into deep neural network architectures.
To demonstrate that multiscale model reduction can guide network architecture design, ensuring physical interpretability and reduced degrees of freedom.
To validate that hybrid training with both observational and computational data enhances model performance when observational data is sparse.

Proposed method

The method uses the non-local multi-continuum (NLMC) approach as the foundation for defining coarse-grid degrees of freedom with physical meaning, such as solution averages.
Each time step in the simulation is modeled as a layer in a deep neural network, with the solution at time $n+1$ predicted as a nonlinear map from the solution at time $n$ and input parameters.
The network architecture is informed by multiscale model reduction, ensuring that the connections and unknowns reflect physical influence regions and local heterogeneities.
A hybrid training strategy combines observational data with computational data from simulations with different permeability fields to improve generalization and data efficiency.
Training minimizes a loss that measures the difference between predicted and observed final-time solutions. Separate networks are trained on pure observation data, pure simulation data, and mixed data so that the three can be compared.
The framework uses a multi-layer structure where each layer corresponds to a time step, and the forward map is learned via deep learning while preserving the multiscale structure.
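
The points above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the residual `tanh` update, the Chebyshev-distance neighborhood used as the "influence region", and the names `influence_mask`, `init_layers`, `forward`, and `final_time_loss` are all assumptions made for illustration, and the dependence on input parameters such as permeability is omitted for brevity.

```python
import numpy as np

def influence_mask(nx, ny, radius=1):
    """Boolean (N, N) mask with N = nx * ny coarse blocks.

    Entry (i, j) is True when block j lies within `radius` blocks of
    block i (Chebyshev distance). This stands in for the region of
    influence that a multiscale method would supply (an assumption).
    """
    ix, iy = np.divmod(np.arange(nx * ny), ny)
    dx = np.abs(ix[:, None] - ix[None, :])
    dy = np.abs(iy[:, None] - iy[None, :])
    return np.maximum(dx, dy) <= radius

def init_layers(n_dof, n_steps, mask, seed=0):
    """Create one (weights, bias) pair per time step, sparsified by the mask."""
    rng = np.random.default_rng(seed)
    layers = []
    for _ in range(n_steps):
        w = rng.normal(scale=0.1, size=(n_dof, n_dof)) * mask
        b = np.zeros(n_dof)
        layers.append((w, b))
    return layers

def forward(u0, layers, mask):
    """Apply one masked nonlinear layer per time step.

    Hypothetical update rule: u^{n+1} = u^n + tanh(W_n u^n + b_n),
    where W_n is only nonzero inside the influence region.
    """
    u = np.asarray(u0, dtype=float)
    for w, b in layers:
        u = u + np.tanh((w * mask) @ u + b)
    return u

def final_time_loss(u_pred, u_obs):
    """Mean squared mismatch between predicted and observed final-time states."""
    return float(np.mean((np.asarray(u_pred) - np.asarray(u_obs)) ** 2))
```

For example, `forward(u0, init_layers(9, 5, influence_mask(3, 3)), influence_mask(3, 3))` propagates a coarse state on a 3 x 3 grid through five layers, one per time step. In a hybrid training setup, `final_time_loss` would be evaluated on both observed and simulated final-time samples.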

Experimental results

Research questions

RQ1: Can deep learning be effectively combined with multiscale model reduction to create accurate, data-conditioned coarse-grid models for nonlinear PDEs?
RQ2: How does the inclusion of both observational and computational data affect the performance and generalization of the learned model?
RQ3: Can multiscale concepts such as influence regions and reduced degrees of freedom improve the architecture and training of deep neural networks in multiscale simulations?
RQ4: What is the impact of data scarcity on model accuracy, and can hybrid data training mitigate this issue?
RQ5: To what extent can the learned deep network preserve physical consistency while achieving high predictive accuracy?

Key findings

In Example 1, the mixture-data-driven network achieved a mean error of 5.6% on final-time predictions, outperforming the pure-simulation network (52.3% error), demonstrating the benefit of incorporating observational data.
In Example 2, the mixture-data-driven network achieved a mean error of 8.5%, compared with 2.6% for the pure-observation network.
In Example 3, where the permeability contrast was high and the data were highly dissimilar, the network trained on mixed data ($\mathcal{N}_m$) reached a mean error of 8.8%, far below the 64.3% of the pure-simulation network, which supports the value of hybrid data training.
The solution comparison in Figure 10 confirmed that the mixture-trained network ($\mathcal{N}_m$) produced reliable and physically plausible predictions across testing samples.
The results show that using coarse degrees of freedom derived from multiscale model reduction improves model robustness and reduces overfitting, especially in data-scarce regimes.
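
The findings above are stated as mean errors in percent. The review does not say which norm is used, so the sketch below assumes a per-sample relative L2 error averaged over the test samples; the function name `mean_relative_error` is hypothetical.

```python
import numpy as np

def mean_relative_error(pred, obs):
    """Mean over samples of ||pred_k - obs_k||_2 / ||obs_k||_2, in percent.

    `pred` and `obs` are arrays of shape (n_samples, n_dof). The choice
    of the L2 norm is an assumption, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    num = np.linalg.norm(pred - obs, axis=1)
    den = np.linalg.norm(obs, axis=1)
    return 100.0 * float(np.mean(num / den))
```

With two samples whose relative errors are 0 and 0.2, the function returns their mean expressed in percent, that is, 10.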

This review was created by AI and reviewed by human editors.