QUICK REVIEW

[Paper Review] Multiple Physics Pretraining for Physical Surrogate Models

Michael T. McCabe, Bruno Régaldo-Saint Blancard|arXiv (Cornell University)|Oct 4, 2023

Model Reduction and Neural Networks | Cited by 16

One-Sentence Summary

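MPP embeds multiple heterogeneous physical systems into a shared space and trains a single transformer to model their dynamics autoregressively, achieving competitive performance when pretraining physics-based surrogate models and superior performance in low-data transfer. Pretrained MPP models finetuned on new physics outperform both training from scratch and finetuning video foundation models on multi-step prediction.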

ABSTRACT

We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling of spatiotemporal systems with transformers. In MPP, rather than training one model on a specific physical system, we train a backbone model to predict the dynamics of multiple heterogeneous physical systems simultaneously in order to learn features that are broadly useful across systems and facilitate transfer. In order to learn effectively in this setting, we introduce a shared embedding and normalization strategy that projects the fields of multiple systems into a shared embedding space. We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid mechanics-oriented benchmark. We show that a single MPP-pretrained transformer is able to match or outperform task-specific baselines on all pretraining sub-tasks without the need for finetuning. For downstream tasks, we demonstrate that finetuning MPP-trained models results in more accurate predictions across multiple time-steps on systems with previously unseen physical components or higher dimensional systems compared to training from scratch or finetuning pretrained video foundation models. We open-source our code and model weights trained at multiple scales for reproducibility.

Research Motivation and Objectives

Develop a task-agnostic pretraining framework that learns shared physics representations across heterogeneous systems.
Demonstrate that a single pretrained model can match or exceed task-specific baselines on multiple pretraining sub-tasks.
Show transfer benefits to low-data regimes and downstream tasks beyond autoregressive next-frame prediction.
Evaluate the usefulness of pretrained representations for inverse problems and parametric inference in fluid dynamics.
Release code and pretrained models to support broader adoption in the community.

Proposed Method

Embed heterogeneous physical fields into a shared embedding space using reversible instance normalization.
Use an Axial ViT (AViT) transformer backbone with fully axial attention to model spatiotemporal data efficiently.
Train autoregressively to predict the next time step across multiple physics without task-specific finetuning.
Balance multi-task losses with normalized MSE to handle varying scales across systems.
Handle periodic boundaries with modified relative position encodings to preserve locality under wraparound conditions.
Utilize gradient accumulation for stochastic load-balancing across multi-resolution, multi-physics batches.
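
The normalization, loss, and periodic-boundary bullets above can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. The function names (`revin_normalize`, `revin_denormalize`, `normalized_mse`, `periodic_relative_offset`), the `(time, channel, height, width)` layout, the choice of per-channel statistics taken over time and space, the normalization of the MSE by the target's mean square, and the `EPS` value are all assumptions made for this example.

```python
import numpy as np

EPS = 1e-7  # assumed small constant that guards against division by zero


def revin_normalize(x):
    """Normalize each field (channel) of one sample by its own statistics.

    x: array of shape (T, C, H, W), a short history of C physical fields.
    Returns the normalized history plus the statistics needed to invert it.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)  # per-channel mean, shape (1, C, 1, 1)
    std = x.std(axis=(0, 2, 3), keepdims=True)    # per-channel std, shape (1, C, 1, 1)
    return (x - mean) / (std + EPS), mean, std


def revin_denormalize(y, mean, std):
    """Map a single-frame prediction of shape (C, H, W) back to physical units."""
    return y * (std[0] + EPS) + mean[0]


def normalized_mse(pred, target):
    """Per-channel MSE divided by the target's mean square, averaged over channels.

    Dividing by the target scale keeps systems with large field magnitudes
    from dominating a multi-task loss.
    """
    mse = ((pred - target) ** 2).mean(axis=(1, 2))
    scale = (target ** 2).mean(axis=(1, 2))
    return float((mse / (scale + EPS)).mean())


def periodic_relative_offset(i, j, n):
    """Signed offset from position i to j on a periodic axis of length n.

    The result lies in [-n // 2, n - n // 2 - 1], so points on opposite
    edges of the domain are treated as neighbours.
    """
    return (j - i + n // 2) % n - n // 2
```

In this sketch a model would consume the normalized history, predict a normalized next frame, and `revin_denormalize` would return it to the original units of each system. On an axis of length 8, `periodic_relative_offset(0, 7, 8)` gives -1 rather than 7, which is the locality-preserving behaviour the periodic-boundary bullet describes.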

Experimental Results

Research Questions

RQ1: Can large transformer models learn the dynamics of multiple physical systems simultaneously?
RQ2: Does multiple physics pretraining provide a finetuning advantage over single-task or video-pretrained baselines for autoregressive prediction on new physics?
RQ3: Are pretrained representations useful for downstream tasks beyond next-frame prediction (e.g., inverse problems, parameter estimation, forcing identification)?
RQ4: How well do pretrained models transfer to low-data regimes and to systems with different physical regimes (e.g., incompressible vs. compressible flows)?

Key Findings

Model	# Params	SWE NRMSE	DiffRe2D NRMSE	CNS M1.0 NRMSE	CNS M0.1 NRMSE
MPP-AViT-Ti	7.6M	0.0066	0.0168	0.0442	0.0312
UNet	7.7M	0.083	0.84	0.4725	1.6650
FNO	466K	0.0044	0.12	0.1685	0.2425
PINN	8.5K †	0.017	1.6	-	-
ORCA-SWIN-B	88M	0.0060	0.82	-	-
MPP-AViT-B	116M	0.0024	0.0106	0.0281	0.0172
MPP-AViT-S	29M	0.0039	0.0112	0.0319	0.0213
MPP-AViT-L	409M	0.0022	0.0098	0.0208	0.0147
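
For readers unfamiliar with the metric in the table, the sketch below computes a normalized RMSE under one common convention: the L2 norm of the error divided by the L2 norm of the reference field. The table does not spell out the exact normalization, so this definition and the function name `nrmse` are assumptions for illustration only.

```python
import numpy as np


def nrmse(pred, target):
    """Normalized RMSE: ||pred - target||_2 / ||target||_2 over all entries."""
    err = np.sqrt(np.sum((pred - target) ** 2))
    ref = np.sqrt(np.sum(target ** 2))
    return float(err / ref)


# A prediction that is uniformly 10% too large has an NRMSE of about 0.1.
target = np.ones((4, 4))
pred = 1.1 * target
error = nrmse(pred, target)
```

Under this convention the score is dimensionless, which is what makes values comparable across systems whose fields have very different magnitudes.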

A single MPP-pretrained transformer matches or surpasses task-specific baselines across pretraining sub-tasks without finetuning.
MPP models achieve competitive or superior NRMSE across SWE, DiffRe2D, CNS M1.0, and CNS M0.1 compared to baselines of varying sizes.
In low-data transfer scenarios, MPP outperforms training from scratch and VideoMAE by large margins on incompressible-to-compressible transfer tasks.
Pretrained representations are also useful for inverse problems such as forcing identification in incompressible Navier-Stokes, where MPP reduces the RMSE of the forcing estimate.
Scaling MPP from the smallest variant (MPP-AViT-Ti) to the largest (MPP-AViT-L) yields further reductions in NRMSE (SWE: 0.0066 to 0.0022; DiffRe2D: 0.0168 to 0.0098; CNS M1.0: 0.0442 to 0.0208; CNS M0.1: 0.0312 to 0.0147).
Open-sourced code and pretrained models are provided for community experimentation.
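
To put the scaling numbers in relative terms, the short calculation below takes the MPP-AViT-Ti and MPP-AViT-L values quoted in the findings and computes the fractional error reduction for each benchmark. The dictionary and function names are illustrative; only the numbers come from the results above.

```python
# NRMSE values quoted above for the smallest and largest MPP variants.
nrmse_ti = {"SWE": 0.0066, "DiffRe2D": 0.0168, "CNS M1.0": 0.0442, "CNS M0.1": 0.0312}
nrmse_l = {"SWE": 0.0022, "DiffRe2D": 0.0098, "CNS M1.0": 0.0208, "CNS M0.1": 0.0147}


def relative_reduction(before, after):
    """Fraction by which the error shrinks going from `before` to `after`."""
    return (before - after) / before


reductions = {name: relative_reduction(nrmse_ti[name], nrmse_l[name]) for name in nrmse_ti}

for name, value in reductions.items():
    print(f"{name}: {value:.1%}")
```

This works out to roughly a 67% reduction on SWE, 42% on DiffRe2D, and 53% on each of the two CNS settings.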

This review was generated by AI and reviewed by human editors.