QUICK REVIEW

[Paper Review] Poseidon: Efficient Foundation Models for PDEs

Maximilian Herde, Bogdan Raonić|arXiv (Cornell University)|May 29, 2024

Numerical methods for differential equations | Cited by 9

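One-Sentence Summary

Poseidon is a foundation model that learns the solution operators of PDEs. Pretrained on fluid dynamics data, it achieves strong downstream performance with high sample efficiency and good generalization to unseen physics.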

ABSTRACT

We introduce Poseidon, a foundation model for learning the solution operators of PDEs. It is based on a multiscale operator transformer, with time-conditioned layer norms that enable continuous-in-time evaluations. A novel training strategy leveraging the semi-group property of time-dependent PDEs to allow for significant scaling-up of the training data is also proposed. Poseidon is pretrained on a diverse, large scale dataset for the governing equations of fluid dynamics. It is then evaluated on a suite of 15 challenging downstream tasks that include a wide variety of PDE types and operators. We show that Poseidon exhibits excellent performance across the board by outperforming baselines significantly, both in terms of sample efficiency and accuracy. Poseidon also generalizes very well to new physics that is not seen during pretraining. Moreover, Poseidon scales with respect to model and data size, both for pretraining and for downstream tasks. Taken together, our results showcase the surprising ability of Poseidon to learn effective representations from a very small set of PDEs during pretraining in order to generalize well to unseen and unrelated PDEs downstream, demonstrating its potential as an effective, general purpose PDE foundation model. Finally, the Poseidon model as well as underlying pretraining and downstream datasets are open sourced, with code being available at https://github.com/camlab-ethz/poseidon and pretrained models and datasets at https://huggingface.co/camlab-ethz.

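Motivation and Objectives

Motivate the need for foundation models for PDEs, which aim to improve sample efficiency over task-specific neural operators.
Introduce Poseidon, a scalable foundation model architecture specialized for PDE solution operators.
Show that pretraining on diverse PDE data enables strong generalization to unseen PDEs and physical phenomena.
Show that Poseidon scales with model size and data size, and release the datasets and code as open source.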

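Proposed Method

scOT (scalable Operator Transformer) is a hierarchical multiscale vision transformer with lead-time conditioning. It approximates the PDE solution operator S(t, a), which maps initial data a to the solution at time t.
Time-conditioned layer normalization is incorporated to enable continuous-in-time evaluation.
An all2all training strategy exploits the semi-group property of time-dependent PDEs to generate more training pairs from each trajectory.
Poseidon is pretrained on a large, diverse dataset of solution operators for the Euler and Navier–Stokes equations, then fine-tuned on downstream tasks.
The model is evaluated on 15 diverse PDE tasks, including out-of-distribution cases, using the relative L1 error at the final time.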
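
To make the time-conditioned layer normalization concrete, here is a minimal NumPy sketch. It assumes the scale and shift are affine functions of the lead time t, which is one simple choice and not necessarily the exact parameterization in the released code. The function name and all parameter names are hypothetical.

```python
import numpy as np

def time_conditioned_layer_norm(v, t, w_scale, b_scale, w_shift, b_shift, eps=1e-5):
    """Normalize v over its last (channel) axis, then apply a scale and
    shift that depend on the lead time t.

    Assumption: scale(t) = w_scale * t + b_scale and
                shift(t) = w_shift * t + b_shift (affine in t).
    """
    mean = v.mean(axis=-1, keepdims=True)
    var = v.var(axis=-1, keepdims=True)
    v_hat = (v - mean) / np.sqrt(var + eps)
    scale = w_scale * t + b_scale  # shape: (channels,)
    shift = w_shift * t + b_shift  # shape: (channels,)
    return scale * v_hat + shift

# Toy input: 4 tokens with 8 channels each (random values, not PDE data).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
params = dict(
    w_scale=np.full(8, 0.5), b_scale=np.ones(8),
    w_shift=np.full(8, 0.1), b_shift=np.zeros(8),
)

out_t0 = time_conditioned_layer_norm(tokens, 0.0, **params)  # plain layer norm at t = 0
out_t1 = time_conditioned_layer_norm(tokens, 1.0, **params)  # rescaled and shifted at t = 1
```

Because t enters as a real number and not as a discrete step index, the same layer can be queried at any lead time, which is what makes continuous-in-time evaluation possible.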

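Experimental Results

Research Questions

RQ1: Can a PDE foundation model pretrained on a small set of PDEs learn representations that generalize to unseen PDEs and physics?
RQ2: How do architecture, data size, and model size affect downstream performance and sample efficiency?
RQ3: Does exploiting the semi-group property through all2all training improve data efficiency for PDE operators?
RQ4: How well does Poseidon transfer to time-independent PDEs when they are interpreted as long-time limits?
RQ5: How does Poseidon compare with task-specific neural operators and other PDE foundation models across downstream tasks?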
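
The all2all idea behind RQ3 can be sketched in a few lines of Python. By the semi-group property, any snapshot u(t_i) of a trajectory can serve as the input for any later snapshot u(t_j) with lead time t_j - t_i. The helper `all2all_pairs` and the toy trajectory are hypothetical illustrations. Using only strictly later targets (i < j) is an assumption here, and the released code may handle pairing differently.

```python
from itertools import combinations

import numpy as np

def all2all_pairs(trajectory, times):
    """Build (input, lead_time, target) triples for every pair i < j.

    Relies on the semi-group property: u(t_j) = S(t_j - t_i, u(t_i)).
    """
    pairs = []
    for i, j in combinations(range(len(times)), 2):
        pairs.append((trajectory[i], times[j] - times[i], trajectory[j]))
    return pairs

# Toy trajectory: 5 snapshots on a 2x2 grid, each filled with its own time value.
times = np.linspace(0.0, 1.0, 5)
trajectory = np.stack([np.full((2, 2), t) for t in times])

pairs = all2all_pairs(trajectory, times)
print(len(pairs))  # 10 pairs, versus 4 when only the initial snapshot is used as input
```

With K + 1 snapshots this yields K(K + 1)/2 pairs per trajectory instead of K, so the number of training samples grows quadratically in the number of stored time steps.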

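Key Findings

Poseidon outperforms the baselines on all 15 downstream tasks, in both accuracy and sample efficiency.
On average, Poseidon needs only about 20 task-specific samples to match the error of an FNO trained on 1024 samples for time-dependent PDEs (4096 samples for time-independent PDEs).
Poseidon generalizes well to unseen PDEs, including tasks and physics not covered in pretraining, using only a small number of downstream samples.
Larger models and larger datasets improve performance and sample efficiency across downstream tasks.
The diversity of the pretraining data (its quality and variety) strongly affects downstream accuracy on most tasks.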
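
A sample-efficiency comparison of this kind can be computed from an error-versus-samples curve. The sketch below is one possible way to do it, using log-log interpolation; it is not the paper's exact procedure. The function `samples_to_reach` is hypothetical, and the curve is a synthetic power law chosen for illustration, not data from the paper.

```python
import numpy as np

def samples_to_reach(target_error, sample_counts, errors):
    """Estimate how many samples a model needs to reach target_error,
    interpolating its scaling curve linearly in log-log space.

    Assumes errors decrease monotonically as sample_counts increase.
    Targets outside the measured range are clamped to the curve's end points.
    """
    log_n = np.log(np.asarray(sample_counts, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # np.interp requires increasing x values, so reverse the decreasing error axis.
    log_target_n = np.interp(np.log(target_error), log_e[::-1], log_n[::-1])
    return float(np.exp(log_target_n))

# Synthetic scaling curve: error = n ** -0.5 (placeholder, not a measured result).
counts = np.array([1.0, 4.0, 16.0, 64.0, 256.0])
errors = counts ** -0.5

# Reference error that a hypothetical baseline reaches; here chosen as 8 ** -0.5.
needed = samples_to_reach(8.0 ** -0.5, counts, errors)
```

For the synthetic curve above, the estimate is 8 samples, because a power law is a straight line in log-log space and the interpolation is therefore exact.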


This review was generated by AI and checked by a human editor.