QUICK REVIEW

[Paper Review] Scalable Transformer for PDE Surrogate Modeling

Zijie Li, Dule Shu|arXiv (Cornell University)|May 27, 2023

Model Reduction and Neural Networks|Cited by 20

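One-sentence summary

The paper introduces FactFormer, a scalable, axially factorized Transformer for PDE surrogate modeling that reduces computation on high-dimensional grids (2D/3D fluid problems) while achieving competitive accuracy.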

ABSTRACT

Transformer has shown state-of-the-art performance on various applications and has recently emerged as a promising tool for surrogate modeling of partial differential equations (PDEs). Despite the introduction of linear-complexity attention, applying Transformer to problems with a large number of grid points can be numerically unstable and computationally expensive. In this work, we propose Factorized Transformer (FactFormer), which is based on an axial factorized kernel integral. Concretely, we introduce a learnable projection operator that decomposes the input function into multiple sub-functions with one-dimensional domain. These sub-functions are then evaluated and used to compute the instance-based kernel with an axial factorized scheme. We showcase that the proposed model is able to simulate 2D Kolmogorov flow on a $256\times 256$ grid and 3D smoke buoyancy on a $64\times 64\times 64$ grid with good accuracy and efficiency. The proposed factorized scheme can serve as a computationally efficient low-rank surrogate for the full attention scheme when dealing with multi-dimensional problems.

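Motivation and objectives

Advance scalable, Transformer-based surrogate modeling of PDEs on high-resolution grids.
Propose a factorized, softmax-free attention mechanism for multi-dimensional domains.
Develop a learnable projection and an axial kernel integral that reduce computational cost while maintaining accuracy.
Evaluate on challenging PDE problems (2D Kolmogorov flow, 3D smoke buoyancy, 3D isotropic turbulence, 2D Darcy flow).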
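
For a sense of scale, the following back-of-the-envelope sketch counts kernel entries on the two grid sizes named in the abstract. It assumes a uniform grid with the same number of points per axis and compares a single dense attention kernel over all grid points with one small kernel per axis. The function name `kernel_entries` is hypothetical, and the counts cover kernel storage only, not FLOPs or measured runtime; linear-attention variants also avoid forming the dense kernel, so this is not a comparison against them.

```python
def kernel_entries(points_per_axis, ndim):
    """Return (grid points, dense kernel entries, axial kernel entries)."""
    n = points_per_axis ** ndim            # total number of grid points
    dense = n * n                          # one kernel over all pairs of points
    axial = ndim * points_per_axis ** 2    # one (S x S) kernel per axis
    return n, dense, axial


# Grid sizes taken from the abstract: 256 x 256 and 64 x 64 x 64.
for s, d in [(256, 2), (64, 3)]:
    n, dense, axial = kernel_entries(s, d)
    print(f"{s}^{d}: points={n}, dense={dense}, axial={axial}")
```

On the 256 x 256 grid this gives 4,294,967,296 dense entries against 131,072 axial entries, which illustrates why a per-axis factorization is attractive as the number of grid points grows.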

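Proposed method

Recast attention as a learnable kernel integral that uses RoPE-based relative positional encoding.
Introduce a learnable projection operator that maps the high-dimensional input to multiple sub-functions, each with a one-dimensional domain.
Define a factorized kernel integral that applies a kernel along each axis and performs tensor-matrix products across all dimensions.
Use latent marching and the pushforward technique to improve the stability of autoregressive time prediction.
Apply instance normalization and a post-attention MLP for the final update.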
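
The projection and factorized kernel integral described above can be sketched in NumPy for a 2D grid. This is a minimal illustration under stated assumptions, not the authors' implementation: the projection is assumed to be a mean over the other axis followed by a linear map and `tanh`, the kernel is a single-head, softmax-free product of queries and keys scaled by the axis length, and RoPE, instance normalization, and the MLP are omitted. All function names and weight names (`axis_kernel`, `factorized_attention_2d`, `w_px`, and so on) are hypothetical.

```python
import numpy as np


def axis_kernel(phi, w_q, w_k):
    """Softmax-free kernel along one axis from a 1D sub-function phi of shape (S, D)."""
    q = phi @ w_q
    k = phi @ w_k
    return (q @ k.T) / phi.shape[0]  # (S, S), scaled by the number of points


def factorized_attention_2d(u, p):
    """u: (S1, S2, C) field on a 2D grid; p: dict of weights (hypothetical names)."""
    v = u @ p["w_v"]                                # value, (S1, S2, D)
    # Assumed projection: average out the other axis, then a small learnable map.
    phi_x = np.tanh(u.mean(axis=1) @ p["w_px"])     # (S1, D)
    phi_y = np.tanh(u.mean(axis=0) @ p["w_py"])     # (S2, D)
    k_x = axis_kernel(phi_x, p["w_qx"], p["w_kx"])  # (S1, S1)
    k_y = axis_kernel(phi_y, p["w_qy"], p["w_ky"])  # (S2, S2)
    # Tensor-matrix products, one axis at a time.
    tmp = np.einsum("jb,abd->ajd", k_y, v)          # contract the y axis
    out = np.einsum("ia,ajd->ijd", k_x, tmp)        # contract the x axis
    return out, k_x, k_y


rng = np.random.default_rng(0)
S1, S2, C, D = 6, 5, 3, 4
p = {name: rng.standard_normal((C, D)) for name in ("w_v", "w_px", "w_py")}
p.update({name: rng.standard_normal((D, D))
          for name in ("w_qx", "w_kx", "w_qy", "w_ky")})
u = rng.standard_normal((S1, S2, C))
out, k_x, k_y = factorized_attention_2d(u, p)

# Same result from one dense kernel over all S1*S2 points (a Kronecker product).
dense = np.kron(k_x, k_y) @ (u @ p["w_v"]).reshape(S1 * S2, D)
print(out.shape, np.allclose(out.reshape(S1 * S2, D), dense))
```

The final check shows that applying the two small axis kernels in sequence matches a dense kernel with Kronecker structure, which is one way to read the abstract's description of the scheme as a low-rank surrogate for full attention.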

Figure 1 : Model’s prediction (pred.) and reference ground truth (ref.). Left : 2D Kolmogorov flow on $256\times 256$ grid; Right : 3D smoke buoyancy on $64\times 64\times 64$ grid ( $zOy$ cross-section is shown).

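Experimental results

Research questions

RQ1: Can the factorized axial attention scheme provide scalable and stable PDE surrogate modeling on high-resolution grids?
RQ2: Does FactFormer improve efficiency on multi-dimensional problems while matching the accuracy of state-of-the-art neural operators (FNO variants, Dil-ResNet)?
RQ3: How do training strategies such as latent marching and pushforward affect stability and long-term prediction accuracy?
RQ4: How does the method perform on 2D and 3D fluid dynamics benchmarks (Kolmogorov flow, isotropic turbulence, smoke buoyancy, Darcy flow)?

Key findings

FactFormer achieves good efficiency and competitive accuracy on 2D Kolmogorov flow and 3D smoke buoyancy.
On 3D isotropic turbulence, Dil-ResNet can be more accurate in short-term, frame-by-frame prediction, but FactFormer remains stable and competitive over long horizons.
FactFormer maintains relatively stable performance across different resolutions, unlike CNN-based models, whose error grows at finer discretizations.
Latent marching and pushforward training improve stability and reduce rollout error compared with plain autoregressive training.
Overall, FactFormer offers a favorable trade-off among accuracy, inference time, and parameter count compared with FNO variants and Dil-ResNet.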

Figure 2 : Schematic of the factorized kernel attention. Upper path : the input is transformed into the Value via a linear transformation. Lower path : the input is first projected into multiple sub-functions with a one-dimensional domain. These sub-functions are then used to derive the Query and Ke

This review was generated by AI and checked by a human editor.