QUICK REVIEW

[Paper Review] Poseidon: Efficient Foundation Models for PDEs

Maximilian Herde, Bogdan Raonić|arXiv (Cornell University)|2024. 05. 29.

Numerical methods for differential equations | Citations: 9

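One-Line Summary

Poseidon is a foundation model for learning PDE solution operators, pretrained on fluid dynamics data. It achieves strong downstream performance, with high sample efficiency and good generalization to unseen physics.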

ABSTRACT

We introduce Poseidon, a foundation model for learning the solution operators of PDEs. It is based on a multiscale operator transformer, with time-conditioned layer norms that enable continuous-in-time evaluations. A novel training strategy leveraging the semi-group property of time-dependent PDEs to allow for significant scaling-up of the training data is also proposed. Poseidon is pretrained on a diverse, large scale dataset for the governing equations of fluid dynamics. It is then evaluated on a suite of 15 challenging downstream tasks that include a wide variety of PDE types and operators. We show that Poseidon exhibits excellent performance across the board by outperforming baselines significantly, both in terms of sample efficiency and accuracy. Poseidon also generalizes very well to new physics that is not seen during pretraining. Moreover, Poseidon scales with respect to model and data size, both for pretraining and for downstream tasks. Taken together, our results showcase the surprising ability of Poseidon to learn effective representations from a very small set of PDEs during pretraining in order to generalize well to unseen and unrelated PDEs downstream, demonstrating its potential as an effective, general purpose PDE foundation model. Finally, the Poseidon model as well as underlying pretraining and downstream datasets are open sourced, with code being available at https://github.com/camlab-ethz/poseidon and pretrained models and datasets at https://huggingface.co/camlab-ethz.

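Motivation and Objectives

Motivate the need for foundation models for PDEs, with the goal of improving sample efficiency over task-specific neural operators.
Introduce Poseidon, a scalable foundation model architecture tailored to PDE solution operators.
Show that pretraining on diverse PDE data enables strong generalization to unseen PDEs and physics.
Show that Poseidon scales with model size and data size, and release open-source datasets and code.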

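Proposed Method

Approximate the PDE solution operator S(t, a) with scOT, a hierarchical multiscale vision transformer with lead-time conditioning.
Introduce time-conditioned layer normalization to enable continuous-in-time evaluation.
Apply an all2all training strategy that exploits the semigroup property of time-dependent PDEs to generate more training pairs from each trajectory.
Pretrain Poseidon on a large, diverse dataset of Euler and Navier–Stokes operators, then fine-tune it on downstream tasks.
Evaluate on 15 diverse PDE tasks, including out-of-distribution cases, using the relative L1 error at the final time.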
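
The two training ingredients listed above, time-conditioned layer normalization and all2all pair generation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the layer-norm scale and shift are affine functions of the lead time, and that all2all uses every snapshot pair with i < j; neither detail is stated in this review. The function names, parameter names, and the toy trajectory are hypothetical.

```python
import numpy as np


def time_conditioned_layer_norm(v, t, alpha, alpha_bar, beta, beta_bar, eps=1e-5):
    """Layer norm whose scale and shift depend on the lead time t.

    Assumption (not stated in this review): scale and shift are affine in t,
    scale(t) = alpha * t + alpha_bar and shift(t) = beta * t + beta_bar.
    """
    mean = v.mean(axis=-1, keepdims=True)
    var = v.var(axis=-1, keepdims=True)
    normalized = (v - mean) / np.sqrt(var + eps)
    scale = alpha * t + alpha_bar
    shift = beta * t + beta_bar
    return scale * normalized + shift


def all2all_pairs(times):
    """Return (i, j, lead_time) for every pair of snapshots with i < j.

    By the semigroup property, S(t_j - t_i, u(t_i)) = u(t_j), so each pair is a
    valid (input, lead time, target) training example.
    """
    return [
        (i, j, times[j] - times[i])
        for i in range(len(times))
        for j in range(i + 1, len(times))
    ]


# Hypothetical trajectory: 5 snapshots, each flattened to 8 features.
times = np.linspace(0.0, 1.0, 5)
trajectory = np.random.default_rng(0).normal(size=(5, 8))

pairs = all2all_pairs(times)
print(len(pairs))  # 5 * 4 / 2 = 10 pairs

# Normalize the input snapshot of the first pair, conditioned on its lead time.
i, j, lead_time = pairs[0]
out = time_conditioned_layer_norm(
    trajectory[i], lead_time,
    alpha=np.zeros(8), alpha_bar=np.ones(8),
    beta=np.zeros(8), beta_bar=np.zeros(8),
)
```

With 5 snapshots, pairing only the initial snapshot with later ones would give 4 examples, while the i < j scheme gives 10, which is where the extra training data comes from. Because the lead time enters as a real number, the same layer can be evaluated at any t, not only at the snapshot times.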

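Experimental Results

Research Questions

RQ1: Can a PDE foundation model pretrained on a small set of PDEs learn representations that generalize to unseen PDEs and physics?
RQ2: How do architecture, data size, and model size affect downstream performance and sample efficiency?
RQ3: Does exploiting the semigroup property through all2all training improve the data efficiency of PDE operator learning?
RQ4: How well does Poseidon transfer to time-independent PDEs when they are interpreted as long-time limits?
RQ5: How does Poseidon compare with task-specific neural operators and other PDE foundation models across diverse downstream tasks?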

Key Results

Poseidon outperforms the baselines on all 15 downstream tasks in both accuracy and sample efficiency.
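On average, Poseidon needs about 20 task-specific samples to match the error of an FNO trained on 1024 samples for time-dependent PDEs (for time-independent PDEs, the FNO reference is 4096 samples).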
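Poseidon generalizes well to unseen PDEs and physics, including tasks not covered in pretraining, and does so with only a few downstream samples.
Both model size and dataset size improve performance and sample efficiency across downstream tasks.
The diversity of the pretraining data (data quality and diversity) has a large effect on downstream accuracy for most tasks.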
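
The errors compared in these results are relative L1 errors at the final time, as noted in the method summary. A minimal sketch of that metric is given here for reference. It assumes a single sample with the norm taken over all grid points; how the paper averages over samples or channels is not stated in this review, and the toy fields are hypothetical.

```python
import numpy as np


def relative_l1_error(prediction, target):
    """Relative L1 error: sum|prediction - target| / sum|target|."""
    return np.abs(prediction - target).sum() / np.abs(target).sum()


# Hypothetical final-time fields on a 4x4 grid (illustrative values only).
target = np.ones((4, 4))
prediction = target + 0.05  # uniform 5% overshoot

error = relative_l1_error(prediction, target)
print(error)
```

Under the figure quoted above, matching an FNO trained on 1024 samples with about 20 samples corresponds to roughly a 50-fold reduction in task-specific data for time-dependent PDEs.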


This review was generated by AI and reviewed by a human editor.