QUICK REVIEW

[Paper Review] Transformer for Partial Differential Equations' Operator Learning

Zijie Li, Kazem Meidani|arXiv (Cornell University)|May 26, 2022

Model Reduction and Neural Networks | Cited by 46

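One-Sentence Summary

The paper introduces OFormer, an attention-based Operator Transformer for data-driven learning of PDE solution operators, which uses cross-attention for discretization-invariant queries and latent time marching for time-dependent PDEs.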

ABSTRACT

Data-driven learning of partial differential equations' solution operators has recently emerged as a promising paradigm for approximating the underlying solutions. The solution operators are usually parameterized by deep learning models that are built upon problem-specific inductive biases. An example is a convolutional or a graph neural network that exploits the local grid structure where functions' values are sampled. The attention mechanism, on the other hand, provides a flexible way to implicitly exploit the patterns within inputs, and furthermore, relationship between arbitrary query locations and inputs. In this work, we present an attention-based framework for data-driven operator learning, which we term Operator Transformer (OFormer). Our framework is built upon self-attention, cross-attention, and a set of point-wise multilayer perceptrons (MLPs), and thus it makes few assumptions on the sampling pattern of the input function or query locations. We show that the proposed framework is competitive on standard benchmark problems and can flexibly be adapted to randomly sampled input.

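Motivation and Objectives

Learn PDE solution operators from data without relying on problem-specific inductive biases.
Propose an attention-based framework that flexibly handles arbitrary input and query discretizations.
Introduce a latent time-marching mechanism to model time-dependent PDEs efficiently.
Evolve time-dependent PDEs in latent space to reduce memory and compute requirements.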
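
The latent time-marching idea can be illustrated with a small, dependency-free sketch. It assumes a residual update of the form z_{t+1} = z_t + MLP(z_t), applied to the latent vector of a single query point with the same weights reused at every step; the layer sizes, the tanh activation, the linear decoder, and all names (`make_params`, `step`, `decode`, `rollout`) are hypothetical choices for illustration, not the paper's implementation.

```python
import math
import random

def linear(W, b, x):
    """Dense layer: W @ x + b, written with plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def make_params(latent_dim, hidden_dim, seed=0):
    """Random, untrained weights (illustrative only)."""
    rng = random.Random(seed)
    return {
        "W1": rand_matrix(hidden_dim, latent_dim, rng), "b1": [0.0] * hidden_dim,
        "W2": rand_matrix(latent_dim, hidden_dim, rng), "b2": [0.0] * latent_dim,
        "w_out": [rng.uniform(-1.0, 1.0) for _ in range(latent_dim)],
    }

def step(z, p):
    """One residual update in latent space: z + MLP(z)."""
    h = [math.tanh(v) for v in linear(p["W1"], p["b1"], z)]
    dz = linear(p["W2"], p["b2"], h)
    return [zi + di for zi, di in zip(z, dz)]

def decode(z, p):
    """Map a latent vector back to a scalar physical value."""
    return sum(w * zi for w, zi in zip(p["w_out"], z))

def rollout(z0, p, n_steps):
    """March the latent state forward, decoding a value after every step."""
    z, outputs = z0, []
    for _ in range(n_steps):
        z = step(z, p)
        outputs.append(decode(z, p))
    return outputs

params = make_params(latent_dim=4, hidden_dim=8)
z0 = [0.5, -0.2, 0.1, 0.9]          # latent code of one query point at t = 0
traj = rollout(z0, params, n_steps=5)
```

Only the small latent vector is carried from step to step in this sketch, and the physical value is decoded when needed, which is the intuition behind the memory and compute savings named above.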

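Proposed Method

Uses self-attention and cross-attention combined with point-wise MLPs to learn the operator mapping from input function samples to output function queries.
Applies cross-attention to decouple the input grid from arbitrary query locations.
Implements a latent time-marching scheme in which a residual MLP propagates the dynamics in latent space before they are decoded back to physical space.
Introduces rotary position embedding (RoPE) to inject relative spatial information into attention.
Trains with a step-wise loss on discretized grids while remaining flexible across different input and output discretizations.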
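
To make the decoupling of input grid and query locations concrete, here is a minimal sketch of cross-attention in which each query coordinate attends over all input samples. It assumes standard softmax dot-product attention and a hypothetical sin/cos coordinate embedding (`embed`); this summary does not specify the attention normalization or the embeddings the paper uses, so treat both as stand-ins.

```python
import math
import random

def softmax(xs):
    m = max(xs)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def embed(x, dim):
    """Hypothetical embedding of a 1-D coordinate into `dim` sin/cos features."""
    return [math.sin((k + 1) * x) if k % 2 == 0 else math.cos((k + 1) * x)
            for k in range(dim)]

def cross_attention(query_pts, input_pts, input_vals, dim=8):
    """Return one output per query point, each a weighted mix of input values."""
    keys = [embed(x, dim) for x in input_pts]
    out = []
    for q in query_pts:
        qe = embed(q, dim)
        scores = [sum(a * b for a, b in zip(qe, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append(sum(w * v for w, v in zip(weights, input_vals)))
    return out

# Input function sampled at 16 irregular locations in [0, 1].
random.seed(0)
input_pts = sorted(random.random() for _ in range(16))
input_vals = [math.sin(2 * math.pi * x) for x in input_pts]

# The same inputs can be queried on a 3-point set or a 50-point uniform grid.
coarse = cross_attention([0.1, 0.5, 0.9], input_pts, input_vals)
fine = cross_attention([i / 49 for i in range(50)], input_pts, input_vals)
```

Nothing in `cross_attention` ties the number or position of query points to the input samples, so the output resolution can change without touching the inputs. The weights here are untrained, so the outputs are not a meaningful PDE solution.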

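Experimental Results

Research Questions

RQ1: Can an attention-based architecture learn PDE solution operators under discretization-invariant queries?
RQ2: How can latent time marching model time-dependent PDEs efficiently without a full space-time instantiation?
RQ3: Does cross-attention enable flexible queries at arbitrary output locations, independent of the input grid?
RQ4: How does rotary position embedding (RoPE) affect spatial awareness in PDE operator learning?
RQ5: How does OFormer compare with state-of-the-art operator learning methods on standard PDE benchmarks and irregular grids?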

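Key Findings

On standard PDE benchmarks, OFormer is competitive with state-of-the-art operator learning methods across a range of problems.
The model is robust to changes in input/output discretization and grid resolution.
OFormer performs especially well in data-rich regimes on complex problems such as Navier–Stokes, reaching comparable accuracy with fewer parameters than some baselines.
Cross-attention makes it possible to query the output at arbitrary coordinates, decoupled from the input grid.
RoPE-based position encoding strengthens the model's ability to encode spatial relationships in PDE settings.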
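
The way RoPE (rotary position embedding) encodes relative position can be checked numerically. The sketch below rotates consecutive feature pairs by an angle proportional to a real-valued coordinate and shows that the query-key dot product depends only on the coordinate offset. The frequency base of 10000 is the conventional default and an assumption here, as are the function names; the summary does not say which settings the paper uses.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each consecutive pair of `vec` by an angle proportional to `pos`."""
    d = len(vec)                     # must be even
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)   # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same offset (1.5) between query and key, at two different absolute positions.
score_a = dot(rope(q, 2.0), rope(k, 0.5))
score_b = dot(rope(q, 7.0), rope(k, 5.5))
```

The two scores agree up to floating-point error, because rotating both vectors by a common extra angle leaves their inner product unchanged. An attention score built this way sees only the offset between two points, not their absolute coordinates.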


This review was generated by AI and reviewed by a human editor.