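[Paper Summary] Transolver: A Fast Transformer Solver for PDEs on General Geometries
Transolver introduces Physics-Attention, which learns the intrinsic physical states of a discretized domain, applies linear-time attention over learnable slices, and achieves state-of-the-art PDE solving across diverse geometries and large-scale industrial tasks.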
Transformers have empowered many milestones across various fields and have recently been applied to solve partial differential equations (PDEs). However, since PDEs are typically discretized into large-scale meshes with complex geometries, it is challenging for Transformers to capture intricate physical correlations directly from massive individual points. Going beyond superficial and unwieldy meshes, we present Transolver based on a more foundational idea, which is learning intrinsic physical states hidden behind discretized geometries. Specifically, we propose a new Physics-Attention to adaptively split the discretized domain into a series of learnable slices of flexible shapes, where mesh points under similar physical states will be ascribed to the same slice. By calculating attention to physics-aware tokens encoded from slices, Transolver can effectively capture intricate physical correlations under complex geometries, which also empowers the solver with endogenetic geometry-general modeling capacity and can be efficiently computed in linear complexity. Transolver achieves consistent state-of-the-art with 22% relative gain across six standard benchmarks and also excels in large-scale industrial simulations, including car and airfoil designs. Code is available at https://github.com/thuml/Transolver.
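Research Motivation and Goals
- Solve PDEs on complex, irregular geometries with a Transformer-based model.
- Develop a PDE solver that avoids point-wise attention over large-scale meshes by learning physics-related slices.
- Achieve linear time complexity and scalable performance across diverse 2D/3D benchmarks and industrial designs.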
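Proposed Method
- Introduce Physics-Attention, which decomposes the discretized domain into learnable slices based on mesh-point features.
- Encode each slice into a physics-aware token that represents a physical state, then apply attention among the tokens.
- Aggregate the token outputs back onto the mesh points through deslicing to obtain the next feature representation.
- Prove that Physics-Attention can approximate a learnable integral operator on the domain Ω.
- Adopt a Transformer-like architecture in which Physics-Attention replaces standard attention in each of the L layers.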
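The slice, attend, and deslice steps listed above can be sketched in a few lines of NumPy. This is a minimal single-head illustration under stated assumptions, not the authors' implementation: the function name `physics_attention`, the weight matrices `w_slice`, `w_q`, `w_k`, `w_v`, the softmax slice assignment, and the small epsilon in the normalization are all chosen here for illustration, and the weights are random rather than trained.

```python
import numpy as np

def softmax(a, axis=-1):
    # Numerically stable softmax along the given axis.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def physics_attention(x, w_slice, w_q, w_k, w_v):
    """Hypothetical single-head sketch of slice -> attend -> deslice.

    x:        (N, C) features of N mesh points
    w_slice:  (C, M) projection that scores each point against M slices
    w_q/k/v:  (C, C) projections for attention among slice tokens
    """
    # 1. Slice: soft-assign every mesh point to M learnable slices.
    weights = softmax(x @ w_slice, axis=-1)                # (N, M)

    # 2. Encode: each token is the weighted average of the points in its slice.
    #    The 1e-8 term is an assumption added here to avoid division by zero.
    slice_mass = weights.sum(axis=0)[:, None] + 1e-8       # (M, 1)
    tokens = (weights.T @ x) / slice_mass                  # (M, C)

    # 3. Attend: standard scaled dot-product attention among the M tokens only.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (M, M)
    new_tokens = attn @ v                                  # (M, C)

    # 4. Deslice: broadcast token outputs back to points with the same weights.
    return weights @ new_tokens, weights                   # (N, C), (N, M)

# Usage with random (untrained) weights; sizes are arbitrary.
rng = np.random.default_rng(0)
N, C, M = 1000, 16, 8
x = rng.normal(size=(N, C))
out, weights = physics_attention(
    x,
    rng.normal(size=(C, M)),
    rng.normal(size=(C, C)),
    rng.normal(size=(C, C)),
    rng.normal(size=(C, C)),
)
```

The only N-sized operations are the two matrix products with `weights`, so no N-by-N attention map is ever formed; the attention matrix itself is M by M.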
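Experimental Results
Research Questions
- RQ1: Does learning intrinsic physical states through slice-based tokens improve correlation modeling for PDEs on irregular geometries?
- RQ2: Does Physics-Attention achieve computational complexity that is linear in mesh size while maintaining or improving accuracy?
- RQ3: How does Transolver compare with existing neural operators and Transformers on standard PDE benchmarks and practical design tasks?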
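To make the linear-complexity question in RQ2 concrete, the sketch below counts rough multiply-add operations for point-wise attention versus slice-based attention. It is a back-of-the-envelope estimate derived from the steps described in this summary (slicing and deslicing touch N points and M slices, token attention touches M by M pairs); the function name `attention_cost` is hypothetical, and constant factors and linear projections are ignored.

```python
def attention_cost(n_points, n_slices, channels):
    """Rough multiply-add counts; constants and projections are ignored.

    Returns (pointwise, sliced):
      pointwise: attention among all mesh points, N * N * C
      sliced:    slicing/deslicing (N * M * C) plus token attention (M * M * C)
    """
    pointwise = n_points * n_points * channels
    sliced = n_points * n_slices * channels + n_slices * n_slices * channels
    return pointwise, sliced

# Doubling the mesh size quadruples the point-wise count,
# while the sliced count grows roughly linearly for a fixed number of slices.
small = attention_cost(1000, 10, 1)
large = attention_cost(2000, 10, 1)
```

With the number of slices held fixed, the sliced count is dominated by the N * M * C term, which is the sense in which the attention is linear in mesh size.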
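Key Findings
- Transolver delivers consistent state-of-the-art performance on six standard benchmarks, with a 22% relative gain.
- The model achieves competitive or better results on large-scale industrial simulations (car and airfoil designs).
- Ablation studies show that increasing the number of slices improves accuracy at a higher computational cost, and that fixed regular grids underperform learnable slices.
- Efficiency analysis shows that Transolver uses fewer parameters and runs faster than several baselines at similar or better accuracy.
- Physics-Attention visualizations show that slices align with physically coherent regions, and that token-to-token attention is sharper than attention among mesh points.
- The method maintains its performance when the mesh is corrupted or only partially observed, demonstrating robustness to geometric irregularity.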
This summary was generated by AI and reviewed by a human editor.