QUICK REVIEW

[Paper Review] LieTransformer: Equivariant self-attention for Lie Groups

Michael Hutchinson, Charline Le Lan|arXiv (Cornell University)|Dec 20, 2020

Machine Learning in Materials Science | References: 58 | Cited by: 35

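One-sentence summary

LieTransformer extends self-attention to Lie groups: it lifts the input to functions on G and builds LieSelfAttention layers that are equivariant to Lie groups and their discrete subgroups, achieving competitive results on shape counting, QM9 molecular property regression, and Hamiltonian dynamics tasks.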

ABSTRACT

Group equivariant neural networks are used as building blocks of group invariant neural networks, which have been shown to improve generalisation performance and data efficiency through principled parameter sharing. Such works have mostly focused on group equivariant convolutions, building on the result that group equivariant linear maps are necessarily convolutions. In this work, we extend the scope of the literature to self-attention, that is emerging as a prominent building block of deep learning models. We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups. We demonstrate the generality of our approach by showing experimental results that are competitive to baseline methods on a wide range of tasks: shape counting on point clouds, molecular property regression and modelling particle trajectories under Hamiltonian dynamics.

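Motivation and objectives

Motivate and exploit the group symmetries described by Lie groups to improve learning efficiency and generalisation.
Extend self-attention so that it is equivariant on Lie groups, using a lifting-based framework.
Demonstrate the approach on tasks with rotational and translational symmetry: shape counting, molecular property prediction, and Hamiltonian dynamics.
Show that LieTransformer handles both Lie groups and their discrete subgroups while achieving competitive performance.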

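Proposed method

Lift input data from the homogeneous space X to functions on the Lie group G via a lifting operator L, enabling G-equivariant processing.
Apply LieSelfAttention on the lifted domain, combining content and location cues, so that the layer is equivariant under the regular representation of G.
When G or the lifted domain is continuous (Lie groups with infinite G_f), approximate the integral with Monte Carlo sampling.
Stack residual blocks of LieSelfAttention, LayerNorm, and MLP, followed by a final G-invariant pooling layer that produces the task output.
Establish equivariance by proving that the lifting operation is equivariant and that LieSelfAttention is equivariant under the regular representation.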
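The steps listed above can be sketched numerically for G = SE(2) acting on points in the plane. This is a minimal NumPy illustration, not the authors' implementation: the function names (`lift_se2`, `relative_features`, `lie_self_attention`, `model`), the linear location score `w_loc`, the softmax normalisation, and the omission of LayerNorm and the MLP are all simplifying assumptions made here. It only aims to show why attention built from content plus the relative group element g_i^{-1} g_j gives an output that does not change when the whole input is rotated and translated.

```python
import numpy as np

def lift_se2(points, point_feats, angles):
    """Lift n points in R^2 to n*k elements (translation, angle) of SE(2)."""
    k = len(angles)
    t = np.repeat(points, k, axis=0)        # translation part of each group element
    th = np.tile(angles, len(points))       # rotation part of each group element
    f = np.repeat(point_feats, k, axis=0)   # lifted function copies the point feature
    return t, th, f

def relative_features(t, th):
    """Coordinates of g_i^{-1} g_j for every pair of lifted elements."""
    d = t[None, :, :] - t[:, None, :]       # d[i, j] = t_j - t_i
    c, s = np.cos(th), np.sin(th)
    # Rotate each difference by -theta_i, i.e. apply R(theta_i)^T.
    dx = c[:, None] * d[..., 0] + s[:, None] * d[..., 1]
    dy = -s[:, None] * d[..., 0] + c[:, None] * d[..., 1]
    dth = th[None, :] - th[:, None]         # relative rotation angle
    return np.stack([dx, dy, np.cos(dth), np.sin(dth)], axis=-1)

def lie_self_attention(f, t, th, params):
    """Attention whose logits are a content term plus a location term."""
    Wq, Wk, Wv, w_loc = params
    q, k, v = f @ Wq, f @ Wk, f @ Wv
    content = q @ k.T / np.sqrt(q.shape[1])
    location = relative_features(t, th) @ w_loc   # assumed linear location score
    logits = content + location
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w = w / w.sum(axis=1, keepdims=True)          # softmax over lifted elements
    return w @ v

def model(points, point_feats, angles, params):
    t, th, f = lift_se2(points, point_feats, angles)
    out = f + lie_self_attention(f, t, th, params)  # residual connection
    return out.mean(axis=0)                         # invariant mean pooling

rng = np.random.default_rng(0)
n, k, d = 5, 4, 3
points = rng.normal(size=(n, 2))
point_feats = rng.normal(size=(n, d))
angles = rng.uniform(0.0, 2.0 * np.pi, size=k)      # sampled rotations for the lift
params = (rng.normal(size=(d, d)), rng.normal(size=(d, d)),
          rng.normal(size=(d, d)), rng.normal(size=4))

# Act on the input with a rigid motion u = (shift, phi).
phi, shift = 0.7, np.array([1.5, -2.0])
R = np.array([[np.cos(phi), -np.sin(phi)], [np.sin(phi), np.cos(phi)]])
moved_points = points @ R.T + shift
moved_angles = angles + phi                         # lifted elements become u * g

y = model(points, point_feats, angles, params)
y_moved = model(moved_points, point_feats, moved_angles, params)
invariant = bool(np.allclose(y, y_moved))
print(invariant)
```

In this sketch the same sampled angles are shifted by `phi`, which isolates the algebraic identity (u g_i)^{-1}(u g_j) = g_i^{-1} g_j. If the transformed input were lifted with independently drawn angles, the two outputs would be expected to agree only approximately, which is the sense in which Monte Carlo sampling makes the property approximate.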

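Experimental results

Research questions

RQ1: Can self-attention be made equivariant to arbitrary Lie groups and their discrete subgroups?
RQ2: Does the lifting-based LieSelfAttention architecture achieve competitive performance on tasks that require SE(2)/SE(3) or other Lie group symmetries?
RQ3: How does LieTransformer compare with non-invariant baselines and LieConv on shape counting, molecular property prediction (QM9), and Hamiltonian dynamics?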

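Key findings

LieTransformer is competitive with strong baselines on shape counting, QM9, and Hamiltonian dynamics.
LieSelfAttention is provably equivariant under the regular representation of G (for infinite G, a Monte Carlo approximation is used).
The SE(3)-invariant variants generally outperform the translation-only variants and generalise better under rotations, although sampling during lifting introduces some variance.
On Hamiltonian dynamics, LieTransformer is markedly more data-efficient and generalises better than non-invariant baselines (by one to three orders of magnitude in some cases).
Compared with LieConv at the same model size and group, LieTransformer tends to perform better, especially in the T(2) and SE(2) settings.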
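For reference, the equivariance property named in the findings above can be written out using standard definitions. The symbols below (π for the regular representation, Φ for a layer acting on functions on G, and u, g, g' for group elements) are notation chosen here for illustration and may differ from the paper's.

```latex
\begin{align*}
[\pi(u) f](g) &= f(u^{-1} g), && u, g \in G \\
\Phi[\pi(u) f] &= \pi(u)\,\Phi[f], && \text{for all } u \in G \\
(u g)^{-1}(u g') &= g^{-1} u^{-1} u\, g' = g^{-1} g'
\end{align*}
```

The first line defines the regular representation, the second states what it means for a layer to be equivariant under it, and the third is the identity that makes any quantity computed from the relative element g^{-1} g' unchanged when the same u acts on both arguments.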


This review was generated by AI and checked by human editors.