QUICK REVIEW

[Paper Review] Geometric Algebra Transformer

Johann Brehmer, Pim de Haan | arXiv (Cornell University) | May 28, 2023

Algebraic and Geometric Analysis | Citations: 9

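One-line summary

GATr is a general-purpose, E(3)-equivariant Transformer that processes geometric data in a projective geometric algebra representation. It demonstrates strong data efficiency and scalability on n-body dynamics, arterial wall-shear-stress estimation, and robotic motion planning.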

ABSTRACT

Problems involving geometric data arise in physics, chemistry, robotics, computer vision, and many other fields. Such data can take numerous forms, for instance points, direction vectors, translations, or rotations, but to date there is no single architecture that can be applied to such a wide variety of geometric types while respecting their symmetries. In this paper we introduce the Geometric Algebra Transformer (GATr), a general-purpose architecture for geometric data. GATr represents inputs, outputs, and hidden states in the projective geometric (or Clifford) algebra, which offers an efficient 16-dimensional vector-space representation of common geometric objects as well as operators acting on them. GATr is equivariant with respect to E(3), the symmetry group of 3D Euclidean space. As a Transformer, GATr is versatile, efficient, and scalable. We demonstrate GATr in problems from n-body modeling to wall-shear-stress estimation on large arterial meshes to robotic motion planning. GATr consistently outperforms both non-geometric and equivariant baselines in terms of error, data efficiency, and scalability.

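Motivation and objectives

Provide a general-purpose architecture that respects the symmetries of 3D space.
Represent inputs, outputs, and hidden states as multivectors of $\mathbb{G}_{3,0,1}$, encoding geometric objects and transformations.
Develop E(3)-equivariant neural network primitives (linear maps, attention, nonlinearities, normalization) within the Transformer framework.
Demonstrate GATr on diverse geometric tasks and evaluate its performance, data efficiency, and scalability against non-geometric and equivariant baselines.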
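
As a small illustration of why the representation is 16-dimensional, the sketch below enumerates the basis blades of an algebra with four generators. The generator names `e0` to `e3` and the grade-by-grade ordering are assumptions made for this example; the review does not specify the ordering used in the paper's implementation.

```python
from itertools import combinations

# Assumed naming: e0 is the extra projective direction, e1..e3 span 3D space.
GENERATORS = ("e0", "e1", "e2", "e3")


def basis_blades():
    """Return every basis blade as a tuple of generator indices, ordered by grade."""
    blades = []
    for grade in range(len(GENERATORS) + 1):
        blades.extend(combinations(range(len(GENERATORS)), grade))
    return blades


def grade_counts(blades):
    """Count how many basis blades there are in each grade."""
    counts = {}
    for blade in blades:
        counts[len(blade)] = counts.get(len(blade), 0) + 1
    return counts


BLADES = basis_blades()
print(len(BLADES))           # 16
print(grade_counts(BLADES))  # {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}
```

The counts are one scalar, four vectors, six bivectors, four trivectors, and one pseudoscalar, which together give the 16 components mentioned in the abstract.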

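Proposed method

Represent data as multivectors of the projective geometric algebra $\mathbb{G}_{3,0,1}$, encoding points, lines, planes, and transformations.
Build E(3)-equivariant layers: linear maps, geometric bilinear layers (geometric product and join), nonlinearities, and an attention mechanism over multivectors based on the inner product.
Use a Transformer backbone with dot-product attention adapted to the multivector representation, and integrate an auxiliary scalar pathway for non-geometric information.
Extend attention with distance-aware features, use rotary positional embeddings for sequence-like data, and support axial attention along multiple problem axes.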
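
The geometric product mentioned above can be sketched for basis blades with a standard bitmask construction. This is a minimal illustration, not the authors' implementation: the bitmask indexing (bit 0 for `e0`, bits 1 to 3 for `e1` to `e3`), the helper names, and the dense double loop are all assumptions chosen for readability. The only algebraic facts it relies on are that `e0` squares to 0 and `e1`, `e2`, `e3` square to +1.

```python
import numpy as np

NULL_BIT = 0b0001  # bit 0 stands for e0, which squares to 0
DIM = 16           # 2**4 basis blades, indexed by bitmask (assumed layout)


def reorder_sign(a: int, b: int) -> int:
    """Sign picked up when sorting the generators of blade a followed by blade b."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def blade_product(a: int, b: int):
    """Geometric product of two basis blades as (sign, resulting blade index)."""
    if a & b & NULL_BIT:
        return 0, 0  # e0 appears in both factors, and e0 * e0 = 0
    # Shared generators among e1..e3 square to +1 and cancel out of the bitmask.
    return reorder_sign(a, b), a ^ b


def geometric_product(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Geometric product of two 16-component multivectors."""
    out = np.zeros(DIM)
    for a in range(DIM):
        for b in range(DIM):
            sign, c = blade_product(a, b)
            out[c] += sign * x[a] * y[b]
    return out


def unit(index: int) -> np.ndarray:
    """Multivector with a single unit coefficient at the given blade index."""
    v = np.zeros(DIM)
    v[index] = 1.0
    return v


e0, e1, e2 = unit(0b0001), unit(0b0010), unit(0b0100)
```

With these definitions, `geometric_product(e1, e1)` is the scalar 1, `geometric_product(e0, e0)` vanishes, and `geometric_product(e1, e2)` is the negative of `geometric_product(e2, e1)`. A learned bilinear layer would wrap such a product between equivariant linear maps; that wiring is not shown here.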

Figure 1: Overview over the GATr architecture. Boxes with solid lines are learnable components, those with dashed lines are fixed.

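Experimental results

Research questions

RQ1: Can GATr directly learn geometric transformations and relations in $\mathbb{G}_{3,0,1}$ while remaining equivariant with respect to E(3)?
RQ2: How does it perform and scale across geometric tasks compared with non-geometric and existing equivariant baselines?
RQ3: Do the geometric product and join bilinear operations increase expressivity over purely equivariant linear layers?
RQ4: How do the auxiliary scalar representations and distance-aware attention affect performance and data efficiency?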
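
E(3) equivariance means that transforming the input and then applying the model gives the same result as applying the model and then transforming the output. The sketch below tests that property numerically for two toy functions on point clouds. Both functions and the helper `equivariance_error` are hypothetical stand-ins invented for this example; they are not GATr layers.

```python
import numpy as np


def pull_to_centroid(points: np.ndarray) -> np.ndarray:
    """Toy equivariant map: move each point halfway towards the centroid."""
    return 0.5 * points + 0.5 * points.mean(axis=0, keepdims=True)


def axis_scaling(points: np.ndarray) -> np.ndarray:
    """Toy non-equivariant map: scale each coordinate axis differently."""
    return points * np.array([1.0, 2.0, 3.0])


def random_e3(rng):
    """Sample an orthogonal matrix (rotation or reflection) and a translation."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q, rng.normal(size=3)


def equivariance_error(f, points, rng, trials=5):
    """Largest deviation between f(g(x)) and g(f(x)) over random g in E(3)."""
    worst = 0.0
    for _ in range(trials):
        q, t = random_e3(rng)
        transformed_first = f(points @ q.T + t)
        transformed_last = f(points) @ q.T + t
        worst = max(worst, float(np.abs(transformed_first - transformed_last).max()))
    return worst


rng = np.random.default_rng(0)
cloud = rng.normal(size=(10, 3))
err_equivariant = equivariance_error(pull_to_centroid, cloud, rng)
err_broken = equivariance_error(axis_scaling, cloud, rng)
```

The first error should sit at floating-point round-off, while the second should be clearly nonzero. The same kind of check can be applied to any layer that maps 3D inputs to 3D outputs.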

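Key findings

GATr outperforms non-geometric baselines on n-body dynamics, wall-shear-stress estimation on arterial meshes, and robotic motion planning.
In several settings it is more data-efficient than competing equivariant baselines, and it scales to large problem sizes.
Combining the geometric algebra representation with E(3) equivariance yields robust generalization under domain shifts such as translations.
Distance-aware attention and auxiliary scalar channels improve performance without breaking equivariance.
GATr shows memory and compute scaling comparable to a standard Transformer, and outperforms some equivariant baselines on problems with large token counts.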
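
One generic way to make plain dot-product attention sensitive to Euclidean distance is to augment queries and keys so that their dot product equals the negative squared distance between two points. The sketch below shows that trick and is only an assumption about how such a mechanism could look: the review does not give the paper's construction, and the function names and the `scale` parameter are hypothetical.

```python
import numpy as np


def distance_features(points: np.ndarray):
    """Build queries and keys whose dot product is -||x_i - x_j||^2."""
    sq_norm = np.sum(points**2, axis=1, keepdims=True)
    ones = np.ones_like(sq_norm)
    queries = np.concatenate([sq_norm, ones, points], axis=1)
    keys = np.concatenate([-ones, -sq_norm, 2.0 * points], axis=1)
    return queries, keys


def distance_attention_weights(points: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Softmax attention weights that decay with squared distance."""
    queries, keys = distance_features(points)
    logits = scale * (queries @ keys.T)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 3))
weights = distance_attention_weights(pts)

# Apply a random rotation or reflection plus a translation to every point.
rot, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pts @ rot.T + rng.normal(size=3)
weights_moved = distance_attention_weights(moved)
```

Because the logits depend only on pairwise distances, `weights` and `weights_moved` agree up to round-off, which is consistent with the finding that distance-aware attention need not break equivariance.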

Figure 2: $n$-body dynamics experiments. We show the error in predicting future positions of planets as a function of the training dataset size. Out of five independent training runs, the mean and standard error are shown. Left: Evaluating without distribution shift. GATr is more sample efficient [...]


This review was generated by AI and checked by a human editor.