QUICK REVIEW

[Paper Review] Versor: A Geometric Sequence Architecture

Truong Minh Huy, Edward Hirst | arXiv (Cornell University) | 2026-02-10

Algebraic and Geometric Analysis | Citations: 0

One-Line Summary

Versor introduces a CGA-based sequence architecture that uses Geometric Product Attention and a Recursive Rotor Accumulator to achieve scale-generalizable, interpretable, and hardware-efficient sequence modeling, outperforming Transformers on several tasks.

ABSTRACT

A novel sequence architecture is introduced, Versor, which uses Conformal Geometric Algebra (CGA) in place of traditional linear operations to achieve structural generalization and significant performance improvements on a variety of tasks, while offering improved interpretability and efficiency. By embedding states in the $Cl_{4,1}$ manifold and evolving them via geometric transformations (rotors), Versor natively represents $SE(3)$-equivariant relationships without requiring explicit structural encoding. Versor is validated on chaotic N-body dynamics, topological reasoning, and standard multimodal benchmarks (CIFAR-10, WikiText-103), consistently outperforming Transformers, Graph Networks, and geometric baselines (GATr, EGNN). Key results include: orders-of-magnitude fewer parameters ($200\times$ vs. Transformers); interpretable attention decomposing into proximity and orientational components; zero-shot scale generalization (0.993 vs. 0.070 MCC for ViT); and featuring a Recursive Rotor Accumulator (RRA) for $O(L)$ linear temporal complexity in dynamical systems, and a Geometric Product Attention (GPA) mechanism for $O(L^{2})$ global relational modeling, allowing for task-specific architectural pruning or hybridization depending on the required scale. In out-of-distribution tests, Versor maintains stable predictions while Transformers fail catastrophically. Custom Clifford kernels achieve a cumulative over $100\times$ speedup via bit-masked contraction and specialized Matrix Isomorphism kernels, reducing per-step latency to 1.05 ms and outperforming highly-optimized Transformer baselines.

Motivation and Goals

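Overcome the "Euclidean Bottleneck" by building symmetry priors directly into sequence models.
Propose a CGA-based sequence architecture that operates in $Cl_{4,1}$ and models SE(3)-equivariant relationships.
Demonstrate better scale generalization, interpretability, and efficiency than standard Transformers and geometric baselines.
Showcase multimodal capability across chaotic dynamics, topology, vision, and language tasks.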

Proposed Method

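Introduces Geometric Product Attention (GPA), which decomposes attention into scalar (proximity) and bivector (orientation) components.
Develops a Recursive Rotor Accumulator (RRA) that evolves state on the Spin(4,1) manifold and achieves O(L) time complexity.
Enforces the manifold constraint with Manifold Normalization to prevent drift and keep long-horizon sequence dynamics stable.
Uses hardware-optimized Clifford kernels (bit-masked and matrix isomorphism) to accelerate Clifford product operations.
Provides a software layout (gacore) with dimension-adapted Clifford algebras and a path toward a proposed future GAPU hardware design.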
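To make the scalar/bivector split and the bit-masked product more concrete, here is a minimal pure-Python sketch. It is not the authors' kernel and not the gacore API: the function names (`reorder_sign`, `gp`, `grade_part`, `embed`), the basis ordering, and the dictionary storage are all assumptions made for illustration. It stores each basis blade of $Cl_{4,1}$ as a 5-bit mask, multiplies blades with an XOR plus a swap-count sign, and shows that the product of two conformally embedded points has only a scalar part (which equals minus half the squared distance, a proximity measure) and a bivector part.

```python
# Assumed basis order for this sketch: bits 0..2 = e1, e2, e3; bit 3 = e+; bit 4 = e-.
METRIC = (1.0, 1.0, 1.0, 1.0, -1.0)  # signature of Cl(4,1)

def reorder_sign(a, b):
    """Sign picked up when sorting the basis vectors of blade a followed by blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps & 1 else 1.0

def gp(x, y):
    """Geometric product of multivectors stored as {blade_bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            sign = reorder_sign(a, b)
            common, i = a & b, 0
            while common:  # each shared basis vector contracts to METRIC[i]
                if common & 1:
                    sign *= METRIC[i]
                common >>= 1
                i += 1
            out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def grade_part(mv, k):
    """Keep only the blades whose bitmask has exactly k bits set."""
    return {b: c for b, c in mv.items() if bin(b).count("1") == k}

def embed(p):
    """Conformal point X = x + 0.5|x|^2 e_inf + e_o,
    with e_inf = e- + e+ and e_o = (e- - e+) / 2."""
    n2 = sum(c * c for c in p)
    mv = {1 << i: float(c) for i, c in enumerate(p)}
    mv[1 << 3] = 0.5 * n2 - 0.5
    mv[1 << 4] = 0.5 * n2 + 0.5
    return mv

X = embed((1.0, 0.0, 0.0))
Y = embed((0.0, 2.0, 0.0))
XY = gp(X, Y)
scalar = grade_part(XY, 0).get(0, 0.0)   # proximity-like term
bivector = grade_part(XY, 2)             # orientation-like term
print(scalar)  # -2.5, i.e. -0.5 * squared distance (1 + 4)
```

How Versor turns these two parts into attention weights is not specified in this review; the sketch only illustrates why a geometric product of two vectors naturally separates into a grade-0 and a grade-2 component.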

Figure 1: The Versor Architecture. (Left) Geometric Product Attention (GPA). (Right) The Recursive Rotor Accumulator (RRA).
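The recurrence behind a rotor accumulator, and the role of normalization, can be sketched as follows. This is a simplified stand-in, not the paper's RRA: it uses unit quaternions (Spin(3)) instead of Spin(4,1) rotors, and the names `qmul`, `step_rotor`, `accumulate`, and `norm` are hypothetical. The point is only that one product per step gives O(L) time with a fixed-size state, and that renormalizing after each step keeps the state on the unit manifold when the per-step updates are slightly off-norm.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def norm(q):
    return math.sqrt(sum(c * c for c in q))

def step_rotor(angle, axis):
    """Unit rotor for a rotation by `angle` (radians) about a unit `axis`."""
    h = 0.5 * angle
    s = math.sin(h)
    return (math.cos(h), axis[0] * s, axis[1] * s, axis[2] * s)

def accumulate(deltas, renormalize=True):
    """Fold per-step rotors into one state: O(L) products, O(1) state size."""
    state = (1.0, 0.0, 0.0, 0.0)  # identity rotor
    for d in deltas:
        state = qmul(d, state)
        if renormalize:  # project back onto the unit manifold
            n = norm(state)
            state = tuple(c / n for c in state)
    return state

# 360 steps of 1 degree about z compose to a full turn (rotor -1).
deltas = [step_rotor(math.radians(1.0), (0.0, 0.0, 1.0))] * 360
full_turn = accumulate(deltas)

# Updates that are 0.1% too long mimic imperfect learned rotors.
noisy = [tuple(1.001 * c for c in d) for d in deltas]
drifted = accumulate(noisy, renormalize=False)
stable = accumulate(noisy, renormalize=True)
print(norm(drifted), norm(stable))
```

Without renormalization the state norm grows geometrically with sequence length, which is the kind of drift that Manifold Normalization is described as preventing.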

Experimental Results

Research Questions

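RQ1: Can Conformal Geometric Algebra enable SE(3)-equivariant sequence modeling without explicit structural encoding?
RQ2: Does a CGA-based architecture generalize across scale and density, maintaining performance on long sequences and in out-of-distribution settings?
RQ3: How do the scalar and bivector components of GPA relate to the learned proximity and orientation interactions in dynamical tasks?
RQ4: Can the Recursive Rotor Accumulator achieve linear-time recurrence while remaining numerically stable in chaotic systems?
RQ5: Which hardware and software optimizations are needed for Clifford-based sequence models to reach practical latency and parameter efficiency?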

Key Results

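Versor uses orders of magnitude fewer parameters (roughly 200× fewer than Transformers) while achieving competitive or better performance across a range of tasks.
Geometric Product Attention decomposes into proximity (scalar) and orientation (bivector) components, which yields interpretable interaction laws.
Versor achieves zero-shot scale generalization: on the topological connectivity task, for example, it reaches an MCC of 0.993 versus 0.070 for ViT.
The Recursive Rotor Accumulator provides O(L) inference with O(1) memory, enabling long-horizon dynamics over thousands of steps.
Custom Clifford kernels deliver a cumulative speedup of over 100× and reduce per-step latency to 1.05 ms, outperforming optimized Transformer baselines.
In out-of-distribution tests, Versor remains stable, whereas Transformer baselines can fail catastrophically.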
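For readers unfamiliar with the metric behind the 0.993 versus 0.070 comparison, the sketch below computes MCC from a binary confusion matrix, assuming that MCC here denotes the Matthews correlation coefficient. The function name `mcc` and the confusion counts are hypothetical illustrations, not data from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect).
    Returns 0.0 when any marginal is empty, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts chosen only to show the scale of the metric.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)      # every prediction correct
chance = mcc(tp=25, tn=25, fp=25, fn=25)     # no better than guessing
constant = mcc(tp=50, tn=0, fp=50, fn=0)     # always predicts "positive"
print(perfect, chance, constant)
```

On this scale, a value near 0.07 sits close to chance-level prediction, while a value near 0.99 is close to perfect agreement.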

Figure 2: Geometric Attention Decomposition: Separating Force from Torque. Points labeled B0–B4 represent the 5 gravitationally-interacting bodies; B0 is the focal body for this visualization. The axes ($x_{1}$, $x_{2}$) are the 2D physical coordinates of the simulation. Line weights are proport

This review was generated by AI and reviewed by a human editor.