QUICK REVIEW

[Paper Review] Identifiable Equivariant Networks are Layerwise Equivariant

Vahid Shahverdi, Giovanni Luca Marchetti | arXiv (Cornell University) | Jan 29, 2026

Topic: Advanced Graph Neural Networks | Citations: 0

One-Sentence Summary

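A network that is end-to-end equivariant and has identifiable parameters is necessarily layerwise equivariant, with the actions on its latent spaces induced by the end-to-end symmetry.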

ABSTRACT

We investigate the relation between end-to-end equivariance and layerwise equivariance in deep neural networks. We prove the following: For a network whose end-to-end function is equivariant with respect to group actions on the input and output spaces, there is a parameter choice yielding the same end-to-end function such that its layers are equivariant with respect to some group actions on the latent spaces. Our result assumes that the parameters of the model are identifiable in an appropriate sense. This identifiability property has been established in the literature for a large class of networks, to which our results apply immediately, while it is conjectural for others. The theory we develop is grounded in an abstract formalism, and is therefore architecture-agnostic. Overall, our results provide a mathematical explanation for the emergence of equivariant structures in the weights of neural networks during training -- a phenomenon that is consistently observed in practice.

Motivation and Objectives

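Motivate and formalize the relationship between end-to-end equivariance and layerwise equivariance in deep networks.
Introduce identifiability as the key assumption that allows symmetries to transfer from the input and output spaces to the latent layers.
Show that, under identifiability, an end-to-end equivariant network has layers that are equivariant with respect to suitable group actions on the latent spaces.
Provide an abstract, architecture-agnostic framework that applies to MLPs and attention mechanisms.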
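
The two notions being compared can be written out as follows. This is a minimal sketch in assumed notation (the symbols f_i, X_i, and the subscripted action dots are illustrative choices made here and may differ from the paper's own):

```latex
% Assumed notation: a depth-L network with latent spaces X_0, ..., X_L.
\begin{align*}
  f &= f_L \circ \cdots \circ f_1, \qquad f_i : X_{i-1} \to X_i, \\
  \text{end-to-end equivariance:}\quad
    & f(g \cdot_0 x) = g \cdot_L f(x)
    && \forall g \in G,\ x \in X_0, \\
  \text{layerwise equivariance:}\quad
    & f_i(g \cdot_{i-1} z) = g \cdot_i f_i(z)
    && \forall g \in G,\ z \in X_{i-1},\ i = 1, \dots, L.
\end{align*}
```

Layerwise equivariance implies end-to-end equivariance by composing the layers. The converse direction, up to a change of parameters that leaves f unchanged, is the one the paper establishes under the identifiability assumption.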

Proposed Method

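Define a deep model as a sequence of latent spaces, layer maps, and parameters, and formalize submodels.
Introduce group actions on the latent spaces, together with an adjunction property that links end-to-end and layerwise symmetries.
Define identifiability and weak identifiability, relating global functional equivalence to symmetry adjustments between layers.
Prove that when end-to-end G-equivariance and weak identifiability hold, each layer is G-equivariant with respect to actions on the latent spaces.
Apply the theory to MLPs and multi-head attention networks, giving concrete examples of the intertwiner groups of the latent spaces.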
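
The conclusion of the main result can be illustrated numerically on a toy case. The sketch below is not the paper's construction and does not test identifiability. It assumes a two-layer tanh MLP with circulant weights (so the cyclic-shift group acts on input and output), then reparameterizes it with a hidden-neuron permutation and checks that the layers remain equivariant for a conjugated latent action. All names (`circulant`, `shift_matrix`, `mlp`, `rho`) are hypothetical.

```python
import numpy as np

def circulant(c):
    """Circulant matrix whose first column is c (commutes with cyclic shifts)."""
    return np.stack([np.roll(c, k) for k in range(len(c))], axis=1)

def shift_matrix(n):
    """Permutation matrix of the cyclic shift x -> np.roll(x, 1)."""
    return np.roll(np.eye(n), 1, axis=0)

def mlp(x, W1, W2):
    """Toy two-layer network: linear, tanh, linear."""
    return W2 @ np.tanh(W1 @ x)

rng = np.random.default_rng(0)
n = 5
W1 = circulant(rng.normal(size=n))
W2 = circulant(rng.normal(size=n))
P = shift_matrix(n)          # group action on input and output (assumed)
x = rng.normal(size=n)

# End-to-end equivariance: f(P x) == P f(x).
end_to_end_ok = np.allclose(mlp(P @ x, W1, W2), P @ mlp(x, W1, W2))

# Reparameterize with a hidden-neuron permutation Q; the function is unchanged
# because tanh acts coordinate-wise and therefore commutes with permutations.
Q = np.eye(n)[rng.permutation(n)]
W1q, W2q = Q @ W1, W2 @ Q.T
same_function = np.allclose(mlp(x, W1q, W2q), mlp(x, W1, W2))

# The latent action for the new parameters is the conjugated shift.
rho = Q @ P @ Q.T
layer1_ok = np.allclose(W1q @ P, rho @ W1q)   # first layer intertwines P and rho
layer2_ok = np.allclose(W2q @ rho, P @ W2q)   # last layer intertwines rho and P

print(end_to_end_ok, same_function, layer1_ok, layer2_ok)
```

The point of the permutation step is that the latent action is not fixed in advance: parameters realizing the same function can carry different, but related, latent actions.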

Figure 1 : An image segmentation model is equivariant to rotations of the image. Our main result implies that the group action on the input propagates through the network via latent symmetries (e.g., neuron permutations), until it reaches the output.

Experimental Results

Research Questions

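RQ1: Given a network with end-to-end equivariance, can layerwise equivariance in the latent layers be guaranteed under identifiability?
RQ2: What identifiability conditions are needed for symmetries to propagate layer by layer in common architectures such as MLPs and attention networks?
RQ3: How does the adjunction property constrain the group actions at the first and last layers so that overall equivariance is guaranteed?
RQ4: How can the theory be extended to attention-based architectures whose symmetries are permutations of tokens or heads?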
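
The token-permutation symmetry mentioned in RQ4 can be checked directly. The sketch below assumes a plain single-head self-attention layer without positional encodings; it is a standard textbook form, not the paper's multi-head setup, and the names `softmax` and `self_attention` are hypothetical helpers defined here.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention on a (tokens, dim) array, no positional encoding."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
tokens, dim = 6, 4
X = rng.normal(size=(tokens, dim))
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))

# Random permutation matrix acting on the token axis.
P = np.eye(tokens)[rng.permutation(tokens)]

# Permuting the input tokens permutes the output tokens in the same way.
token_equivariant = np.allclose(
    self_attention(P @ X, Wq, Wk, Wv),
    P @ self_attention(X, Wq, Wk, Wv),
)
print(token_equivariant)
```

Here the same permutation acts on the input and the output of the layer, which is the simplest case of a latent action matching the end-to-end one.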

Key Findings

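Under weak identifiability and end-to-end G-equivariance, group actions on the latent spaces make every layer G-equivariant.
The latent actions are induced by group homomorphisms from G into the symmetries of each layer's latent space, which guarantees layerwise equivariance.
The adjunction property ties the input and output actions to actions on the parameters of the first and last layers, which lets the end-to-end symmetry propagate into the interior of the network.
The framework is architecture-agnostic: it applies to MLPs and attention-based networks, and it includes practical considerations on skip connections and on identifiability for ReLU-type activations.
Empirical figures on CIFAR-10 show that learned filters and attention heads are consistent with latent equivariance, in agreement with the theory.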


This review was generated by AI and checked by a human editor.