QUICK REVIEW

[Paper Review] Transformers learn factored representations

Adam Shai, Loren Amdahl-Culleton|arXiv (Cornell University)|Feb 2, 2026


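ONE-SENTENCE SUMMARY

This paper shows that transformers pretrained with next-token prediction spontaneously learn factored representations, organizing the factors in orthogonal subspaces. When the factors are conditionally independent, this yields an exponential reduction in dimensionality, and the inductive bias toward factoring persists even when it costs predictive fidelity.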

ABSTRACT

Transformers pretrained via next token prediction learn to factor their world into parts, representing these factors in orthogonal subspaces of the residual stream. We formalize two representational hypotheses: (1) a representation in the product space of all factors, whose dimension grows exponentially with the number of parts, or (2) a factored representation in orthogonal subspaces, whose dimension grows linearly. The factored representation is lossless when factors are conditionally independent, but sacrifices predictive fidelity otherwise, creating a tradeoff between dimensional efficiency and accuracy. We derive precise predictions about the geometric structure of activations for each, including the number of subspaces, their dimensionality, and the arrangement of context embeddings within them. We test between these hypotheses on transformers trained on synthetic processes with known latent structure. Models learn factored representations when factors are conditionally independent, and continue to favor them early in training even when noise or hidden dependencies undermine conditional independence, reflecting an inductive bias toward factoring at the cost of fidelity. This provides a principled explanation for why transformers decompose the world into parts, and suggests that interpretable low dimensional structure may persist even in models trained on complex data.

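MOTIVATION AND GOALS

Ask whether transformers learn to decompose the world into discrete parts.
Formalize two representational hypotheses (joint vs. factored) and derive geometric predictions for each.
Test, on synthetic data with known latent structure, whether transformers learn factored representations.
Assess whether factoring is an inductive bias of the architecture, and how it behaves when conditional independence is violated.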

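PROPOSED METHOD

Build a theoretical framework, based on generalized hidden Markov models (GHMMs), that links latent data structure to activation geometry.
Define prediction vectors and analyze their geometry under the joint and the factored representation.
Construct synthetic data with five latent factors (three 3-state HMMs and two 3D GHMMs) and train a GPT-2-style transformer on it with next-token prediction.
Use PCA and variation-subspace analysis to identify factor-specific subspaces and test their orthogonality.
Introduce controlled violations of conditional independence by adding a noisy channel, and observe the effect on the representations.
Repeat the experiments on RNNs/LSTMs to assess generality.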
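The data construction above can be illustrated with a minimal sketch. It is not the authors' code: it assumes all five factors are ordinary 3-state HMMs with random parameters (the paper's two GHMM factors are not reproduced), and that each observed token packs one symbol per factor into a single id. The helper names `random_hmm`, `sample_factor`, `sample_joint_tokens`, and `belief_update` are hypothetical. The `belief_update` function is standard HMM filtering, used here only as a stand-in for the per-factor prediction vectors.

```python
import numpy as np

def random_hmm(n_states, n_symbols, rng):
    """Hypothetical helper: random row-stochastic transition and emission matrices."""
    T = rng.dirichlet(np.ones(n_states), size=n_states)   # T[i, j] = P(next state j | state i)
    E = rng.dirichlet(np.ones(n_symbols), size=n_states)  # E[i, k] = P(symbol k | state i)
    return T, E

def sample_factor(T, E, length, rng):
    """Sample one symbol sequence from a single HMM factor."""
    n_states, n_symbols = E.shape
    state = rng.integers(n_states)
    symbols = np.empty(length, dtype=np.int64)
    for t in range(length):
        symbols[t] = rng.choice(n_symbols, p=E[state])  # emit from the current state
        state = rng.choice(n_states, p=T[state])        # then transition
    return symbols

def sample_joint_tokens(factors, length, rng):
    """Sample every factor independently and pack the symbols into one token id."""
    sizes = tuple(E.shape[1] for _, E in factors)
    per_factor = [sample_factor(T, E, length, rng) for T, E in factors]
    tokens = np.ravel_multi_index(tuple(per_factor), sizes)
    return tokens, per_factor, sizes

def belief_update(belief, symbol, T, E):
    """Bayesian filtering step: condition on the emitted symbol, then propagate."""
    posterior = belief * E[:, symbol]
    posterior = posterior / posterior.sum()
    return posterior @ T

# Assumed setup: five independent 3-state factors, 3 symbols each (vocabulary 3**5).
rng = np.random.default_rng(0)
factors = [random_hmm(3, 3, rng) for _ in range(5)]
tokens, per_factor, sizes = sample_joint_tokens(factors, length=64, rng=rng)
```

Because the factors are sampled independently, the belief over each factor can be tracked separately with `belief_update`, which is the situation in which a factored representation loses no predictive information.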

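EXPERIMENTAL RESULTS

Research Questions

RQ1: Do transformers represent predictive information in the joint tensor-product space, or in a structure of orthogonal factored subspaces?
RQ2: Under what data-generating conditions do transformers tend to learn factored representations?
RQ3: Is factoring an inductive bias of the architecture, even when it reduces predictive fidelity?
RQ4: How do the representations change when conditional independence between factors is weakened or broken?
RQ5: Do RNN-based architectures show similar factored representations?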

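Key Findings

When the data-generating process factors into conditionally independent parts, transformers learn factored representations and achieve an exponential reduction in dimensionality: the joint space needs (∏ d_n) − 1 dimensions, whereas the factored representation needs ∑ (d_n − 1).
Activations are organized into N orthogonal subspaces, one per factor, and each factor's prediction vectors lie within its own subspace.
Factored representations appear early in training, including at the embedding layer, and persist even when conditional independence does not hold exactly.
When noise makes the factors not fully independent, the model still prefers the factored solution: it first stays in the low-dimensional factored subspaces, then expands into additional dimensions to recover fidelity.
RNNs/LSTMs also tend toward factored representations, suggesting that this is a broader architectural phenomenon.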
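The dimension counts and the orthogonality claim above can be made concrete with a short sketch. It assumes d_n = 3 for each of the five factors, which is a reading of the "3-state" and "3D" setup rather than a figure quoted from the paper. The function names are hypothetical. The overlap measure, the largest cosine of a principal angle between two subspaces, is one common way to check orthogonality and is not necessarily the test the authors used.

```python
import math
import numpy as np

def joint_dim(dims):
    """Dimensions needed by the joint (product-space) representation."""
    return math.prod(dims) - 1

def factored_dim(dims):
    """Dimensions needed by the factored representation."""
    return sum(d - 1 for d in dims)

def max_subspace_overlap(A, B):
    """Largest cosine of a principal angle between the column spans of A and B.

    0 means the two subspaces are orthogonal; 1 means they share a direction.
    Assumes A and B have full column rank.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    return float(np.linalg.svd(Qa.T @ Qb, compute_uv=False).max())

dims = [3] * 5               # assumed: five factors with d_n = 3 each
print(joint_dim(dims))       # 3**5 - 1 = 242
print(factored_dim(dims))    # 5 * (3 - 1) = 10
```

In practice `A` and `B` would hold basis vectors estimated from activations (for example, the top principal components obtained while varying a single factor), and an overlap near zero would be consistent with the factored hypothesis.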


This review was generated by AI and reviewed by human editors.