QUICK REVIEW

[Paper Review] Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models

Bo Wang, Junzhuo Li|arXiv (Cornell University)|Jan 13, 2026

Generative Adversarial Networks and Image Synthesis | Citations: 0

One-line summary

The paper introduces Gated-LPI, a neuron-level knowledge attribution metric, and uses it for a time-resolved comparison of MoE and dense Transformers during pre-training, revealing that the MoE model develops a stable, distributed backbone that is robust to ablation.

ABSTRACT

Mixture-of-Experts (MoE) architectures decouple model capacity from per-token computation, enabling scaling beyond the computational limits imposed by dense scaling laws. Yet how MoE architectures shape knowledge acquisition during pre-training, and how this process differs from dense architectures, remains unknown. To address this issue, we introduce Gated-LPI (Log-Probability Increase), a neuron-level attribution metric that decomposes log-probability increase across neurons. We present a time-resolved comparison of knowledge acquisition dynamics in MoE and dense architectures, tracking checkpoints over 1.2M training steps (~ 5.0T tokens) and 600K training steps (~ 2.5T tokens), respectively. Our experiments uncover three patterns: (1) Low-entropy backbone. The top approximately 1% of MoE neurons capture over 45% of positive updates, forming a high-utility core, which is absent in the dense baseline. (2) Early consolidation. The MoE model locks into a stable importance profile within < 100K steps, whereas the dense model remains volatile throughout training. (3) Functional robustness. Masking the ten most important MoE attention heads reduces relational HIT@10 by < 10%, compared with > 50% for the dense model, showing that sparsity fosters distributed -- rather than brittle -- knowledge storage. These patterns collectively demonstrate that sparsity fosters an intrinsically stable and distributed computational backbone from early in training, helping bridge the gap between sparse architectures and training-time interpretability.

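Motivation and objectives

Understand how architectural differences shape knowledge acquisition during pre-training in MoE and dense models.
Extend a neuron-level attribution method (Gated-LPI) to MoE architectures.
Characterize the dynamics of knowledge acquisition across FFN and attention layers over training steps.
Evaluate the functional robustness of MoE and dense models to ablation of targeted components.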

Proposed method

Extend Log-Probability Increase (LPI) to MoE by defining Expert Neurons and Attention Neurons and computing their output contributions (Equation 3 and Equation 4).
Define the importance I(v) of a neuron v as the increase in the log-probability of the target token when v's output is added (Equation 5).
Track pre-training checkpoints of OLMoE-1B-7B (MoE) and OLMo-7B (dense) over 1.2M and 600K steps, respectively, and measure neuron-level and layer-level stability and robustness.
Metrics for stability and distribution: Top-1% Set Stability (J_stab), Positive-Gain Concentration (R_t), Layer-Distribution Consistency (ρ_avg), and Cross-step Coefficient of Variation (σ_rel).
Ablate the Top-1 head, the Top-10 heads, and the Top-1% neurons, and observe the change in HIT@10.
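As a rough illustration of the importance definition above, the following sketch computes a log-probability increase for a single neuron in a toy model. It is a minimal sketch under stated assumptions, not the paper's Equations 3 to 5, which this review does not reproduce: the names `gated_lpi`, `unembed`, and `gate`, the scalar router gate that scales an expert neuron's output, and the direct projection of the hidden state to logits are all hypothetical simplifications.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def unembed(hidden, unembedding):
    """Project a hidden vector to vocabulary logits (hypothetical toy unembedding)."""
    vocab_size = len(unembedding[0])
    return [
        sum(h * row[j] for h, row in zip(hidden, unembedding))
        for j in range(vocab_size)
    ]


def gated_lpi(hidden, neuron_out, gate, unembedding, target):
    """Increase in log p(target) when the gate-scaled neuron output is added.

    Assumption: the router gate acts as a scalar weight on the neuron's output.
    """
    base = log_softmax(unembed(hidden, unembedding))[target]
    shifted = [h + gate * o for h, o in zip(hidden, neuron_out)]
    boosted = log_softmax(unembed(shifted, unembedding))[target]
    return boosted - base


# Toy setup: 2 hidden dimensions, 3 vocabulary items, target token 0.
W_U = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]
h = [0.0, 0.0]
out_v = [2.0, 0.0]  # this neuron writes along the direction of token 0

for g in (0.0, 0.5, 1.0):
    print(g, gated_lpi(h, out_v, g, W_U, target=0))
```

In this toy setup a neuron whose expert is not routed to (gate 0) receives zero importance, and importance grows with the gate value, which is the intuition behind gating the attribution.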

Figure 1: Top-1% FFN and ATTN neurons Jaccard overlap between consecutive checkpoints.
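The quantity plotted in Figure 1 can be sketched as a Jaccard overlap between the top-1% neuron sets of consecutive checkpoints. The sketch below assumes each checkpoint is given as a dictionary from a neuron identifier to its importance score; the function names and this data layout are illustrative and not taken from the paper.

```python
def top_fraction(scores, fraction=0.01):
    """Return the ids of the highest-scoring `fraction` of neurons (at least one)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard overlap |a & b| / |a | b| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def consecutive_stability(checkpoints, fraction=0.01):
    """Jaccard overlap of top-fraction sets for each pair of adjacent checkpoints."""
    tops = [top_fraction(scores, fraction) for scores in checkpoints]
    return [jaccard(x, y) for x, y in zip(tops, tops[1:])]


# Toy data: 200 neurons, so the top 1% is a set of 2 neurons.
ckpt_a = {i: float(i) for i in range(200)}   # top set {199, 198}
ckpt_b = dict(ckpt_a)                        # unchanged ranking
ckpt_c = dict(ckpt_a)
ckpt_c[0] = 1000.0                           # neuron 0 enters the top set

overlaps = consecutive_stability([ckpt_a, ckpt_b, ckpt_c])
print(overlaps)
```

An overlap near 1 between adjacent checkpoints corresponds to the stable importance profile the review attributes to the MoE model, while low or fluctuating overlap corresponds to churn.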

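Experimental results

Research questions

RQ1: Do MoE architectures exhibit different knowledge attribution dynamics from dense models during pre-training?
RQ2: Can a gated attribution approach reveal a stable, distributed knowledge core in MoE that differs from the dense baseline?
RQ3: How do FFN and attention components contribute to knowledge acquisition over time in MoE and dense Transformers?
RQ4: Is the knowledge learned by MoE more robust than that of dense models to ablation of targeted neurons and heads?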

Key findings

Ablation impact
Model	Top-1 head	Top-10 heads	Top-1% FFN neurons
OLMoE	0.06%	9.44%	35.47%
OLMo	16.46%	50.43%	96.19%
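To make the ablation numbers concrete, here is a minimal sketch of how a HIT@10 score and its drop after ablation could be computed. It assumes the percentages in the table are relative drops with respect to the unablated model, which this review does not state explicitly; the function names and the toy predictions are hypothetical.

```python
def hit_at_k(ranked_predictions, gold, k=10):
    """True if the gold answer appears among the top-k ranked predictions."""
    return gold in ranked_predictions[:k]


def hit_rate(all_predictions, golds, k=10):
    """Fraction of queries whose gold answer is within the top-k predictions."""
    hits = sum(hit_at_k(p, g, k) for p, g in zip(all_predictions, golds))
    return hits / len(golds)


def relative_drop(before, after):
    """Relative decrease in percent (assumed reading of 'ablation impact')."""
    return 100.0 * (before - after) / before


# Toy example with k=2 for brevity: two relational queries.
golds = ["Paris", "Tokyo"]
preds_full = [["Paris", "Lyon"], ["Osaka", "Tokyo"]]     # both gold answers in the top 2
preds_ablated = [["Lyon", "Paris"], ["Osaka", "Kyoto"]]  # one gold answer lost

full = hit_rate(preds_full, golds, k=2)
ablated = hit_rate(preds_ablated, golds, k=2)
print(full, ablated, relative_drop(full, ablated))
```

Under this reading, a smaller relative drop for the same ablation budget indicates that the removed components were less individually critical.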

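MoE develops a low-entropy backbone: roughly the top 1% of MoE neurons capture more than 45% of positive updates and form a core that is absent in the dense baseline.
MoE consolidates early: the importance profiles of its FFN and ATTN layers stabilize within fewer than 100K steps, whereas the dense model remains unstable.
Ablating the MoE model's Top-1% FFN neurons lowers HIT@10 by about 35%, and ablating its Top-10 heads lowers it by about 9-10%, whereas the same ablations lower the dense model's HIT@10 by about 96% and about 50%, respectively, suggesting that knowledge is stored in a distributed way in MoE.
MoE Attention Neurons remain stable, whereas the dense attention layers show continual reorganization and churn.
Layer-level metrics show that MoE FFN importance stabilizes rapidly (ρ_avg ≈ 0.97; σ_rel ≈ 0.37), whereas dense FFN importance is less stable (ρ_avg ≈ 0.54; σ_rel ≈ 5.01).
The functional robustness analysis supports the view that MoE's stable, distributed core is robust to ablation, in contrast to the brittle, concentrated knowledge of the dense model.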
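The concentration finding above (the top 1% of neurons capturing a large share of positive updates) can be illustrated with a small sketch of a positive-gain concentration score. The exact definition of R_t is not given in this review, so the version below is an assumption: the share of total positive importance that falls on the top `fraction` of neurons ranked by importance.

```python
def positive_gain_concentration(importances, fraction=0.01):
    """Share of total positive importance held by the top `fraction` of neurons.

    Assumed definition: neurons are ranked by importance, the top set contains
    max(1, int(N * fraction)) neurons, and negative scores contribute nothing.
    """
    total_positive = sum(x for x in importances if x > 0)
    if total_positive == 0:
        return 0.0
    k = max(1, int(len(importances) * fraction))
    top = sorted(importances, reverse=True)[:k]
    return sum(x for x in top if x > 0) / total_positive


# Toy data: 100 neurons, one dominant neuron, 50 weak positives, 49 negatives.
scores = [50.0] + [1.0] * 50 + [-1.0] * 49
print(positive_gain_concentration(scores))  # share held by the single top neuron
```

A value above 0.45 for a 1% top set would match the kind of high-utility core reported for the MoE model, while a value close to the fraction itself would indicate importance spread evenly across neurons.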

Figure 2: Mean FFN and ATTN importance scores across all layers over training steps. OLMoE shows smoother and earlier stabilization compared to OLMo.
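The layer-level stability metrics ρ_avg and σ_rel can be sketched in a similar spirit. Their exact definitions are not reproduced in this review, so the code below rests on two assumptions: ρ_avg is taken as the mean Pearson correlation between the per-layer importance profiles of consecutive checkpoints, and σ_rel as the standard deviation of an importance series across steps divided by the magnitude of its mean. All function names are illustrative.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def layer_consistency(layer_profiles):
    """Mean correlation between layer profiles of adjacent checkpoints (assumed rho_avg)."""
    pairs = list(zip(layer_profiles, layer_profiles[1:]))
    return statistics.fmean([pearson(a, b) for a, b in pairs])


def cross_step_cv(series):
    """Coefficient of variation across training steps (assumed sigma_rel)."""
    mean = statistics.fmean(series)
    return statistics.pstdev(series) / abs(mean)


# Toy data: importance per layer (3 layers) at 3 checkpoints.
profiles = [[1.0, 2.0, 3.0],
            [2.0, 4.0, 6.0],   # same shape as the first profile: correlation +1
            [3.0, 2.0, 1.0]]   # reversed shape: correlation -1
print(layer_consistency(profiles))
print(cross_step_cv([1.0, 3.0]))
```

Under these assumed definitions, a ρ_avg close to 1 with a small σ_rel describes a layer profile that keeps its shape and scale across checkpoints, and a low ρ_avg with a large σ_rel describes one that keeps shifting.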


This review was generated by AI and checked by a human editor.