QUICK REVIEW

[Paper Review] Pruning-induced phases in fully-connected neural networks: the eumentia, the dementia, and the amentia

Haining Pan, Nakul Aggarwal|arXiv (Cornell University)|Mar 12, 2026

Quantum many-body systems | Citations: 0

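One-sentence summary

This paper maps the phase diagram of fully-connected neural networks under independently varied training and evaluation dropout. It identifies three phases (eumentia, dementia, amentia) and presents evidence of a BKT-like transition between eumentia and dementia, driven by pruning-like dropout.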

ABSTRACT

Modern neural networks are heavily overparameterized, and pruning, which removes redundant neurons or connections, has emerged as a key approach to compressing them without sacrificing performance. However, while practical pruning methods are well developed, whether pruning induces sharp phase transitions in the neural networks and, if so, to what universality class they belong, remain open questions. To address this, we study fully-connected neural networks trained on MNIST, independently varying the dropout (i.e., removing neurons) rate at both the training and evaluation stages to map the phase diagram. We identify three distinct phases: eumentia (the network learns), dementia (the network has forgotten), and amentia (the network cannot learn), sharply distinguished by the power-law scaling of the cross-entropy loss with the training dataset size. In the eumentia phase, the algebraic decay of the loss, as documented in the machine learning literature as neural scaling laws, is from the perspective of statistical mechanics the hallmark of quasi-long-range order. We demonstrate that the transition between the eumentia and dementia phases is accompanied by scale invariance, with a diverging length scale that exhibits hallmarks of a Berezinskii-Kosterlitz-Thouless-like transition; the phase structure is robust across different network widths and depths. Our results establish that dropout-induced pruning provides a concrete setting in which neural network behavior can be understood through the lens of statistical mechanics.

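Motivation and objectives

Advance a physics-inspired understanding of pruning in overparameterized neural networks.
Map the phase diagram by independently varying the training dropout rate and the evaluation dropout rate.
Characterize the phases by how the cross-entropy loss scales with the training dataset size.
Investigate whether the eumentia-to-dementia transition exhibits scale invariance and BKT-like behavior.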

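Proposed method

Train fully-connected neural networks of varying width and depth on MNIST.
Apply neuron dropout independently at training time and at evaluation time, with rates p_train and p_eval.
Measure the test cross-entropy loss L_CE and the accuracy A on a fixed test set over multiple runs.
Analyze the power law of L_CE as a function of the dataset size N: L_CE ~ N^{alpha_CE}.
Perform finite-size scaling, based on a generalized scaling form, to test for a BKT-like transition.
For robustness, compare against conventional second-order scaling.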
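
The two basic ingredients of the method, a dropout mask applied at a chosen rate and a power-law fit of the loss against dataset size, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `dropout` and `fit_power_law_exponent`, the inverted-dropout rescaling by 1/(1 - p), and the synthetic loss values are all assumptions introduced here. The same `dropout` function would be called with p_train during training and with p_eval during evaluation.

```python
import math
import random


def dropout(activations, p, rng, rescale=True):
    """Zero each activation independently with probability p.

    Assumption: surviving activations are rescaled by 1/(1 - p)
    (inverted dropout). The review does not say whether the paper
    rescales, so pass rescale=False to disable it.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 1.0:
        # Every neuron is removed; avoid dividing by zero in the rescale.
        return [0.0 for _ in activations]
    scale = 1.0 / (1.0 - p) if rescale else 1.0
    return [0.0 if rng.random() < p else a * scale for a in activations]


def fit_power_law_exponent(sizes, losses):
    """Estimate alpha in loss ~ N**alpha.

    Uses an ordinary least-squares line through (log N, log loss);
    the slope of that line is the exponent.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(loss) for loss in losses]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, noise-free losses generated with exponent -0.3.
# These numbers are illustrative only and are not results from the paper.
sizes = [1000, 2000, 4000, 8000, 16000]
losses = [2.0 * n ** -0.3 for n in sizes]
alpha = fit_power_law_exponent(sizes, losses)

# Example mask: the evaluation-time rate is chosen independently
# of whatever rate was used during training.
rng = random.Random(0)
hidden = [1.0] * 8
masked_eval = dropout(hidden, p=0.25, rng=rng)
```

With real measurements, the losses would be averaged over several runs at each N before fitting, and the fit would be repeated at every (p_train, p_eval) grid point to build the phase diagram.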

Figure 1: (a) Schematic of the fully-connected neural network. A $28\times 28$ gray scale image from MNIST dataset is first flattened into a $784$-dimensional input vector, and then passed through a rectangular architecture of depth $L$ hidden layers and width $n_{h}$, where each hidden layer cont

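Experimental results

Research questions

RQ1 Does pruning-like dropout produce sharp phase transitions in FCNNs?
RQ2 Which distinct phases emerge under dropout, judged by how the cross-entropy scales with dataset size?
RQ3 Is the eumentia-to-dementia transition BKT-like and scale-invariant?
RQ4 Are the phase boundaries robust to changes in network width and depth?
RQ5 Can the dropout-induced phases be understood within a statistical-mechanics framework?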

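Key findings

Three distinct phases are identified: eumentia (alpha_CE < 0), dementia (alpha_CE > 0), and amentia (alpha_CE ≈ 0).
The phase boundaries are sharp and can be tracked by the dataset-size scaling exponent alpha_CE.
The eumentia–dementia transition shows evidence of scale invariance, consistent with a BKT-like transition (sigma ≈ 0.8–1.0) accompanied by a diverging length scale.
The phase structure is robust across different network widths and depths (n_h and L).
In the eumentia phase, the cross-entropy decays algebraically with dataset size, which is interpreted as quasi-long-range order. In the dementia phase, the loss under evaluation dropout worsens as the amount of data grows.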
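
The exponent sigma quoted in the findings is easiest to read against the textbook scaling forms for a diverging length scale. The following is a hedged sketch only: this review does not reproduce the paper's generalized scaling form or its choice of control parameter, so the symbols $\xi$ (length scale), $p$ (control parameter, for example a dropout rate), $p_c$ (critical value), $b$ (non-universal constant), and $\nu$ are assumptions introduced here.

```latex
\xi_{\text{BKT-like}} \sim \exp\!\left( \frac{b}{|p - p_c|^{\sigma}} \right),
\qquad
\xi_{\text{second-order}} \sim |p - p_c|^{-\nu}
```

In the standard Kosterlitz-Thouless theory the stretched-exponential form holds with $\sigma = 1/2$. If sigma plays this role in the paper, a fitted range of 0.8–1.0 would explain why the transition is described as BKT-like and why it is compared against the power-law divergence of a conventional second-order transition.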

Figure 2: (a) Phase diagram as a function of the training dropout rate $p_{\text{train}}$ and evaluation dropout rate $p_{\text{eval}}$. The three phases eumentia, dementia, and amentia are characterized by the exponent of power-law decay in the cross-entropy loss as shown in panel (c); (b) Average

This review was generated by AI and checked by a human editor.