QUICK REVIEW

[Paper Review] Smoothness Adaptivity in Constant-Depth Neural Networks: Optimal Rates via Smooth Activations

Yuhao Liu, Zilin Wang|arXiv (Cornell University)|Feb 23, 2026

Stochastic Gradient Optimization Techniques | Cited by: 0

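One-line summary

This paper proves that constant-depth neural networks with smooth activations can achieve minimax-optimal approximation and estimation rates over Sobolev spaces, whereas the smoothness adaptivity of constant-depth ReLU networks is limited by their depth.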

ABSTRACT

Smooth activation functions are ubiquitous in modern deep learning, yet their theoretical advantages over non-smooth counterparts remain poorly understood. In this work, we study both approximation and statistical properties of neural networks with smooth activations for learning functions in the Sobolev space $W^{s,\infty}([0,1]^d)$ with $s>0$. We prove that constant-depth networks equipped with smooth activations achieve smoothness adaptivity: increasing width alone suffices to attain the minimax-optimal approximation and estimation error rates (up to logarithmic factors). In contrast, for non-smooth activations such as ReLU, smoothness adaptivity is fundamentally limited by depth: the attainable approximation order is bounded by depth, and higher-order smoothness requires proportional depth growth. These results identify activation smoothness as a fundamental mechanism, complementary to depth, for achieving optimal rates over Sobolev function classes. Technically, our analysis is based on a multi-scale approximation framework that yields explicit neural network approximators with controlled parameter norms and model size. This complexity control ensures statistical learnability under empirical risk minimization (ERM) and avoids the impractical $\ell^0$-sparsity constraints commonly required in prior analyses.

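Motivation and objectives

Investigate how activation smoothness affects the approximation performance of neural networks on Sobolev targets.
Show that constant-depth networks with smooth activations achieve minimax-optimal rates without increasing depth.
Provide a constructive network approximation scheme with explicit control of complexity and parameter norms.
Contrast smooth activations with non-smooth ones (ReLU) and expose the depth bottleneck in adaptivity.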

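Proposed method

Develop a multi-scale approximation framework for piecewise constant functions and use it to construct neural approximators.
Prove L2 and L∞ approximation results at constant depth with controlled width and parameter norms.
Establish a weighted superposition principle that extends local approximations to a global L∞ bound.
Derive generalization guarantees for regularized empirical risk minimization (ERM), showing minimax-optimal rates with smooth activations and no sparsity constraints.
Provide a depth-bottleneck lower bound for constant-depth ReLU networks, showing an inherent limitation.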

Figure 1: Generalization error versus sample size for two-layer networks trained with different activation functions. Markers denote the measured generalization errors at each sample size (averaged over 5 runs), and solid lines show least-squares fits of the form $E(n)\propto n^{-\alpha}$. The fit
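A fit of the form $E(n)\propto n^{-\alpha}$, as described in the Figure 1 caption, is commonly obtained by ordinary least squares on log-transformed data. The sketch below shows one way to do that. The function name `fit_power_law` and the data are hypothetical and synthetic; they are not the paper's code or measurements, and the paper may fit its curves differently.

```python
import numpy as np

def fit_power_law(n, err):
    """Least-squares fit of err ~ C * n**(-alpha) in log-log space.

    Returns (alpha, C). Hypothetical helper, not taken from the paper.
    """
    log_n = np.log(np.asarray(n, dtype=float))
    log_e = np.log(np.asarray(err, dtype=float))
    # log(err) = log(C) - alpha * log(n), so the slope of the line is -alpha.
    slope, intercept = np.polyfit(log_n, log_e, deg=1)
    return -slope, np.exp(intercept)

# Synthetic, noise-free data for illustration only.
n = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
err = 3.0 * n ** -0.5

alpha, C = fit_power_law(n, err)
print(f"alpha = {alpha:.3f}, C = {C:.3f}")
```

A larger fitted `alpha` means the error decays faster with sample size, which is the quantity the figure compares across activation functions.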

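Experimental results

Research questions

RQ1: Can constant-depth neural networks with smooth activations adapt to arbitrarily high smoothness on [0,1]^d?
RQ2: Do such networks achieve minimax-optimal estimation rates under ERM without sparsity constraints?
RQ3: How do non-smooth activations such as ReLU compare in terms of depth requirements and adaptivity to smoothness?
RQ4: What complexity control (width and norms) suffices to guarantee optimal approximation and learning?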

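Key findings

Constant-depth networks with smooth activations achieve the optimal O(N^{-s/d}) approximation rate for f ∈ W^{s,∞}([0,1]^d) with depth L=6 and polynomially bounded norms.
ERM over these networks achieves the minimax-optimal O(n^{-2s/(2s+d)}) estimation rate, up to logarithmic factors.
A depth bottleneck is proved for constant-depth ReLU networks: the approximation rate saturates at N^{-min{L-1, s}}, so higher smoothness requires deeper networks.
Experiments show that, at fixed depth, smooth activations yield faster generalization when learning smooth targets.
The results indicate that, over Sobolev spaces, activation smoothness can serve as an alternative to depth.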
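To make the stated rates concrete, the following sketch evaluates the exponents s/d and 2s/(2s+d) for a few smoothness levels. It assumes that N is a measure of model size, which this review does not define, and it ignores constants and logarithmic factors. The function names are hypothetical.

```python
def approximation_exponent(s, d):
    """Exponent a in the approximation rate N**(-a), with a = s/d."""
    return s / d

def estimation_exponent(s, d):
    """Exponent b in the estimation rate n**(-b), with b = 2s/(2s+d)."""
    return 2 * s / (2 * s + d)

def size_for_accuracy(eps, s, d):
    """Size N solving N**(-s/d) = eps (constants and log factors ignored)."""
    return eps ** (-d / s)

d = 4  # input dimension, chosen only for illustration
for s in (1, 2, 4, 8):
    print(
        f"s={s}: approx exponent={approximation_exponent(s, d):.3f}, "
        f"estimation exponent={estimation_exponent(s, d):.3f}, "
        f"N for eps=0.1 ~ {size_for_accuracy(0.1, s, d):.1f}"
    )
```

Both exponents grow with s at fixed d, so a network class that adapts to smoothness benefits directly from smoother targets. Under these simplifications, the size needed for a fixed accuracy falls quickly as s increases.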

Figure 2: Illustration of the approximator construction for $f^{\star}$ in Theorem B.19 with $d=1$ and $K=2$. (a) Approximate $f^{\star}$ by piecewise polynomials, realized as the product of global polynomials and piecewise constant functions. (b) The $4$-piece piecewise constant function on refi
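One way to read panel (a) is that a piecewise polynomial on [0,1] can be written as a sum of global monomials x^j, each multiplied by a piecewise constant coefficient function c_j(x). The sketch below illustrates that decomposition numerically for d=1. It is an assumption-laden illustration, not the paper's construction: the per-piece least-squares fit, the uniform partition, the test function, and the names `fit_pieces` and `evaluate` are all hypothetical choices.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_pieces(f, K, degree, pts_per_piece=50):
    """Fit one polynomial per piece of a uniform K-piece partition of [0, 1].

    Returns coeffs[k, j], the coefficient of x**j on piece k.
    """
    coeffs = np.empty((K, degree + 1))
    for k in range(K):
        xs = np.linspace(k / K, (k + 1) / K, pts_per_piece)
        coeffs[k] = P.polyfit(xs, f(xs), degree)  # low-to-high order
    return coeffs

def evaluate(x, coeffs):
    """Evaluate sum_j c_j(x) * x**j.

    c_j(x) = coeffs[piece(x), j] is piecewise constant in x, and x**j is a
    global monomial, so each term is (piecewise constant) * (global polynomial).
    """
    K = coeffs.shape[0]
    piece = np.minimum((x * K).astype(int), K - 1)  # index of the piece containing x
    return sum(coeffs[piece, j] * x ** j for j in range(coeffs.shape[1]))

def target(x):
    # Smooth test function, chosen only for illustration.
    return np.sin(2 * np.pi * x)

x = np.linspace(0.0, 1.0, 2001)
errors = {
    K: float(np.max(np.abs(target(x) - evaluate(x, fit_pieces(target, K, degree=2)))))
    for K in (2, 4, 8, 16)
}
for K, e in errors.items():
    print(f"K={K:2d}  max error={e:.2e}")
```

Refining the partition should reduce the maximum error for a smooth target, which is the behavior a multi-scale construction relies on.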


This review was generated by AI and checked by a human editor.