QUICK REVIEW

[Paper Review] A Universal Approximation Theorem of Deep Neural Networks for Expressing Probability Distributions

Yulong Lu, Jianfeng Lu|arXiv (Cornell University)|Apr 19, 2020

Generative Adversarial Networks and Image Synthesis | References: 56 | Citations: 77

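One-Line Summary

This paper shows that the gradient of a deep ReLU neural network can realize a push-forward that brings a source distribution arbitrarily close to a target distribution under the 1-Wasserstein distance, MMD, or kernelized Stein discrepancy (KSD), and it gives explicit bounds on the network size. The bounds depend on which discrepancy is chosen.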

ABSTRACT

This paper studies the universal approximation property of deep neural networks for representing probability distributions. Given a target distribution $\pi$ and a source distribution $p_z$ both defined on $\mathbb{R}^d$, we prove under some assumptions that there exists a deep neural network $g:\mathbb{R}^d\rightarrow \mathbb{R}$ with ReLU activation such that the push-forward measure $(\nabla g)_\# p_z$ of $p_z$ under the map $\nabla g$ is arbitrarily close to the target measure $\pi$. The closeness are measured by three classes of integral probability metrics between probability distributions: $1$-Wasserstein distance, maximum mean distance (MMD) and kernelized Stein discrepancy (KSD). We prove upper bounds for the size (width and depth) of the deep neural network in terms of the dimension $d$ and the approximation error $\varepsilon$ with respect to the three discrepancies. In particular, the size of neural network can grow exponentially in $d$ when $1$-Wasserstein distance is used as the discrepancy, whereas for both MMD and KSD the size of neural network only depends on $d$ at most polynomially. Our proof relies on convergence estimates of empirical measures under aforementioned discrepancies and semi-discrete optimal transport.
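
For readers less familiar with the notation in the abstract, the two central objects can be written out as follows. These are the standard textbook definitions, not formulas quoted from the paper; the generic map $T$ and function class $\mathcal{F}$ are symbols introduced here for illustration.

$$
(T_\# p)(A) = p\big(T^{-1}(A)\big) \quad \text{for measurable } A, \qquad
d_{\mathcal{F}}(p, \pi) = \sup_{f \in \mathcal{F}} \Big| \mathbb{E}_{X \sim p}\, f(X) - \mathbb{E}_{Y \sim \pi}\, f(Y) \Big|.
$$

Taking $\mathcal{F}$ to be the 1-Lipschitz functions gives the $1$-Wasserstein distance (Kantorovich-Rubinstein duality), and taking $\mathcal{F}$ to be the unit ball of a reproducing kernel Hilbert space gives MMD. KSD fits the same template with $\mathcal{F}$ obtained by applying a Stein operator for $\pi$ to an RKHS unit ball. In the abstract's setting, $T = \nabla g$ and $p = p_z$.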

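Motivation and Objectives

Motivate the use of neural networks as generators that represent probability distributions, going beyond function approximation.
Show that a ReLU DNN can approximate a target distribution from a source distribution via the push-forward under the gradient of the network output.
Provide quantitative complexity bounds (depth and width) for reaching a desired approximation accuracy under three IPMs.
Combine convergence of empirical measures with semi-discrete optimal transport to construct an explicit neural-network-based transport map.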

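Proposed Method

Construct a neural-network-based potential u such that the push-forward of p_z under the gradient ∇u is close to π under an IPM.
Approximate π by an empirical measure P_n and bound d_FD(P_n, π) for W1, MMD, and KSD.
Apply semi-discrete optimal transport to show that the optimal map pushing a continuous μ onto a discrete ν has the form T = ∇φ, where φ is a maximum of affine functions.
Represent φ by a neural network, using the result that max_j{x·y_j + m_j} can be expressed by a DNN.
Give the DNN depth L = ⌈log2 n⌉ and width N = 2^L explicitly, where the sample size n depends on the chosen discrepancy.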
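
The last two steps can be made concrete with a minimal sketch. It uses the identity max(a, b) = a + ReLU(b − a) to evaluate φ(x) = max_j{x·y_j + m_j} with a binary tree of pairwise maxima, which takes ⌈log2 n⌉ rounds. This is one common way to realize a maximum with ReLU units, assumed here for illustration; the paper's exact architecture and width accounting may differ, and the names `max_affine_relu` and `grad_phi` are hypothetical.

```python
import math
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def pairwise_max(a, b):
    # max(a, b) = a + ReLU(b - a): one ReLU layer halves the candidate count
    return a + relu(b - a)

def max_affine_relu(x, Y, m):
    """Evaluate phi(x) = max_j (x . y_j + m_j) with a tree of ReLU maxima.

    Y has shape (n, d), m has shape (n,). Returns (phi(x), rounds used).
    """
    vals = Y @ x + m                          # the n affine pieces at x
    rounds = 0
    while vals.shape[0] > 1:
        if vals.shape[0] % 2 == 1:
            vals = np.append(vals, vals[-1])  # pad odd lengths with a duplicate
        vals = pairwise_max(vals[0::2], vals[1::2])
        rounds += 1
    return float(vals[0]), rounds

def grad_phi(x, Y, m):
    # Away from ties, the gradient of phi is the slope y_j of the active piece
    return Y[np.argmax(Y @ x + m)]

rng = np.random.default_rng(0)
n, d = 13, 3
Y = rng.normal(size=(n, d))   # illustrative target points y_j
m = rng.normal(size=n)        # illustrative offsets m_j
x = rng.normal(size=d)

phi_x, rounds = max_affine_relu(x, Y, m)
print(phi_x, rounds, grad_phi(x, Y, m))
```

Because the gradient of φ is always one of the points y_j, the map ∇φ sends every input to an atom of the discrete target, which is why this potential suits semi-discrete transport.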

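Results

Research Questions

RQ1 Can a deep ReLU network represent a given target distribution π as the push-forward of a base distribution p_z through the gradient of a neural-network-defined potential?
RQ2 How does the choice of IPM (Wasserstein, MMD, KSD) affect the network size required to reach a desired approximation error ε?
RQ3 What are the quantitative bounds on network depth and width (complexity) needed to approximate π under each IPM?
RQ4 How can semi-discrete optimal transport be used to construct a neural-network-based transport map?
RQ5 How fast do empirical measures converge under the three IPMs considered?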
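
As a rough numerical companion to RQ5, the sketch below estimates the MMD between an empirical measure P_n and a large reference sample standing in for π. Everything in it is an assumption made for illustration: the Gaussian kernel and its bandwidth, the standard normal target, and the biased V-statistic estimator. Whether this kernel meets the paper's kernel assumptions is not checked here.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise squared distances via ||x||^2 + ||y||^2 - 2 x.y
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def mmd_squared(X, Y, sigma=1.0):
    """Biased (V-statistic) estimate of MMD^2 between samples X and Y."""
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
d = 2
reference = rng.normal(size=(2000, d))   # large sample used as a proxy for pi

for n in (10, 100, 1000):
    P_n = rng.normal(size=(n, d))        # empirical measure with n atoms
    mmd = np.sqrt(max(mmd_squared(P_n, reference), 0.0))
    print(n, mmd)
```

If the estimate shrinks roughly like 1/√n as n grows, that is consistent with needing on the order of 1/ε² points to reach error ε in MMD.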

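Key Findings

A deep ReLU neural network with d inputs and one output can, through the push-forward under its gradient map, bring p_z within ε of π in the chosen IPM.
For the 1-Wasserstein distance, the required size n (the number of sample points, which sets the network size) scales as C/ε^2 for d=1, C log^2(ε)/ε^2 for d=2, and C^d/ε^d for d≥3, assuming a finite third moment.
For MMD, with a kernel satisfying Assumption K2, n ≤ C/ε^2.
For KSD, when Assumption K3 holds and π satisfies Assumptions 1 and 2, n ≤ C d/ε^2.
The transport map that solves the semi-discrete optimal transport problem pushes the continuous μ onto the discrete ν and is the gradient of a piecewise affine function that a DNN can represent exactly.
The neural-network-based potential φ(x) = max_j{x·y_j + m_j} can be realized by a DNN of depth ⌈log2 n⌉ and width 2^⌈log2 n⌉.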
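
A one-dimensional toy case shows how the semi-discrete picture fits together. Assume a uniform source on [0, 1] and a target that puts equal mass on n sorted atoms. In one dimension the monotone map is optimal for quadratic cost, so the offsets m_j can be chosen so that consecutive affine pieces cross at x = j/n. The setup and the helper names below are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def semi_discrete_offsets_1d(y):
    """Offsets m_j so that phi(x) = max_j (x*y_j + m_j) has kinks at j/n.

    Assumes y is sorted increasingly and the source is uniform on [0, 1].
    """
    n = len(y)
    m = np.zeros(n)
    for j in range(1, n):
        # Pieces j-1 and j must agree at the breakpoint x = j/n
        m[j] = m[j - 1] - (j / n) * (y[j] - y[j - 1])
    return m

def transport_map(x, y, m):
    # T(x) = phi'(x) = y_j for the piece attaining the maximum at x
    scores = np.outer(x, y) + m          # shape (len(x), n)
    return y[np.argmax(scores, axis=1)]

y = np.array([-1.5, -0.2, 0.4, 2.0])     # illustrative target atoms
m = semi_discrete_offsets_1d(y)

K = 4000
x = (np.arange(K) + 0.5) / K             # midpoint grid approximating the uniform source
Tx = transport_map(x, y, m)

# Fraction of source mass sent to each atom
masses = np.array([(Tx == yj).mean() for yj in y])
print(masses)
```

Each atom receives an equal share of the source mass, so the push-forward of the uniform source under φ' reproduces the discrete target exactly in this toy case.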

This review was generated by AI and checked by a human editor.