QUICK REVIEW

[Paper Review] Nonlinear Approximation and (Deep) ReLU Networks

Ingrid Daubechies, Ronald DeVore | arXiv (Cornell University) | 2019-05-05

Neural Networks and Applications | References: 6 | Citations: 104

One-Line Summary

The paper analyzes the expressive power of deep ReLU networks for univariate functions, shows that depth yields approximation advantages beyond free-knot linear splines, and introduces special network constructions to prove containment and compositional capabilities.

ABSTRACT

This article is concerned with the approximation and expressive powers of deep neural networks. This is an active research area currently producing many interesting papers. The results most commonly found in the literature prove that neural networks approximate functions with classical smoothness to the same accuracy as classical linear methods of approximation, e.g. approximation by polynomials or by piecewise polynomials on prescribed partitions. However, approximation by neural networks depending on n parameters is a form of nonlinear approximation and as such should be compared with other nonlinear methods such as variable knot splines or n-term approximation from dictionaries. The performance of neural networks in targeted applications such as machine learning indicate that they actually possess even greater approximation power than these traditional methods of nonlinear approximation. The main results of this article prove that this is indeed the case. This is done by exhibiting large classes of functions which can be efficiently captured by neural networks where classical nonlinear methods fall short of the task. The present article purposefully limits itself to studying the approximation of univariate functions by ReLU networks. Many generalizations to functions of several variables and other activation functions can be envisioned. However, even in this simplest of settings considered here, a theory that completely quantifies the approximation power of neural networks is still lacking.

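Motivation and Objectives

Assess the approximation power of deep ReLU networks for univariate functions by comparing it with classical nonlinear methods.
Show that fixed-width deep ReLU networks with a comparable number of parameters match free-knot linear splines and can exceed them in expressive power.
Show how depth enables efficient representations through composition and self-similar structure.
Introduce and analyze special network constructions that preserve or strengthen approximation power.
Discuss implications for data-fitting bias and potential practical architectures.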

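Proposed Method

Define Upsilon^{W,L} as the class of functions realized by ReLU networks of width W and depth L.
Compare Upsilon^{W,L} with Sigma_n, the nonlinear spline class of continuous piecewise linear (CPwL) functions with n breakpoints.
Prove, using a two-layer special network construction, that Sigma_n is contained in Upsilon^{W,L} when the parameter count n(W,L) is comparable to n.
Develop a two-layer construction that produces CPwL functions with controlled breakpoints from hat functions and principal breakpoints.
Extend the construction to sums and compositions of CPwL components, and analyze how depth contributes to expressive power.
Provide theoretical results on the stability of special and standard networks under composition and summation.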
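
To make the comparison between Sigma_n and ReLU networks concrete, here is a minimal sketch in Python with NumPy. It is not the paper's special-network construction: it writes a CPwL function as a single hidden layer of ReLU units, one unit per linear piece, whereas the paper reshapes such functions into deep networks of fixed width W. The helper name cpwl_as_relu and its arguments are hypothetical, chosen only for illustration.

```python
import numpy as np


def relu(x):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(x, 0.0)


def cpwl_as_relu(knots, values):
    """Build f(x) = values[0] + sum_j c_j * relu(x - knots[j]).

    Hypothetical helper (an assumption of this sketch, not from the paper).
    `knots` must be strictly increasing; f interpolates (knots, values)
    on the interval [knots[0], knots[-1]].
    """
    knots = np.asarray(knots, dtype=float)
    values = np.asarray(values, dtype=float)
    slopes = np.diff(values) / np.diff(knots)  # slope of each linear piece
    coeffs = np.diff(slopes, prepend=0.0)      # slope change at each left knot

    def f(x):
        x = np.asarray(x, dtype=float)
        # One ReLU unit per linear piece, anchored at that piece's left knot.
        hidden = relu(x[..., None] - knots[:-1])
        return values[0] + hidden @ coeffs

    return f


# A hat function is the special case with three knots.
hat = cpwl_as_relu([0.0, 0.5, 1.0], [0.0, 1.0, 0.0])
```

On [knots[0], knots[-1]] the result agrees with np.interp(x, knots, values). The hidden layer here has one unit per linear piece, so its width grows with n; the containment result summarized above concerns trading that width for depth at a comparable parameter count.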

Results

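Research Questions

RQ1: Can fixed-width deep ReLU networks approximate free-knot linear splines with a comparable parameter budget?
RQ2: Do deep compositions of CPwL functions yield expressive power beyond what Sigma_n provides?
RQ3: How does depth interact with width to affect approximation power relative to classical nonlinear methods?
RQ4: Which network architectures (special networks) can demonstrate constructive containment and additional expressive power?
RQ5: What are the implications for data-fitting bias when deep ReLU networks are used?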

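Key Findings

Fixed-width ReLU networks can reproduce the functions in Sigma_n with a parameter count n(W,L) that is within a constant factor of n.
Sigma_n is contained in Upsilon^{W,L} whenever the width and depth allow roughly n breakpoints to be represented.
Through composition, deep networks can produce functions with many breakpoints, so the breakpoint count grows faster than polynomially in n.
The special network construction (SC and CC channels) provides a framework for implementing CPwL functions and their sums and compositions.
Compositions of CPwL functions can be implemented with bounded width and depth while keeping the parameter count under control.
The paper identifies classes of self-similar and trigonometric-like functions that deep ReLU networks can reproduce efficiently.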
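
The breakpoint growth described above can be checked numerically with a standard illustration, sketched here in Python with NumPy under stated assumptions. It composes the hat function with itself: each composition adds two ReLU units, yet the number of interior breakpoints roughly doubles. The function names and the dyadic evaluation grid are assumptions of this sketch, not taken from the paper.

```python
import numpy as np


def relu(x):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(x, 0.0)


def hat(x):
    """Hat function on [0, 1] built from two ReLU units.

    Equals 2x on [0, 0.5] and 2 - 2x on [0.5, 1], so it maps [0, 1] to [0, 1].
    """
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def compose_hat(x, depth):
    """Apply `hat` to x `depth` times (a depth-`depth` ReLU network)."""
    for _ in range(depth):
        x = hat(x)
    return x


def count_interior_breakpoints(depth, grid_exponent=12):
    """Count slope changes of the composed function on (0, 1).

    Uses a dyadic grid with 2**grid_exponent cells, so every breakpoint
    lands on a grid point; requires depth <= grid_exponent.
    """
    x = np.arange(2**grid_exponent + 1) / 2.0**grid_exponent
    y = compose_hat(x, depth)
    slopes = np.diff(y) / np.diff(x)
    return int(np.count_nonzero(np.abs(np.diff(slopes)) > 1e-9))
```

In this sketch a depth-k composition uses 2k ReLU units but has 2^k - 1 interior breakpoints, whereas a free-knot linear spline needs one breakpoint parameter per breakpoint.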

This review was generated by AI and reviewed by a human editor.