QUICK REVIEW

[Paper Review] Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees

Vitaly Feldman, Pravesh K. Kothari|arXiv (Cornell University)|2013. 04. 02.

Complexity and Algorithms in Graphs|Citations: 22

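One-line summary

This paper shows that any submodular function on the Boolean hypercube is $\epsilon$-close in $\ell_2$-norm to a real-valued decision tree of depth $O(1/\epsilon^2)$, which yields an efficient learning algorithm with running time $\tilde{O}(n^2) \cdot 2^{O(1/\epsilon^4)}$. It also gives the first information-theoretic and computational lower bounds for learning submodular functions over the uniform distribution.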

ABSTRACT

We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube $\{0,1\}^n$. Our main result is the following structural theorem: any submodular function is $ε$-close in $\ell_2$ to a real-valued decision tree (DT) of depth $O(1/ε^2)$. This immediately implies that any submodular function is $ε$-close to a function of at most $2^{O(1/ε^2)}$ variables and has a spectral $\ell_1$ norm of $2^{O(1/ε^2)}$. It also implies the closest previous result that states that submodular functions can be approximated by polynomials of degree $O(1/ε^2)$ (Cheraghchi et al., 2012). Our result is proved by constructing an approximation of a submodular function by a DT of rank $4/ε^2$ and a proof that any rank-$r$ DT can be $ε$-approximated by a DT of depth $\frac{5}{2}(r+\log(1/ε))$. We show that these structural results can be exploited to give an attribute-efficient PAC learning algorithm for submodular functions running in time $\tilde{O}(n^2) \cdot 2^{O(1/ε^{4})}$. The best previous algorithm for the problem requires $n^{O(1/ε^{2})}$ time and examples (Cheraghchi et al., 2012) but works also in the agnostic setting. In addition, we give improved learning algorithms for a number of related settings. We also prove that our PAC and agnostic learning algorithms are essentially optimal via two lower bounds: (1) an information-theoretic lower bound of $2^{Ω(1/ε^{2/3})}$ on the complexity of learning monotone submodular functions in any reasonable model; (2) computational lower bound of $n^{Ω(1/ε^{2/3})}$ based on a reduction to learning of sparse parities with noise, widely-believed to be intractable. These are the first lower bounds for learning of submodular functions over the uniform distribution.
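To make the objects in the abstract concrete, the following is a minimal sketch, not taken from the paper, of a brute-force submodularity check and of the $\ell_2$ distance under the uniform distribution on $\{0,1\}^n$. The coverage function, the set family `SETS`, and all helper names are hypothetical choices for illustration, and the enumeration is only feasible for very small $n$.

```python
from itertools import product
from math import sqrt

def coverage(x, sets):
    """Size of the union of the sets selected by the 0/1 vector x."""
    covered = set()
    for bit, s in zip(x, sets):
        if bit:
            covered |= s
    return float(len(covered))

def is_submodular(f, n):
    """Brute-force check of f(x OR y) + f(x AND y) <= f(x) + f(y) on {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            join = tuple(a | b for a, b in zip(x, y))
            meet = tuple(a & b for a, b in zip(x, y))
            if f(join) + f(meet) > f(x) + f(y) + 1e-12:
                return False
    return True

def l2_distance(f, g, n):
    """sqrt(E[(f(x) - g(x))^2]) for x uniform on {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    return sqrt(sum((f(x) - g(x)) ** 2 for x in points) / len(points))

# Hypothetical example: a coverage function scaled into [0, 1].
SETS = [{1, 2}, {2, 3}, {4}]
f = lambda x: coverage(x, SETS) / 4.0

# A depth-0 "tree" is a constant; the mean is the best constant in l2.
mean = sum(f(x) for x in product((0, 1), repeat=3)) / 8
constant_tree = lambda x: mean

print(is_submodular(f, 3))
print(l2_distance(f, constant_tree, 3))
```

In this picture, the structural theorem says that allowing the tree to query $O(1/ε^2)$ variables per path is enough to bring the second printed quantity below $ε$ for any submodular function, assuming a bounded range as in the paper's setting.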

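Research motivation and goals

Provide a structural characterization of submodular functions through low-rank decision trees, enabling efficient approximation and learning.
Design an attribute-efficient PAC learning algorithm for submodular functions over the uniform distribution with a better running time than prior work.
Establish the first information-theoretic and computational lower bounds for learning submodular functions, showing that the proposed algorithms are essentially optimal.
Bridge the gap between approximation theory and learning algorithms by exploiting decision-tree rank and spectral properties.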

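Proposed method

The authors introduce a decomposition technique that represents submodular functions as low-rank decision trees, using rank as the measure of tree complexity.
They prove that every submodular function is $\epsilon$-close in $\ell_2$-norm to a decision tree of rank at most $4/\epsilon^2$.
The key technical component is showing that any rank-$r$ decision tree can be $\epsilon$-approximated by a decision tree of depth $\frac{5}{2}(r + \log(1/\epsilon))$.
The learning algorithm exploits this structural result, constructing a hypothesis through sampling and thresholding, and achieves a running time of $\tilde{O}(n^2) \cdot 2^{O(1/\epsilon^4)}$.
The computational lower bound is based on a reduction to learning sparse parities with noise, a problem widely believed to be intractable.
The reduction rests on showing that monotone submodular functions can be constructed from arbitrary Boolean functions.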
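The two tree measures used above can be sketched in a few lines. The code below uses the standard (Ehrenfeucht and Haussler) definition of decision-tree rank and evaluates the depth bound $\frac{5}{2}(r + \log(1/\epsilon))$ from the abstract. The `Node` class, the helper names, the base-2 logarithm, and the truncation rule (replacing a cut subtree by its uniform average) are assumptions made for illustration, not the paper's construction.

```python
from dataclasses import dataclass
from math import log2
from typing import Optional

@dataclass
class Node:
    var: Optional[int] = None      # index of the queried variable; None for a leaf
    value: float = 0.0             # leaf value (ignored for internal nodes)
    zero: Optional["Node"] = None  # subtree followed when x[var] == 0
    one: Optional["Node"] = None   # subtree followed when x[var] == 1

def is_leaf(t):
    return t.var is None

def rank(t):
    """Rank: 0 for a leaf; r + 1 if both children have rank r; else the larger rank."""
    if is_leaf(t):
        return 0
    r0, r1 = rank(t.zero), rank(t.one)
    return r0 + 1 if r0 == r1 else max(r0, r1)

def depth(t):
    """Length of the longest root-to-leaf path."""
    if is_leaf(t):
        return 0
    return 1 + max(depth(t.zero), depth(t.one))

def mean_value(t):
    """Average under uniform inputs, assuming no variable repeats on a path."""
    if is_leaf(t):
        return t.value
    return 0.5 * (mean_value(t.zero) + mean_value(t.one))

def truncate(t, d):
    """Cut the tree at depth d, replacing each cut subtree by its average (assumed rule)."""
    if is_leaf(t):
        return t
    if d == 0:
        return Node(value=mean_value(t))
    return Node(var=t.var, zero=truncate(t.zero, d - 1), one=truncate(t.one, d - 1))

def depth_bound(r, eps):
    """Depth (5/2)(r + log(1/eps)) from the abstract; base-2 log is an assumption."""
    return 2.5 * (r + log2(1.0 / eps))

def chain(k):
    """A path-shaped tree (decision list) on variables 0..k-1: deep but of rank 1."""
    t = Node(value=0.0)
    for i in reversed(range(k)):
        t = Node(var=i, zero=Node(value=float(i + 1)), one=t)
    return t

tree = chain(5)
print(rank(tree), depth(tree))
print(depth(truncate(tree, 2)))
print(depth_bound(rank(tree), 0.25))
```

The `chain` example shows why rank is the finer measure: a tree can be deep while having rank 1, and the depth-reduction step bounds the depth needed in terms of rank rather than the original depth.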

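Results

Research questions

RQ1: Can submodular functions be approximated efficiently by low-rank decision trees?
RQ2: What is the minimum decision-tree depth or rank needed to $\epsilon$-approximate an arbitrary submodular function in $\ell_2$-norm?
RQ3: Can such approximations lead to more efficient PAC learning algorithms for submodular functions?
RQ4: Are there inherent limits to learning submodular functions over the uniform distribution?
RQ5: Can tight lower bounds for learning submodular functions be established in terms of query complexity or running time?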

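Key results

Every submodular function is $\epsilon$-close in $\ell_2$-norm to a real-valued decision tree of depth $O(1/\epsilon^2)$.
The same result implies that a submodular function can be approximated by a function of at most $2^{O(1/\epsilon^2)}$ variables with spectral $\ell_1$-norm bounded by $2^{O(1/\epsilon^2)}$.
The proposed PAC learning algorithm runs in time $\tilde{O}(n^2) \cdot 2^{O(1/\epsilon^4)}$, improving on the previous $n^{O(1/\epsilon^2)}$ bound.
For learning monotone submodular functions, the paper establishes an information-theoretic lower bound of $2^{\Omega(\epsilon^{-2/3})}$ value queries.
It proves a computational lower bound of $n^{\Omega(\epsilon^{-2/3})}$ based on a reduction to learning sparse parities with noise.
The paper provides the first lower bounds for learning submodular functions over the uniform distribution and shows that the proposed algorithms are nearly optimal.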
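The step from a depth bound to the variable-count and spectral-norm bounds can be sketched with a standard argument, given here as an illustration rather than the paper's own proof. It assumes a decision tree $T$ of depth $d$ with leaves $\ell$, leaf values $a_\ell$ bounded by a constant, and characters $\chi_i(x) = (-1)^{x_i}$. These symbols are introduced for this sketch only.

$$
T(x) = \sum_{\ell} a_\ell \, \mathbf{1}_\ell(x), \qquad
\mathbf{1}_\ell(x) = \prod_{(i,b) \in \mathrm{path}(\ell)} \frac{1 + (-1)^{b} \chi_i(x)}{2}
$$

Each factor has spectral $\ell_1$-norm $\tfrac{1}{2} + \tfrac{1}{2} = 1$, and the spectral $\ell_1$-norm is submultiplicative, so every leaf indicator has norm at most 1. Since a depth-$d$ tree has at most $2^d$ leaves and at most $2^d - 1$ internal nodes:

$$
\|\hat{T}\|_1 \le \sum_{\ell} |a_\ell| \, \|\hat{\mathbf{1}}_\ell\|_1 \le 2^{d} \max_{\ell} |a_\ell|,
\qquad
\#\{\text{variables queried by } T\} \le 2^{d} - 1
$$

Substituting $d = O(1/\epsilon^2)$ gives the $2^{O(1/\epsilon^2)}$ bounds on both quantities for the approximating tree.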

This review was generated by AI and reviewed by a human editor.