QUICK REVIEW

[Paper Review] Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas

Vitaly Feldman, Jan Vondrák|arXiv (Cornell University)|Jul 12, 2013

Machine Learning and Algorithms | References: 44 | Cited by: 25

One-Sentence Summary

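This paper establishes bounds on approximating submodular and XOS functions by juntas (functions that depend on only a small number of variables) under the uniform distribution. A submodular function with values in [0,1] can be ε-approximated in ℓ₂-norm by a junta of size Õ(1/ε²), an exponential improvement over prior work; an XOS function can be ε-approximated by a junta on 2^{O(1/ε²)} variables, a bound that holds for every function with low total ℓ₁-influence and is a real-valued analogue of Friedgut's theorem.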

ABSTRACT

We investigate the approximability of several classes of real-valued functions by functions of a small number of variables (juntas). Our main results are tight bounds on the number of variables required to approximate a function $f:\{0,1\}^n \rightarrow [0,1]$ within $\ell_2$-error $ε$ over the uniform distribution:
1. If $f$ is submodular, then it is $ε$-close to a function of $O(\frac{1}{ε^2} \log \frac{1}{ε})$ variables. This is an exponential improvement over previously known results. We note that $Ω(\frac{1}{ε^2})$ variables are necessary even for linear functions.
2. If $f$ is fractionally subadditive (XOS) it is $ε$-close to a function of $2^{O(1/ε^2)}$ variables. This result holds for all functions with low total $\ell_1$-influence and is a real-valued analogue of Friedgut's theorem for boolean functions. We show that $2^{Ω(1/ε)}$ variables are necessary even for XOS functions.
As applications of these results, we provide learning algorithms over the uniform distribution. For XOS functions, we give a PAC learning algorithm that runs in time $2^{poly(1/ε)} poly(n)$. For submodular functions we give an algorithm in the more demanding PMAC learning model (Balcan and Harvey, 2011) which requires a multiplicative $1+γ$ factor approximation with probability at least $1-ε$ over the target distribution. Our uniform distribution algorithm runs in time $2^{poly(1/(γε))} poly(n)$. This is the first algorithm in the PMAC model that over the uniform distribution can achieve a constant approximation factor arbitrarily close to 1 for all submodular functions. As follows from the lower bounds in (Feldman et al., 2013) both of these algorithms are close to optimal. We also give applications for proper learning, testing and agnostic learning with value queries of these classes.
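
The abstract uses several terms without defining them. The block below writes them out in standard textbook form as a reading aid; the notation ($J$, $w_{t,i}$, $m$, and the lattice operations $\vee$, $\wedge$ for coordinatewise max and min) is assumed here and is not quoted from the paper.

```latex
\begin{align*}
% l2 distance under the uniform distribution U on {0,1}^n
\|f-g\|_2 &= \sqrt{\mathbb{E}_{x\sim U}\big[(f(x)-g(x))^2\big]},
  & f \text{ is } \varepsilon\text{-close to } g &\iff \|f-g\|_2 \le \varepsilon \\
% junta on at most k variables
g \text{ is a } k\text{-junta} &\iff \exists\, J\subseteq[n],\ |J|\le k:\
  g(x) \text{ depends only on } (x_i)_{i\in J} \\
% submodularity on the hypercube
f \text{ is submodular} &\iff f(x\vee y)+f(x\wedge y)\le f(x)+f(y)
  \quad \forall x,y\in\{0,1\}^n \\
% XOS (fractionally subadditive): a maximum of nonnegative linear functions
f \text{ is XOS} &\iff f(x)=\max_{1\le t\le m}\ \sum_{i=1}^{n} w_{t,i}\,x_i,
  \quad w_{t,i}\ge 0
\end{align*}
```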

Research Motivation and Objectives

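Determine the smallest junta size that suffices to ε-approximate a submodular function in ℓ₂-norm under the uniform distribution.
Extend Friedgut's theorem from boolean functions to real-valued functions by characterizing the class of functions with low total ℓ₁-influence.
Establish bounds on approximating XOS functions by juntas and complement them with lower bounds.
Design efficient learning algorithms for submodular and XOS functions based on the junta approximation results.
Explore what these results imply for proper learning, testing, and agnostic learning of submodular and XOS functions.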

Proposed Methods

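Uses a novel structural analysis to prove that a submodular function with range [0,1] is ε-close in ℓ₂-norm to a function that depends on only O((1/ε²) log(1/ε)) variables.
Extends Friedgut's theorem to real-valued functions by proving that any function with bounded total ℓ₁-influence can be ε-approximated by a 2^{O(1/ε²)}-junta.
Establishes a lower bound: even for XOS functions, 2^{Ω(1/ε)} variables are necessary, so the exponential dependence on 1/ε in the upper bound cannot be avoided.
Uses the junta approximation result to design a PMAC learning algorithm for submodular functions that achieves a (1+γ)-approximation and runs in time 2^{poly(1/(γε))} poly(n).
Uses a value query oracle together with sparse ℓ₁-regression to agnostically learn functions of bounded ℓ₁-influence, with error measured in ℓ₁-norm.
Applies concentration of measure and influence-based decomposition to control the number of relevant variables in the junta approximation.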
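
To make "ε-close to a junta in ℓ₂-norm" concrete, here is a minimal brute-force sketch in Python. It assumes the variable set J is already given (finding a small J is the hard part that the paper's analysis addresses, and is not attempted here) and uses the standard fact that, for a fixed J, the best ℓ₂ approximation under the uniform distribution averages f over the coordinates outside J. The names `junta_projection`, `l2_distance`, and `linear` are hypothetical and chosen for illustration only.

```python
from itertools import product
from math import sqrt


def junta_projection(f, n, J):
    """Return g(x) = E[f(y) | y_J = x_J] under the uniform distribution.

    For a fixed variable set J this conditional expectation is the best
    l2 approximation of f among functions that depend only on J.
    """
    sums, counts = {}, {}
    for x in product((0, 1), repeat=n):
        key = tuple(x[i] for i in J)
        sums[key] = sums.get(key, 0.0) + f(x)
        counts[key] = counts.get(key, 0) + 1
    table = {key: sums[key] / counts[key] for key in sums}
    return lambda x: table[tuple(x[i] for i in J)]


def l2_distance(f, g, n):
    """Exact l2 distance between f and g under the uniform distribution."""
    total = sum((f(x) - g(x)) ** 2 for x in product((0, 1), repeat=n))
    return sqrt(total / 2 ** n)


n = 8


def linear(x):
    # A linear (hence submodular) function with range [0, 1].
    return sum(x) / n


J = (0, 1, 2, 3)  # assumed choice of junta variables, for illustration
g = junta_projection(linear, n, J)
err = l2_distance(linear, g, n)
print(err)  # sqrt(n - |J|) / (2 n) = 0.125 for this linear example
```

The enumeration is exponential in n, so this is only useful for checking small examples by hand.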

Results

Research Questions

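RQ1: What is the optimal junta size for ε-approximating a submodular function in ℓ₂-norm under the uniform distribution?
RQ2: Can Friedgut's theorem be extended from boolean functions to real-valued functions with low total ℓ₁-influence?
RQ3: Is the 2^{O(1/ε²)}-junta bound for XOS functions tight, and what is the smallest junta size that achieves ε-approximation?
RQ4: Can junta approximation serve as the core component of efficient learning algorithms for submodular and XOS functions?
RQ5: What do these bounds imply for proper learning, testing, and agnostic learning of submodular and XOS functions?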
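
The learning question above refers to the PMAC model, which the abstract describes as requiring a multiplicative 1+γ approximation with probability at least 1−ε over the target distribution. The sketch below checks that criterion exhaustively over the uniform distribution on a small cube. The two-sided form h(x) ≤ f(x) ≤ (1+γ)·h(x) is one common way to write the condition and is an assumption here, as are the example target, the example hypothesis, and the name `pmac_success_rate`.

```python
from itertools import product


def pmac_success_rate(f, h, n, gamma):
    """Fraction of points x in {0,1}^n with h(x) <= f(x) <= (1+gamma)*h(x).

    A hypothesis h meets the PMAC requirement for accuracy eps (under the
    uniform distribution) when this fraction is at least 1 - eps.
    """
    good = 0
    for x in product((0, 1), repeat=n):
        if h(x) <= f(x) <= (1 + gamma) * h(x):
            good += 1
    return good / 2 ** n


n = 4


def target(x):
    # Illustrative positive linear (hence submodular) target.
    return 1 + sum(x)


def hypothesis(x):
    # Illustrative junta hypothesis that looks only at x[0] and x[1].
    return 1 + x[0] + x[1]


rate = pmac_success_rate(target, hypothesis, n, gamma=0.5)
print(rate)  # 0.625: the criterion holds on 10 of the 16 points
```

With γ = 0.5 this hypothesis would satisfy the requirement only for ε ≥ 0.375, which shows how the two parameters trade off in the definition.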

Key Findings

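A submodular function with range [0,1] is ε-close in ℓ₂-norm to a junta on O((1/ε²) log(1/ε)) = Õ(1/ε²) variables; since Ω(1/ε²) variables are necessary even for linear functions, this bound is optimal up to the logarithmic factor.
An XOS function is ε-close in ℓ₂-norm to a junta on 2^{O(1/ε²)} variables.
Even for XOS functions, 2^{Ω(1/ε)} variables are necessary, so an exponential dependence on 1/ε is unavoidable; the upper and lower bounds differ only in the power of 1/ε in the exponent.
The paper gives the first PMAC learning algorithm over the uniform distribution that achieves an approximation factor 1+γ arbitrarily close to 1 for all submodular functions, with running time 2^{poly(1/(γε))} poly(n).
For agnostic learning, the paper shows that functions with total ℓ₁-influence at most a can be learned with value queries in time poly(n) · 2^{O(a²/ε⁴)}.
The results extend Friedgut's theorem to real-valued functions: functions with low total ℓ₁-influence can be approximated by juntas of size 2^{O(1/ε²)}.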
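
Several findings are stated in terms of total ℓ₁-influence. As a rough illustration, the sketch below computes it by brute force for a tiny XOS function, assuming the convention Inf_i(f) = E_x|f(x) − f(x ⊕ e_i)| under the uniform distribution; the paper's normalization may differ by a constant factor, and the helper names `xos`, `l1_influence`, and `total_l1_influence` are hypothetical.

```python
from itertools import product


def xos(weights):
    """Build an XOS function: a maximum of nonnegative linear functions."""
    def f(x):
        return max(sum(w * xi for w, xi in zip(row, x)) for row in weights)
    return f


def l1_influence(f, n, i):
    """E_x |f(x) - f(x with bit i flipped)| over uniform x (assumed convention)."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        y = list(x)
        y[i] ^= 1  # flip coordinate i
        total += abs(f(x) - f(tuple(y)))
    return total / 2 ** n


def total_l1_influence(f, n):
    """Sum of the l1-influences of all n coordinates."""
    return sum(l1_influence(f, n, i) for i in range(n))


# max(x0, x1), i.e. OR on two bits, written as an XOS function.
f_or = xos([(1.0, 0.0), (0.0, 1.0)])
total = total_l1_influence(f_or, 2)
print(total)  # 1.0: each coordinate has influence 0.5 under this convention
```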


This review was generated by AI and reviewed by a human editor.