QUICK REVIEW

[Paper Review] Reliably Learning the ReLU in Polynomial Time

Surbhi Goel, Varun Kanade|arXiv (Cornell University)|Nov 30, 2016

Neural Networks and Applications | Cited by 54

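One-Sentence Summary

This paper gives the first polynomial-time algorithm for reliably learning Rectified Linear Units (ReLUs) in the Reliable Agnostic learning model, where labels may be adversarially corrupted. By combining kernel methods, polynomial approximation, and a "dual-loss" convex-programming framework, the algorithm simultaneously controls the false-positive rate and the regression loss on positively labeled inputs for any convex, bounded, Lipschitz loss function, with error parameter ε = Ω(1/log n).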

ABSTRACT

We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $\mathbf{x} \mapsto \max(0, \mathbf{w} \cdot \mathbf{x})$ with $\mathbf{w} \in \mathbb{S}^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour (2009) where the learner is given access to a distribution $\cal{D}$ on labeled examples but the labeling may be arbitrary. We construct a hypothesis that simultaneously minimizes the false-positive rate and the loss on inputs given positive labels by $\cal{D}$, for any convex, bounded, and Lipschitz loss function. The algorithm runs in polynomial-time (in $n$) with respect to any distribution on $\mathbb{S}^{n-1}$ (the unit sphere in $n$ dimensions) and for any error parameter $ε= Ω(1/\log n)$ (this yields a PTAS for a question raised by F. Bach on the complexity of maximizing ReLUs). These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where $ε$ must be $Ω(1)$ and strong assumptions are required on the marginal distribution. We can compose our results to obtain the first set of efficient algorithms for learning constant-depth networks of ReLUs. Our techniques combine kernel methods and polynomial approximations with a "dual-loss" approach to convex programming. As a byproduct we obtain a number of applications including the first set of efficient algorithms for "convex piecewise-linear fitting" and the first efficient algorithms for noisy polynomial reconstruction of low-weight polynomials on the unit sphere.
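The two quantities the abstract says are minimized simultaneously can be written out as follows. This is a sketch in notation assumed here for illustration (the symbols $\mathcal{L}_{=0}$, $\mathcal{L}_{>0}$, $\mathcal{C}$, and $\mathcal{C}^{+}$ do not appear in this summary), and the paper's exact definitions may differ in detail.

```latex
% Assumed notation: h is the learned hypothesis, \ell is a convex, bounded,
% Lipschitz loss, \mathcal{C} is the class of ReLUs, and \mathcal{D} is the
% distribution on labeled examples (\mathbf{x}, y).
\begin{align*}
\mathcal{L}_{=0}(h) &= \Pr_{(\mathbf{x},y)\sim\mathcal{D}}
    \bigl[\, h(\mathbf{x}) \neq 0 \;\wedge\; y = 0 \,\bigr]
    && \text{(false-positive rate)} \\
\mathcal{L}_{>0}(h) &= \mathbb{E}_{(\mathbf{x},y)\sim\mathcal{D}}
    \bigl[\, \ell(h(\mathbf{x}), y)\cdot \mathbb{1}[y > 0] \,\bigr]
    && \text{(loss on positively labeled inputs)}
\end{align*}
% A reliable learner would then aim for a hypothesis h satisfying
\[
\mathcal{L}_{=0}(h) \le \epsilon
\qquad\text{and}\qquad
\mathcal{L}_{>0}(h) \le \min_{c \in \mathcal{C}^{+}} \mathcal{L}_{>0}(c) + \epsilon,
\qquad
\mathcal{C}^{+} = \{\, c \in \mathcal{C} : \mathcal{L}_{=0}(c) = 0 \,\}.
\]
```

Under this reading, the benchmark is the best ReLU that makes no false positives at all, and the learner is allowed an additive slack of ε on both objectives.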

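Motivation and Objectives

Close the gap in our understanding of the computational complexity of learning shallow ReLU networks, despite their widespread use in deep learning.
Sidestep the intractability of learning threshold functions under adversarial labels by working in a learning model designed specifically for ReLUs.
Develop an efficient, dimension-scalable algorithm that minimizes false positives and regression loss under arbitrary label noise.
Establish the first efficient algorithms for learning constant-depth ReLU networks and for convex piecewise-linear fitting.
Provide a framework for noisy polynomial reconstruction of low-weight polynomials on the unit sphere.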

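Proposed Method

Formulate the learning problem in the Reliable Agnostic model, which balances control of false positives against loss minimization on positive examples.
Adopt a dual-loss objective that simultaneously minimizes the false-positive rate and a convex, bounded, Lipschitz loss on positive examples.
Apply kernel methods to map inputs into a reproducing kernel Hilbert space (RKHS) for effective function approximation.
Use polynomial approximation to express the ReLU and its dual-loss objective in a computationally tractable form.
Design a convex-programming framework that solves the dual-loss problem in polynomial time.
Exploit the structure of the unit sphere S^{n-1} so that learning is dimension-efficient and holds for any input distribution on the sphere.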
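The two ingredients named above, polynomial approximation of the ReLU and a kernelized fit on the unit sphere, can be illustrated with a minimal NumPy sketch. This is not the paper's algorithm: it swaps in a standard polynomial kernel with plain ridge regression under squared loss, and it ignores the dual-loss constraint on false positives. All function names, the kernel choice, and the parameter values are assumptions made for illustration.

```python
import numpy as np


def relu(t):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(0.0, t)


def relu_poly_error(degree, num_points=2001):
    """Max error on a grid of a least-squares polynomial fit to ReLU on [-1, 1]."""
    t = np.linspace(-1.0, 1.0, num_points)
    p = np.polynomial.Chebyshev.fit(t, relu(t), degree)
    return float(np.max(np.abs(p(t) - relu(t))))


def polynomial_kernel(A, B, degree):
    """Standard polynomial kernel K(a, b) = (1 + a.b)^degree (an assumed choice)."""
    return (1.0 + A @ B.T) ** degree


def fit_kernel_ridge(X, y, degree, ridge):
    """Solve (K + ridge * I) alpha = y for the dual coefficients alpha."""
    K = polynomial_kernel(X, X, degree)
    return np.linalg.solve(K + ridge * np.eye(len(X)), y)


def predict(alpha, X_train, X_new, degree):
    """Evaluate the kernel predictor at new points."""
    return polynomial_kernel(X_new, X_train, degree) @ alpha


def demo_kernel_fit(n=3, m_train=300, m_test=200, degree=6, ridge=1e-3, seed=0):
    """Fit noise-free ReLU labels on the unit sphere; return (test MSE, label variance)."""
    rng = np.random.default_rng(seed)

    def sample_sphere(m):
        # Normalized Gaussian vectors are uniform on the unit sphere.
        Z = rng.standard_normal((m, n))
        return Z / np.linalg.norm(Z, axis=1, keepdims=True)

    w = sample_sphere(1)[0]  # hidden unit-norm weight vector
    X_train, X_test = sample_sphere(m_train), sample_sphere(m_test)
    y_train, y_test = relu(X_train @ w), relu(X_test @ w)

    alpha = fit_kernel_ridge(X_train, y_train, degree, ridge)
    pred = predict(alpha, X_train, X_test, degree)
    mse = float(np.mean((pred - y_test) ** 2))
    baseline = float(np.var(y_test))  # error of always predicting the mean
    return mse, baseline


if __name__ == "__main__":
    for d in (4, 8, 16):
        print("degree", d, "max approximation error", relu_poly_error(d))
    print("kernel fit (test MSE, baseline variance):", demo_kernel_fit())
```

The first function shows why a low-degree polynomial can stand in for the ReLU on a bounded interval; the kernel fit then searches over such polynomials implicitly, without enumerating monomials.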

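Results

Research Questions

RQ1: Can ReLUs be learned efficiently under arbitrary label noise without strong distributional assumptions?
RQ2: Can a single learning framework simultaneously minimize false-positive errors and regression loss for ReLUs?
RQ3: What is the computational complexity of reliably learning ReLUs, and is polynomial time achievable when ε = o(1)?
RQ4: Does the framework extend to deeper ReLU networks or to related problems such as convex piecewise-linear fitting?
RQ5: Would reliably learning ReLUs imply breakthroughs on other hard problems, such as learning sparse parities with noise or learning DNF formulas?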

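Key Findings

The algorithm runs in time polynomial in n for any distribution on the unit sphere S^{n-1} and any error parameter ε = Ω(1/log n), which yields a PTAS for the ReLU maximization problem.
It simultaneously minimizes the false-positive rate and any convex, bounded, Lipschitz loss, giving a robust trade-off under adversarial labeling.
By reduction to ReLU learning, the framework yields the first efficient algorithms for convex piecewise-linear fitting.
The framework also yields the first efficient algorithms for noisy polynomial reconstruction of low-weight polynomials on the unit sphere.
The results compose to learn constant-depth networks of ReLUs, extending the approach to deeper architectures.
Assuming that learning sparse parities with noise is hard, no polynomial-time algorithm can reliably learn ReLUs over {0,1}^n whose weight vectors satisfy ‖w‖₁ ≤ 2k, which suggests the positive result is nearly tight.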
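The link between convex piecewise-linear functions and ReLUs mentioned above rests on a simple identity, max(a, b) = a + ReLU(b - a). The sketch below only checks that identity numerically for a maximum of k affine pieces. It is not the paper's reduction, and the function names and shapes are assumptions made for illustration.

```python
import numpy as np


def relu(t):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(0.0, t)


def max_affine_direct(X, W, b):
    """Convex piecewise-linear function: max_j (W[j] . x + b[j]) for each row x of X."""
    return np.max(X @ W.T + b, axis=1)


def max_affine_via_relu(X, W, b):
    """Same function, built only from affine maps and ReLUs.

    Uses max(a, c) = a + relu(c - a) repeatedly, folding in one piece at a time.
    """
    pieces = X @ W.T + b          # shape (num_points, k)
    running = pieces[:, 0]
    for j in range(1, pieces.shape[1]):
        running = running + relu(pieces[:, j] - running)
    return running


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 4))   # 50 points in 4 dimensions
    W = rng.standard_normal((3, 4))    # 3 affine pieces
    b = rng.standard_normal(3)
    gap = np.max(np.abs(max_affine_direct(X, W, b) - max_affine_via_relu(X, W, b)))
    print("largest disagreement:", gap)
```

Each fold adds one ReLU, so a maximum of k affine pieces needs k - 1 ReLUs applied in sequence in this construction.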


This review was generated by AI and reviewed by human editors.