QUICK REVIEW

[Paper Review] Gradient Methods for Submodular Maximization

Hamed Hassani, Mahdi Soltanolkotabi|arXiv (Cornell University)|Aug 13, 2017

Stochastic Gradient Optimization Techniques | References: 26 | Cited by: 28

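One-Sentence Summary

This paper proves that projected gradient ascent provides strong approximation guarantees for maximizing continuous submodular functions under convex constraints. It shows that, for monotone DR-submodular functions, every fixed point of projected gradient ascent is a $1/2$ approximation to the global optimum, and that stochastic projected gradient methods reach, within $\mathcal{O}(1/\epsilon^2)$ iterations, solutions whose expected value is within $\epsilon$ of $\text{OPT}/2$. This enables efficient optimization of stochastic and discrete submodular problems through continuous relaxation.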

ABSTRACT

In this paper, we study the problem of maximizing continuous submodular functions that naturally arise in many learning applications such as those involving utility functions in active learning and sensing, matrix approximations and network inference. Despite the apparent lack of convexity in such functions, we prove that stochastic projected gradient methods can provide strong approximation guarantees for maximizing continuous submodular functions with convex constraints. More specifically, we prove that for monotone continuous DR-submodular functions, all fixed points of projected gradient ascent provide a factor $1/2$ approximation to the global maxima. We also study stochastic gradient and mirror methods and show that after $\mathcal{O}(1/\epsilon^2)$ iterations these methods reach solutions which achieve in expectation objective values exceeding $(\frac{\text{OPT}}{2}-\epsilon)$. An immediate application of our results is to maximize submodular functions that are defined stochastically, i.e. the submodular function is defined as an expectation over a family of submodular functions with an unknown distribution. We will show how stochastic gradient methods are naturally well-suited for this setting, leading to a factor $1/2$ approximation when the function is monotone. In particular, it allows us to approximately maximize discrete, monotone submodular optimization problems via projected gradient descent on a continuous relaxation, directly connecting the discrete and continuous domains. Finally, experiments on real data demonstrate that our projected gradient methods consistently achieve the best utility compared to other continuous baselines while remaining competitive in terms of computational effort.

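Motivation and Objectives

Provide a theoretical basis for the empirical success of gradient-based methods in maximizing continuous submodular functions.
Establish approximation guarantees for projected gradient ascent when the constraints are convex and the objective is non-convex and submodular.
Extend these guarantees to stochastic and mirror variants for large-scale or noisy submodular optimization.
Bridge discrete and continuous submodular optimization by proving that projected gradient methods applied to the multilinear extension yield provably good solutions.
Analyze how convergence rates depend on the smoothness parameters $L_2$ and $L_*$ and on the submodularity ratio $\gamma$.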

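Proposed Method

Apply projected gradient ascent to continuous, monotone, DR-submodular functions over a bounded convex set $\mathcal{K}$.
Prove that every stationary point of such a function is a $1/2$ approximation to the global optimum, even though the function is non-convex.
Apply the stochastic gradient method with unbiased gradient estimates and prove that it reaches a value within $\epsilon$ of $\text{OPT}/2$, in expectation, in $\mathcal{O}(L_2/\epsilon + \sigma^2/\epsilon^2)$ iterations.
Introduce a mirror method based on Bregman divergences that achieves the same $\text{OPT}/2 - \epsilon$ guarantee in $\mathcal{O}(L_*/\epsilon + \sigma^2/\epsilon^2)$ iterations.
Use smoothness and weak DR-submodularity (parameterized by $\gamma$) to generalize the $1/2$ guarantee to $\gamma^2/(1+\gamma^2)$ for weakly submodular functions.
Derive convergence bounds in terms of Bregman divergences and expected suboptimality, using the strong convexity of the potential function $\Phi$.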
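
The projected gradient step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the objective is a toy quadratic $F(x) = b^\top x + \tfrac{1}{2} x^\top H x$ with entrywise non-positive $H$ (so all second partial derivatives are non-positive, a standard example of a DR-submodular function), the constraint set is assumed to be $\{x : 0 \le x \le 1, \sum_i x_i \le k\}$, and the step size $1/L$ with $L = \|H\|_2$ is an illustrative choice. The names `project_box_budget` and `projected_gradient_ascent` are hypothetical.

```python
import numpy as np

def project_box_budget(y, k, iters=60):
    """Euclidean projection onto {x : 0 <= x <= 1, sum(x) <= k} via bisection."""
    x = np.clip(y, 0.0, 1.0)
    if x.sum() <= k:
        return x
    # Find a shift tau >= 0 such that clip(y - tau, 0, 1) sums to (at most) k.
    lo, hi = 0.0, float(np.max(y))  # at tau = max(y) the clipped vector is all zeros
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if np.clip(y - tau, 0.0, 1.0).sum() > k:
            lo = tau
        else:
            hi = tau
    return np.clip(y - hi, 0.0, 1.0)  # hi always yields a feasible point

def projected_gradient_ascent(grad, x0, k, step, n_iters=200, noise_std=0.0, rng=None):
    """Repeat x <- P_K(x + step * g), where g is a (possibly noisy) gradient."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = project_box_budget(x0, k)
    for _ in range(n_iters):
        g = grad(x) + noise_std * rng.standard_normal(x.shape)
        x = project_box_budget(x + step * g, k)
    return x

# Toy monotone DR-submodular quadratic (an assumption for illustration only).
rng = np.random.default_rng(0)
n, k = 20, 5
M = rng.uniform(0.0, 1.0, size=(n, n))
H = -(M + M.T) / 2.0            # symmetric, entrywise non-positive Hessian
b = -H @ np.ones(n) + 0.1       # makes the gradient b + Hx >= 0.1 on [0, 1]^n

def F(x):
    return b @ x + 0.5 * x @ H @ x

def grad_F(x):
    return b + H @ x

L = np.linalg.norm(H, 2)        # smoothness constant of the quadratic
x0 = np.zeros(n)
x_hat = projected_gradient_ascent(grad_F, x0, k, step=1.0 / L)
```

Setting `noise_std > 0` turns the same loop into a stochastic variant with zero-mean Gaussian gradient noise, which is one simple way to mimic an unbiased gradient oracle with variance $\sigma^2$.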

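Experimental Results

Research Questions

RQ1: Can projected gradient methods provide provable approximation guarantees for maximizing continuous submodular functions under convex constraints?
RQ2: What is the convergence rate of stochastic projected gradient ascent for continuous submodular maximization?
RQ3: For weakly submodular functions, how does the approximation quality depend on the submodularity ratio $\gamma$?
RQ4: Can mirror methods based on Bregman divergences outperform $\ell_2$-based gradient methods in terms of the smoothness parameter?
RQ5: How well can continuous gradient methods approximately solve discrete submodular optimization problems through the continuous relaxation given by the multilinear extension?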
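
To make RQ5 concrete, recall that the multilinear extension of a set function $f$ is $F(x) = \mathbb{E}_{S \sim x}[f(S)]$, where $S$ contains each element $i$ independently with probability $x_i$, and its partial derivatives are $\partial F / \partial x_i = \mathbb{E}[f(S \cup \{i\}) - f(S \setminus \{i\})]$. The sketch below, assuming a small coverage function as the discrete objective, shows how sampling $S$ gives an unbiased gradient estimate of the kind a stochastic gradient method could consume. The function names and the example sets are hypothetical and chosen only for illustration.

```python
import numpy as np
from itertools import product

def coverage(S, sets):
    """Monotone submodular set function: number of items covered by the chosen sets."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def multilinear_grad_estimate(x, f, rng, n_samples=1):
    """Sample-average estimate of the gradient of F(x) = E_{S~x}[f(S)]."""
    n = len(x)
    g = np.zeros(n)
    for _ in range(n_samples):
        # Include element i in S independently with probability x[i].
        S = {i for i in range(n) if rng.random() < x[i]}
        for i in range(n):
            g[i] += f(S | {i}) - f(S - {i})
    return g / n_samples

def multilinear_grad_exact(x, f):
    """Exact gradient by enumerating all 2^n subsets (only feasible for tiny n)."""
    n = len(x)
    g = np.zeros(n)
    for bits in product([0, 1], repeat=n):
        S = {i for i in range(n) if bits[i]}
        p = np.prod([x[i] if bits[i] else 1.0 - x[i] for i in range(n)])
        for i in range(n):
            g[i] += p * (f(S | {i}) - f(S - {i}))
    return g

# Hypothetical ground sets for a tiny coverage instance.
sets = [{0, 1}, {1, 2}, {2, 3, 4}, {4}]

def f(S):
    return coverage(S, sets)

x = np.array([0.5, 0.2, 0.7, 0.1])
rng = np.random.default_rng(0)
g_hat = multilinear_grad_estimate(x, f, rng, n_samples=4000)
g_exact = multilinear_grad_exact(x, f)
```

Because $f$ is monotone, every sampled marginal gain is non-negative, so the estimate is non-negative in each coordinate. The exact enumeration is included only as a sanity check and does not scale beyond very small ground sets.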

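Key Findings

Over a convex set $\mathcal{K}$, every fixed point of projected gradient ascent on a monotone, continuous DR-submodular function is a $1/2$ approximation to the global maximum.
Projected gradient ascent with a small step size (gradient flow) converges to a $1/2$-approximate solution.
Stochastic projected gradient ascent reaches an expected objective value of at least $\text{OPT}/2 - \epsilon$ in $\mathcal{O}(L_2/\epsilon + \sigma^2/\epsilon^2)$ iterations.
Projected mirror ascent achieves the same $\text{OPT}/2 - \epsilon$ guarantee in $\mathcal{O}(L_*/\epsilon + \sigma^2/\epsilon^2)$ iterations, where $L_*$ can be significantly smaller than $L_2$.
For weakly DR-submodular functions with submodularity ratio $\gamma$, the approximation guarantee is $\gamma^2/(1 + \gamma^2)$, which generalizes the $1/2$ result.
The method enables efficient approximation of discrete monotone submodular optimization through the continuous relaxation given by the multilinear extension.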
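
As a quick worked check of the $\gamma$-dependent guarantee above, plugging in two values shows how the factor degrades as the submodularity ratio shrinks. The value $\gamma = 1/2$ is an arbitrary illustrative choice, not a case taken from the paper.

```latex
\[
\left.\frac{\gamma^2}{1+\gamma^2}\right|_{\gamma=1} = \frac{1}{2},
\qquad
\left.\frac{\gamma^2}{1+\gamma^2}\right|_{\gamma=1/2} = \frac{1/4}{5/4} = \frac{1}{5}.
\]
```

At $\gamma = 1$ the expression recovers the $1/2$ factor stated for DR-submodular functions. Similarly, reading the iteration bound $\mathcal{O}(L_2/\epsilon + \sigma^2/\epsilon^2)$ term by term and ignoring constants, halving $\epsilon$ doubles the smoothness term and quadruples the variance term, so the $\sigma^2/\epsilon^2$ term dominates for small $\epsilon$ whenever the gradients are noisy.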

This review was generated by AI and reviewed by a human editor.