QUICK REVIEW

[Paper Digest] On Efficient Optimal Transport: An Analysis of Greedy and Accelerated Mirror Descent Algorithms

Tianyi Lin, Nhat Ho | arXiv (Cornell University) | Jan 19, 2019

Stochastic Gradient Optimization Techniques | References: 36 | Cited by: 22

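One-Sentence Summary

This paper tightens the theoretical complexity bound of the Greenkhorn algorithm for regularized optimal transport from $\widetilde{\mathcal{O}}(n^2\varepsilon^{-3})$ to $\widetilde{\mathcal{O}}(n^2\varepsilon^{-2})$, matching the best known bound for the Sinkhorn algorithm. It also proposes an adaptive primal-dual accelerated mirror descent (APDAMD) algorithm with complexity $\widetilde{\mathcal{O}}(n^2\sqrt{\delta}\varepsilon^{-1})$, where $\delta$ is the inverse of the strong convexity modulus of the Bregman divergence; experiments indicate that it performs better and is more robust.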

ABSTRACT

We provide theoretical analyses for two algorithms that solve the regularized optimal transport (OT) problem between two discrete probability measures with at most $n$ atoms. We show that a greedy variant of the classical Sinkhorn algorithm, known as the Greenkhorn algorithm, can be improved to $\widetilde{\mathcal{O}}(n^2\varepsilon^{-2})$, improving on the best known complexity bound of $\widetilde{\mathcal{O}}(n^2\varepsilon^{-3})$. Notably, this matches the best known complexity bound for the Sinkhorn algorithm and helps explain why the Greenkhorn algorithm can outperform the Sinkhorn algorithm in practice. Our proof technique, which is based on a primal-dual formulation and a novel upper bound for the dual solution, also leads to a new class of algorithms that we refer to as adaptive primal-dual accelerated mirror descent (APDAMD) algorithms. We prove that the complexity of these algorithms is $\widetilde{\mathcal{O}}(n^2\sqrt{\delta}\varepsilon^{-1})$, where $\delta > 0$ refers to the inverse of the strong convexity module of Bregman divergence with respect to $\|\cdot\|_\infty$. This implies that the APDAMD algorithm is faster than the Sinkhorn and Greenkhorn algorithms in terms of $\varepsilon$. Experimental results on synthetic and real datasets demonstrate the favorable performance of the Greenkhorn and APDAMD algorithms in practice.
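
For reference, the entropic-regularized OT problem that both algorithms target is commonly written in the form shown here. The symbols $C$ (cost matrix), $r$ and $c$ (the two marginals), and $\eta$ (regularization strength) are notation assumed for illustration; the abstract does not fix them, and the entropy $H$ is defined only up to the usual convention.

$$
\min_{X \in \mathcal{U}(r, c)} \; \langle C, X \rangle - \eta H(X), \qquad H(X) = -\sum_{i,j=1}^{n} X_{ij} \log X_{ij},
$$

$$
\mathcal{U}(r, c) = \left\{ X \in \mathbb{R}_{+}^{n \times n} : X \mathbf{1} = r, \; X^{\top} \mathbf{1} = c \right\}.
$$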

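Motivation and Objectives

Provide a tighter theoretical complexity analysis of the Greenkhorn algorithm, the greedy variant of Sinkhorn for regularized optimal transport.
Develop a new class of algorithms, adaptive primal-dual accelerated mirror descent (APDAMD), to achieve a faster convergence rate.
Resolve an inconsistency in the complexity bound of an earlier adaptive primal-dual gradient method, specifically APDAGD.
Explain, through theoretical analysis and experiments, why Greenkhorn and APDAMD outperform Sinkhorn and APDAGD in practice.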

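Proposed Method

The analysis uses a primal-dual formulation and introduces a new upper bound on the optimal dual solution, stated in the $\|\cdot\|_\infty$ norm, to analyze the convergence of Greenkhorn.
APDAMD is derived by adapting mirror descent to the regularized OT problem, using a Bregman divergence with strong convexity parameter $\delta^{-1}$, so that $\delta$ is the inverse of the strong convexity modulus.
A line search strategy based on the $\|\cdot\|_\infty$ norm stabilizes APDAMD and improves its robustness in practice.
The complexity bounds are derived with convex optimization techniques, exploiting in particular the structure of the dual problem and the properties of the Bregman divergence.
The paper corrects an error in the previously claimed complexity bound for APDAGD, shows that the claim does not hold in general, and derives the corrected bound $\widetilde{\mathcal{O}}(n^{5/2}\varepsilon^{-1})$.
Experiments on synthetic data and MNIST compare Greenkhorn, APDAMD, APDAGD, and GCPB in terms of iteration count, running time, and robustness.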
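
The greedy update that distinguishes Greenkhorn from Sinkhorn (rescale only the single worst row or column per step) can be sketched as follows. This is a minimal pure-Python illustration of the standard Greenkhorn rule, not the paper's implementation: the names `greenkhorn`, `rho`, `eta`, `tol`, `max_iter` and the toy data are assumptions, and the marginals are assumed strictly positive.

```python
import math

def rho(a, b):
    # Violation measure used to rank rows/columns: b - a + a * log(a / b) >= 0.
    return b - a + a * math.log(a / b)

def greenkhorn(C, r, c, eta, tol=1e-6, max_iter=100_000):
    """Scale exp(-C/eta) toward marginals (r, c), one row or column per step."""
    n, m = len(r), len(c)
    P = [[math.exp(-C[i][j] / eta) for j in range(m)] for i in range(n)]
    row = [sum(P[i]) for i in range(n)]                       # current row sums
    col = [sum(P[i][j] for i in range(n)) for j in range(m)]  # current column sums
    steps = 0
    while steps < max_iter:
        # Stop once the total L1 marginal violation is small enough.
        err = sum(abs(row[i] - r[i]) for i in range(n))
        err += sum(abs(col[j] - c[j]) for j in range(m))
        if err <= tol:
            break
        # Greedy choice: the row and the column with the largest violation.
        I = max(range(n), key=lambda i: rho(r[i], row[i]))
        J = max(range(m), key=lambda j: rho(c[j], col[j]))
        if rho(r[I], row[I]) >= rho(c[J], col[J]):
            s = r[I] / row[I]
            for j in range(m):            # rescale row I, patch column sums in O(m)
                new = P[I][j] * s
                col[j] += new - P[I][j]
                P[I][j] = new
            row[I] = r[I]
        else:
            s = c[J] / col[J]
            for i in range(n):            # rescale column J, patch row sums in O(n)
                new = P[i][J] * s
                row[i] += new - P[i][J]
                P[i][J] = new
            col[J] = c[J]
        steps += 1
    return P, steps

# Toy 3x3 problem (hypothetical data).
C = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.5], [0.5, 0.5, 0.0]]
r = [0.5, 0.3, 0.2]
c = [0.2, 0.3, 0.5]
P, steps = greenkhorn(C, r, c, eta=0.5)
print("updates used:", steps)
```

Each step touches a single row or column, so the row and column sums can be maintained incrementally rather than recomputed from the full matrix.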

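Experimental Results

Research Questions

RQ1: Can the theoretical complexity of the Greenkhorn algorithm be improved beyond the previously known $\widetilde{\mathcal{O}}(n^2\varepsilon^{-3})$ bound?
RQ2: What is the best convergence rate achievable by primal-dual accelerated mirror descent methods for the regularized optimal transport problem?
RQ3: Why does the Greenkhorn algorithm outperform the Sinkhorn algorithm in practice, and can theoretical analysis explain this?
RQ4: Is the previously claimed complexity bound for APDAGD valid? If not, what is the correct bound?
RQ5: How do the choice of Bregman divergence and the strong convexity parameter $\delta$ affect the convergence rate of APDAMD?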
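
To make the role of $\delta$ concrete: saying that a Bregman divergence $B_\phi$ has strong convexity modulus $\delta^{-1}$ with respect to $\|\cdot\|_\infty$ is usually taken to mean the inequality shown here. The mirror map $\phi$ and the quadratic example are illustrative assumptions, not a quotation of the paper's statement.

$$
B_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla \phi(y), x - y \rangle \;\ge\; \frac{1}{2\delta} \, \|x - y\|_\infty^2 .
$$

For instance, taking $\phi(x) = \frac{1}{2n}\|x\|_2^2$ gives $B_\phi(x, y) = \frac{1}{2n}\|x - y\|_2^2 \ge \frac{1}{2n}\|x - y\|_\infty^2$, so the inequality holds with $\delta = n$, which is one way to arrive at a quadratic case with $\delta = n$.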

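Key Findings

Greenkhorn achieves a complexity bound of $\widetilde{\mathcal{O}}(n^2\varepsilon^{-2})$, matching the best known bound for Sinkhorn and closing the gap between its empirical performance and its theoretical understanding.
APDAMD achieves a complexity bound of $\widetilde{\mathcal{O}}(n^2\sqrt{\delta}\varepsilon^{-1})$; when $\delta$ is small, it is faster than Sinkhorn and Greenkhorn in terms of $\varepsilon$.
The paper identifies a flaw in the previously claimed complexity bound for APDAGD and establishes the corrected bound $\widetilde{\mathcal{O}}(n^{5/2}\varepsilon^{-1})$.
Experiments on synthetic data and MNIST show that APDAMD converges markedly more stably than APDAGD and GCPB, and also converges faster.
With $\delta = n$ and a quadratic Bregman divergence, APDAMD has complexity comparable to APDAGD, yet it is more robust because of its $\|\cdot\|_\infty$-based line search.
The theoretical analysis attributes the improved Greenkhorn complexity to a new upper bound on the dual solution, which allows per-step progress to be quantified more tightly despite the greedy single-row/single-column updates.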
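
To get a feel for how the bounds listed above compare, the following sketch evaluates only their leading terms. It ignores the logarithmic factors and constants hidden by $\widetilde{\mathcal{O}}$, so the numbers are orders of magnitude rather than predicted runtimes; the function name `leading_terms` and the sample values of `n` and `eps` are assumptions chosen for illustration.

```python
import math

def leading_terms(n, eps, delta):
    """Leading-order terms of the stated bounds (log factors and constants dropped)."""
    return {
        "Greenkhorn (previous)": n**2 / eps**3,
        "Greenkhorn / Sinkhorn": n**2 / eps**2,
        "APDAGD (corrected)": n**2.5 / eps,
        "APDAMD": n**2 * math.sqrt(delta) / eps,
    }

n, eps = 100, 0.01
terms = leading_terms(n, eps, delta=n)  # delta = n: the quadratic-divergence case
for name, value in terms.items():
    print(f"{name:24s} {value:.3e}")
```

With `delta = n` the APDAMD and APDAGD terms coincide, since $n^2\sqrt{n} = n^{5/2}$; a smaller `delta` shrinks the APDAMD term by a factor of $\sqrt{\delta / n}$.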


This digest was generated by AI and reviewed by a human editor.