QUICK REVIEW

[Paper Review] On the Complexity of Approximating Wasserstein Barycenter

Alexey Kroshnin, Darina Dvinskikh|arXiv (Cornell University)|Jan 24, 2019

Topological and Geometric Data Analysis|References: 52|Cited by: 32

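One-Sentence Summary

This paper analyzes the computational complexity of approximating the Wasserstein barycenter of m discrete probability measures with two algorithms based on entropic regularization: Iterative Bregman Projections (IBP) and accelerated dual gradient descent. IBP needs O(mn^2/ε^2) operations to reach accuracy ε, while accelerated gradient descent needs O(mn^{2.5}/ε), a better dependence on ε at the cost of an extra √n factor. In both methods the regularization parameter must be proportional to ε, which makes them unstable when high accuracy is required. The paper also proposes a proximal-IBP algorithm to mitigate this issue and analyzes scalability in centralized and decentralized distributed settings.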

ABSTRACT

We study the complexity of approximating the Wasserstein barycenter of $m$ discrete measures, or histograms of size $n$, by contrasting two alternative approaches, both using entropic regularization. The first approach is based on the Iterative Bregman Projections (IBP) algorithm, for which our novel analysis gives a complexity bound proportional to $\frac{mn^2}{\varepsilon^2}$ to approximate the original non-regularized barycenter. Using an alternative accelerated-gradient-descent-based approach, we obtain a complexity proportional to $\frac{mn^{2.5}}{\varepsilon}$. As a byproduct, we show that the regularization parameter in both approaches has to be proportional to $\varepsilon$, which causes instability of both algorithms when the desired accuracy is high. To overcome this issue, we propose a novel proximal-IBP algorithm, which can be seen as a proximal gradient method that uses IBP on each iteration to make a proximal step. We also consider the question of scalability of these algorithms using approaches from distributed optimization and show that the first algorithm can be implemented in a centralized distributed setting (master/slave), while the second one is amenable to a more general decentralized distributed setting with an arbitrary network topology.

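Motivation and Objectives

Analyze the computational complexity of approximating the Wasserstein barycenter of m discrete probability measures, each of size n.
Compare two algorithms based on entropic regularization: Iterative Bregman Projections (IBP) and accelerated dual gradient descent.
Determine how to choose the regularization parameter γ so that the result is an ε-accurate approximation of the non-regularized barycenter.
Study the scalability of both algorithms in centralized and decentralized distributed computing settings.
Propose a new proximal-IBP algorithm that improves numerical stability by mitigating the adverse effects of a small regularization parameter.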
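A commonly cited reason why a small γ hurts IBP-type schemes is that they work with the Gibbs kernel exp(-C/γ), whose entries underflow in floating point once γ is small relative to the costs. The sketch below only illustrates that effect in float64; the helper name `gibbs_entry` and the unit cost are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gibbs_entry(cost, gamma):
    """One entry exp(-cost / gamma) of the Gibbs kernel, evaluated in float64."""
    return float(np.exp(-cost / gamma))

# Hypothetical unit transport cost between two support points.
unit_cost = 1.0
for gamma in (1e-1, 1e-2, 1e-3):
    entry = gibbs_entry(unit_cost, gamma)
    log_entry = -unit_cost / gamma  # the logarithm of the entry stays finite
    print(f"gamma={gamma}: kernel entry={entry}, log of entry={log_entry}")
```

At γ = 1e-3 the kernel entry is exactly zero in float64, so any later division by a kernel-vector product can fail, while the log-domain value is still representable.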

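Proposed Method

Analyzes the IBP algorithm for the regularized Wasserstein barycenter problem; the convergence-rate analysis gives a complexity of O(mn^2/ε^2) for accuracy ε.
Proposes an accelerated dual gradient descent method with complexity O(mn^{2.5}/ε), which improves the dependence on ε from 1/ε^2 to 1/ε relative to IBP at the cost of an extra √n factor.
Shows that the regularization parameter γ must be proportional to ε to guarantee accuracy ε, and that this choice causes numerical instability at high accuracy.
Introduces the proximal-IBP algorithm, which uses IBP to compute the proximal step inside a proximal gradient framework in order to stabilize convergence.
Analyzes the scalability of both methods in distributed settings: IBP fits a centralized (master/slave) architecture, while the accelerated method supports an arbitrary network topology.
Uses graph sparsification to control the condition number χ(W) and the number of nonzero entries of the communication matrix, enabling efficient distributed computation.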
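To make the IBP step concrete, here is a minimal NumPy sketch of the standard iterative Bregman projections scheme for an entropy-regularized barycenter. It is a textbook-style version and not necessarily the exact variant analyzed in the paper; the function name `ibp_barycenter`, the uniform default weights, the fixed iteration count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def ibp_barycenter(hists, cost, gamma, weights=None, n_iter=500):
    """Approximate the entropy-regularized barycenter of the rows of `hists`.

    hists : (m, n) array, each row a histogram on the same n support points
    cost  : (n, n) ground-cost matrix
    gamma : entropic regularization parameter (> 0)
    n_iter: number of IBP sweeps (must be >= 1)
    """
    hists = np.asarray(hists, dtype=float)
    m, n = hists.shape
    w = np.full(m, 1.0 / m) if weights is None else np.asarray(weights, dtype=float)
    K = np.exp(-np.asarray(cost, dtype=float) / gamma)  # Gibbs kernel
    v = np.ones((m, n))
    for _ in range(n_iter):
        u = hists / (v @ K.T)        # row l: q_l / (K v_l), enforces the fixed marginal q_l
        Ktu = u @ K                  # row l: K^T u_l
        p = np.exp(w @ np.log(Ktu))  # weighted geometric mean, the shared marginal
        v = p / Ktu                  # row l: p / (K^T u_l), enforces the shared marginal
    return p / p.sum()               # renormalize to absorb the residual error

# Toy example (assumed data): two mirrored histograms on a 5-point grid.
x = np.linspace(0.0, 1.0, 5)
cost = (x[:, None] - x[None, :]) ** 2
q = np.array([[0.8, 0.2, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.2, 0.8]])
print(ibp_barycenter(q, cost, gamma=0.05))
```

Each sweep performs two products with the n×n kernel per measure, which is where the O(mn^2) per-iteration cost comes from. Because the kernel is formed explicitly, this sketch inherits the small-γ instability discussed in the summary.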

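Results

Research Questions

RQ1: What is the computational complexity of approximating the Wasserstein barycenter with the entropy-regularized Iterative Bregman Projections (IBP) algorithm?
RQ2: Can an accelerated gradient descent method achieve a better complexity bound than IBP on the same problem?
RQ3: How should the entropic regularization parameter γ be chosen to guarantee an ε-accurate approximation of the non-regularized barycenter?
RQ4: How do these algorithms scale in distributed computing settings?
RQ5: Can a proximal variant of IBP improve numerical stability when γ must be small to reach high accuracy?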

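Key Findings

IBP reaches accuracy ε in O(mn^2/ε^2) operations; the per-iteration cost is dominated by O(n^2) matrix-vector products for each of the m measures.
Accelerated dual gradient descent has complexity O(mn^{2.5}/ε), a clear improvement over IBP in the dependence on ε.
To reach accuracy ε, the regularization parameter γ must be proportional to ε, which makes both algorithms numerically unstable at high accuracy.
The proposed proximal-IBP algorithm uses IBP as the proximal step, which stabilizes convergence and mitigates the instability caused by a small γ.
IBP suits a centralized distributed setting and needs O(1/ε^2) communication rounds, while the accelerated method supports decentralized networks and needs O(√n/ε) communication rounds.
With graph sparsification, the communication matrix can be compressed so that χ(W) = O(Poly(ln m)) and nnz(W) = O(m·Poly(ln m)), which supports an efficient distributed implementation.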
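The two bounds can be compared with simple arithmetic: their ratio is (mn^2/ε^2) / (mn^{2.5}/ε) = 1/(ε√n), so the accelerated bound is smaller only when ε < 1/√n. The sketch below evaluates both expressions while ignoring constants and any logarithmic factors; the function names and the sample values of m, n, and ε are illustrative assumptions.

```python
import math

def ibp_ops(m, n, eps):
    """Leading term of the IBP bound, m * n^2 / eps^2 (constants dropped)."""
    return m * n**2 / eps**2

def accelerated_ops(m, n, eps):
    """Leading term of the accelerated bound, m * n^2.5 / eps (constants dropped)."""
    return m * n**2.5 / eps

def crossover_eps(n):
    """Accuracy at which the two leading terms coincide: eps = 1 / sqrt(n)."""
    return 1.0 / math.sqrt(n)

m, n = 10, 100  # hypothetical problem size
for eps in (0.5, 0.1, 0.01):
    ratio = ibp_ops(m, n, eps) / accelerated_ops(m, n, eps)
    print(f"eps={eps}: IBP bound / accelerated bound = {ratio:.2f}")
print("crossover accuracy:", crossover_eps(n))
```

Under these assumptions, the accelerated method pays off only in the high-accuracy regime, which is also the regime where the required γ becomes small.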

This review was generated by AI and reviewed by a human editor.