QUICK REVIEW

[Paper Digest] Sample Complexity of Sinkhorn divergences

Aude Genevay, Lénaïc Chizat|arXiv (Cornell University)|Oct 5, 2018

Probabilistic and Robust Engineering Design | Cited by 142

One-sentence summary

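This paper derives a new sample complexity bound for Sinkhorn divergences by reformulating them in an RKHS; the bound has a 1/√n rate with a constant that depends on the regularization strength, which bridges OT and MMD.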

ABSTRACT

Optimal transport (OT) and maximum mean discrepancies (MMD) are now routinely used in machine learning to compare probability measures. We focus in this paper on \emph{Sinkhorn divergences} (SDs), a regularized variant of OT distances which can interpolate, depending on the regularization strength $\varepsilon$, between OT ($\varepsilon=0$) and MMD ($\varepsilon=\infty$). Although the tradeoff induced by that regularization is now well understood computationally (OT, SDs and MMD require respectively $O(n^3\log n)$, $O(n^2)$ and $n^2$ operations given a sample size $n$), much less is known in terms of their \emph{sample complexity}, namely the gap between these quantities, when evaluated using finite samples \emph{vs.} their respective densities. Indeed, while the sample complexity of OT and MMD stand at two extremes, $1/n^{1/d}$ for OT in dimension $d$ and $1/\sqrt{n}$ for MMD, that for SDs has only been studied empirically. In this paper, we \emph{(i)} derive a bound on the approximation error made with SDs when approximating OT as a function of the regularizer $\varepsilon$, \emph{(ii)} prove that the optimizers of regularized OT are bounded in a Sobolev (RKHS) ball independent of the two measures and \emph{(iii)} provide the first sample complexity bound for SDs, obtained by reformulating SDs as a maximization problem in a RKHS. We thus obtain a scaling in $1/\sqrt{n}$ (as in MMD), with a constant that depends however on $\varepsilon$, making the bridge between OT and MMD complete.

Motivation and objectives

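Motivate the study of the sample complexity of regularized OT in high-dimensional settings.
Derive a bound on the approximation error between regularized OT and standard OT as a function of the entropic regularization parameter ε.
Show that the Sinkhorn optimizers lie in a Sobolev (RKHS) ball that does not depend on the input measures.
Reformulate Sinkhorn divergences as the maximization of an expectation over an RKHS.
Provide the first explicit sample complexity bound for Sinkhorn divergences and relate it to those of MMD and OT.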

Proposed method

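Bound the gap between entropy-regularized OT Wε and unregularized OT W as a function of ε.
Prove that the Sinkhorn potentials are bounded in a Sobolev (RKHS) space, independently of the marginals.
Reformulate SDs as the maximization of an expectation over an RKHS, which makes kernel SGD applicable.
Apply RKHS-based PAC learning (the Bartlett–Mendelson framework) to obtain a 1/√n convergence rate for the empirical SD.
Derive the ε-dependent constants and their asymptotic behavior, and state a concentration corollary.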
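
For reference, the objects named in the steps above can be written out as follows. This is a sketch in the standard notation of the entropic OT literature, not a quotation of the paper: the symbols π, Π(α,β), c and H do not appear in this digest and are assumptions introduced here, and the paper's exact normalization may differ.

```latex
\begin{align*}
% Primal: entropy-regularized OT, with relative entropy w.r.t. the product measure
W_\varepsilon(\alpha,\beta)
  &= \min_{\pi \in \Pi(\alpha,\beta)} \int c(x,y)\,\mathrm{d}\pi(x,y)
     + \varepsilon\, H(\pi \mid \alpha\otimes\beta),
  \qquad
  H(\pi \mid \alpha\otimes\beta)
   = \int \log\!\Big(\frac{\mathrm{d}\pi}{\mathrm{d}\alpha\,\mathrm{d}\beta}\Big)\,\mathrm{d}\pi, \\
% Dual: maximization of an expectation over the potentials (u, v)
W_\varepsilon(\alpha,\beta)
  &= \max_{u,v}\; \mathbb{E}_{(X,Y)\sim\alpha\otimes\beta}
     \Big[ u(X) + v(Y) - \varepsilon\, e^{(u(X)+v(Y)-c(X,Y))/\varepsilon} \Big]
     + \varepsilon, \\
% Sinkhorn divergence: the debiased version of W_eps
\bar W_\varepsilon(\alpha,\beta)
  &= W_\varepsilon(\alpha,\beta)
     - \tfrac{1}{2}\big( W_\varepsilon(\alpha,\alpha) + W_\varepsilon(\beta,\beta) \big).
\end{align*}
```

The dual is an expectation of a fixed function of (u, v) under α⊗β. Restricting (u, v) to an RKHS ball therefore turns the estimation problem into one that standard learning-theory tools can handle.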

Experimental results

Research questions

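RQ1: How does the entropic regularization parameter ε affect the approximation error Wε − W between regularized and standard OT?
RQ2: Can the Sinkhorn potentials be bounded in a Sobolev/RKHS ball that is independent of the marginals, so that RKHS-based optimization becomes possible?
RQ3: What is the sample complexity of Sinkhorn divergences when they are estimated from finite samples, and how does it scale with n and ε?
RQ4: In terms of statistical efficiency, how do SDs interpolate between OT (ε→0) and MMD (ε→∞)?
RQ5: What are the practical implications for kernel SGD and RKHS-based optimization when computing SDs?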
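
To make the finite-sample quantity in RQ3 concrete, here is a minimal sketch of how an empirical Sinkhorn divergence can be computed from two point clouds. It is not the authors' code. The squared Euclidean cost, the uniform weights, the fixed iteration count and the helper names (`sinkhorn_cost`, `sinkhorn_divergence`, `sample_gaussian`) are all assumptions made for illustration. It uses only the standard library for portability; in practice the loops would be vectorized.

```python
import math
import random


def _logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn_cost(x, y, eps, n_iter=200):
    """Approximate W_eps between the uniform empirical measures on x and y.

    x, y: lists of points (each point a sequence of floats).
    Cost: squared Euclidean distance (an illustrative assumption).
    """
    n, m = len(x), len(y)
    cost = [[sum((p - q) ** 2 for p, q in zip(xi, yj)) for yj in y] for xi in x]
    log_a, log_b = -math.log(n), -math.log(m)  # log of the uniform weights
    u, v = [0.0] * n, [0.0] * m
    for _ in range(n_iter):
        # Log-domain Sinkhorn updates: alternating maximization of the dual.
        u = [-eps * _logsumexp([log_b + (v[j] - cost[i][j]) / eps
                                for j in range(m)])
             for i in range(n)]
        v = [-eps * _logsumexp([log_a + (u[i] - cost[i][j]) / eps
                                for i in range(n)])
             for j in range(m)]
    # Right after the v-update, the exponential term of the dual equals eps
    # and cancels the "+ eps" constant, leaving <u, a> + <v, b>.
    return sum(u) / n + sum(v) / m


def sinkhorn_divergence(x, y, eps, n_iter=200):
    """Debiased divergence: W_eps(x, y) - (W_eps(x, x) + W_eps(y, y)) / 2."""
    return (sinkhorn_cost(x, y, eps, n_iter)
            - 0.5 * (sinkhorn_cost(x, x, eps, n_iter)
                     + sinkhorn_cost(y, y, eps, n_iter)))


def sample_gaussian(n, dim, mean, rng):
    """Draw n points whose coordinates are i.i.d. N(mean, 1)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = sample_gaussian(30, 2, 0.0, rng)
    y = sample_gaussian(30, 2, 3.0, rng)
    print(sinkhorn_divergence(x, y, eps=1.0))
```

Rerunning the last lines with growing sample sizes and several values of ε is one way to observe empirically how the estimate fluctuates with n and ε.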

Key findings

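Wε(α,β) − W(α,β) ≤ 2ε d log(e^2 L D /(√d ε)), which behaves like 2ε d log(1/ε) as ε→0; here L is the Lipschitz constant of the cost and D bounds the diameter of the supports.
The Sinkhorn potentials (u,v) are uniformly bounded in the Sobolev space H^s(R^d), with norm O(1+1/ε^{s−1}).
The optimizers of the regularized OT problem lie in an RKHS ball that does not depend on the measures, which makes kernel-based optimization possible.
The empirical Sinkhorn divergence converges to its population value at rate O(1/√n); the constant grows like exp(κ/ε)/ε^{⌊d/2⌋} for small ε and is independent of ε for large ε.
The PAC/RKHS analysis bounds the error of the empirical SD: E|Wε(α,β) − Wε(α̂n,β̂n)| = O((e^{κ/ε}/√n)(1+1/ε^{⌊d/2⌋})).
Corollaries include a concentration bound that gives high-probability control of the empirical error.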
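
The two bounds above are easy to evaluate numerically, which helps build intuition for the trade-off in ε. The sketch below is only illustrative. The function names are hypothetical, κ is treated as a plain input (the digest does not define it), and the O(·) in the sample complexity bound hides a multiplicative constant, so the second function returns the shape of the bound rather than a guarantee.

```python
import math


def approximation_gap_bound(eps, d, L, D):
    """Upper bound 2*eps*d*log(e^2 * L * D / (sqrt(d) * eps)) on W_eps - W."""
    return 2.0 * eps * d * math.log(math.e ** 2 * L * D / (math.sqrt(d) * eps))


def sample_complexity_factor(eps, d, n, kappa):
    """Shape of the statistical error: (e^{kappa/eps}/sqrt(n)) * (1 + eps^{-floor(d/2)}).

    kappa is an input here (assumption); the hidden constant of the O(.)
    is not included.
    """
    return math.exp(kappa / eps) / math.sqrt(n) * (1.0 + eps ** (-(d // 2)))


if __name__ == "__main__":
    # Illustrative values only: d = 2, L = D = kappa = 1, n = 1000.
    for eps in (1.0, 0.5, 0.1):
        print(eps,
              approximation_gap_bound(eps, d=2, L=1.0, D=1.0),
              sample_complexity_factor(eps, d=2, n=1000, kappa=1.0))
```

As ε shrinks, the first quantity (bias with respect to OT) decreases while the second (statistical error) grows, which is the interpolation between OT and MMD that the paper describes.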

This digest was generated by AI and reviewed by human editors.