QUICK REVIEW

[Paper Review] On the rate of convergence in Wasserstein distance of the empirical measure

Nicolas Fournier, Arnaud Guillin | arXiv (Cornell University) | Dec 7, 2013
Point processes and geometric inequalities | 43 references | 930 citations
TL;DR

This paper establishes non-asymptotic $L^p$-moment bounds and concentration inequalities for the convergence rate of the empirical measure to the true distribution in Wasserstein distance of order $p>0$. It derives sharp rates depending on dimension $d$, moment conditions of the distribution, and extends results to dependent sequences and particle systems, showing optimal rates under minimal moment assumptions.

ABSTRACT

Let $\mu_N$ be the empirical measure associated to a $N$-sample of a given probability distribution $\mu$ on $\mathbb{R}^d$. We are interested in the rate of convergence of $\mu_N$ to $\mu$, when measured in the Wasserstein distance of order $p>0$. We provide some satisfying non-asymptotic $L^p$-bounds and concentration inequalities, for any values of $p>0$ and $d\geq 1$. We extend also the non asymptotic $L^p$-bounds to stationary $\rho$-mixing sequences, Markov chains, and to some interacting particle systems.

Motivation & Objective

  • To quantify the rate of convergence of the empirical measure $\mu_N$ to the true measure $\mu$ in Wasserstein distance $\mathcal{W}_p$ for $p>0$.
  • To derive non-asymptotic $L^p$-moment bounds and concentration inequalities valid for all $N \geq 1$, not just in the limit.
  • To extend the results beyond i.i.d. samples to $\rho$-mixing sequences, Markov chains, and McKean-Vlasov particle systems.
  • To identify the interplay between dimension $d$, moment conditions on $\mu$, and the convergence rate in $\mathcal{W}_p$.

Proposed method

  • Uses moment conditions $M_q(\mu) < \infty$ for $q > p$ and exponential moments $\mathcal{E}_{\alpha,\gamma}(\mu)$ to control the Wasserstein distance.
  • Applies techniques from Dereich, Scheutzow, and Schottstedt (2013) to derive sharp bounds on $\mathbb{E}[\mathcal{T}_p(\mu_N, \mu)]$.
  • Employs covariance decay estimates and Hölder inequalities to handle dependent processes such as Markov chains and $\rho$-mixing sequences.
  • Uses a dyadic partitioning and covering argument to control the Wasserstein distance via $L^p$-norms of empirical measures on small sets.
  • Applies the propagation of chaos framework to McKean-Vlasov particle systems, comparing the empirical measure of the particle system to the nonlinear SDE solution.
  • Combines moment bounds with known $L^2$-convergence rates of particle systems to derive overall convergence rates in $\mathcal{W}_2$.
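
The review uses $M_q(\mu)$, $\mathcal{E}_{\alpha,\gamma}(\mu)$, and $\mathcal{T}_p$ without defining them. The block below is a reading of the usual definitions, written out as an assumption rather than quoted from the paper; the symbol $\Pi(\mu,\nu)$ for the set of couplings is notation introduced here for illustration.

```latex
% Assumed definitions (not spelled out in the review itself).
\[
  M_q(\mu) = \int_{\mathbb{R}^d} |x|^q \,\mu(dx),
  \qquad
  \mathcal{E}_{\alpha,\gamma}(\mu) = \int_{\mathbb{R}^d} e^{\gamma |x|^{\alpha}} \,\mu(dx),
\]
\[
  \mathcal{T}_p(\mu,\nu)
  = \inf_{\xi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} |x-y|^p \,\xi(dx,dy),
\]
% Pi(mu, nu): probability measures on R^d x R^d with marginals mu and nu.
\[
  \mathcal{W}_p(\mu,\nu) =
  \begin{cases}
    \mathcal{T}_p(\mu,\nu)^{1/p} & \text{if } p \geq 1, \\
    \mathcal{T}_p(\mu,\nu)       & \text{if } 0 < p \leq 1.
  \end{cases}
\]
```

Under this reading, a bound on $\mathbb{E}[\mathcal{T}_p(\mu_N,\mu)]$ is a bound on the $p$-th power of $\mathcal{W}_p$ when $p \geq 1$, so the two coincide only at $p = 1$.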

Experimental results

Research questions

  • RQ1: What is the non-asymptotic rate of convergence of the empirical measure $\mu_N$ to $\mu$ in $\mathcal{W}_p$ for $p>0$?
  • RQ2: How do the convergence rates depend on the dimension $d$, the moment order $p$, and the tail behavior of $\mu$?
  • RQ3: Can the $L^p$-moment bounds be extended to dependent processes such as $\rho$-mixing sequences and Markov chains?
  • RQ4: What is the convergence rate of particle systems approximating McKean-Vlasov SDEs in $\mathcal{W}_2$?
  • RQ5: How do the bounds compare to known lower bounds in specific cases like discrete or uniform distributions?

Key findings

  • For $p > d/2$, the rate is $O(N^{-1/2} + N^{-(q-p)/q})$ under $M_q(\mu) < \infty$ for $q > p$, with logarithmic corrections at $p = d/2$.
  • For $p < d/2$, the rate is $O(N^{-p/d} + N^{-(q-p)/q})$, matching the known lower bound for uniform measures on $[-1,1]^d$, which is $\Omega(N^{-p/d})$.
  • In the case $p = d/2 = 1$, the upper bound is $O(N^{-1/2}\log(1+N))$; for the uniform distribution, the Ajtai-Komlós-Tusnády result gives the rate $(\log N / N)^{1/2}$, so the bound matches it up to a factor of order $\sqrt{\log N}$.
  • For $\rho$-mixing sequences and ergodic Markov chains, the rate remains $O(N^{-1/2})$ under $L^r$-integrability of the initial distribution and geometric ergodicity.
  • For McKean-Vlasov particle systems, the overall rate in $\mathcal{W}_2$ is $O(\alpha(N) + \beta(N))$, where $\alpha(N) = N^{-1}$ (log-Sobolev case) or $N^{-1/(\alpha-1)}$ (polynomial potential), and $\beta(N)$ is the i.i.d. rate depending on $d$.
  • The bounds are sharp: lower bounds of order $N^{-1/2}$ exist for any measure with separated atoms, and $N^{-p/d}$ for uniform measures, confirming the optimality of the derived rates.
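
To make the three regimes above concrete, here is a minimal Python sketch. The function names (`rate_shape`, `w1_to_uniform`, `mean_w1`) are hypothetical helpers introduced for illustration, not anything from the paper. `rate_shape` evaluates the shape of the i.i.d. upper bound with all constants dropped, and `w1_to_uniform` computes the exact $\mathcal{W}_1$ distance between a 1-D empirical measure and Uniform$[0,1]$ via the CDF formula $\mathcal{W}_1 = \int |F_N(x) - F(x)|\,dx$, which is a standard 1-D identity rather than the paper's dyadic-partition argument. The choice $q = 4$ in the demo is arbitrary.

```python
import math
import numpy as np


def rate_shape(n, p, d, q):
    """Shape of the i.i.d. upper bound, constants dropped (illustrative helper)."""
    if q <= p:
        raise ValueError("the bound requires a finite moment of order q > p")
    tail = n ** (-(q - p) / q)              # term driven by the moment condition
    if 2 * p > d:
        main = n ** -0.5                    # regime p > d/2
    elif 2 * p == d:
        main = n ** -0.5 * math.log(1 + n)  # regime p = d/2 (log correction)
    else:
        main = n ** (-p / d)                # regime p < d/2
    return main + tail


def w1_to_uniform(samples):
    """Exact W_1 between the empirical measure of `samples` and Uniform[0, 1].

    Assumes every sample lies in [0, 1]. Uses W_1 = integral of |F_N(x) - x|
    over [0, 1], where F_N is the empirical CDF (piecewise constant).
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    t = np.concatenate(([0.0], x, [1.0]))   # breakpoints of F_N
    levels = np.arange(n + 1) / n           # F_N equals i/n on [t_i, t_{i+1})
    a, b = t[:-1], t[1:]

    def antiderivative(u, c):
        # Antiderivative of |u - c| with respect to u.
        return 0.5 * (u - c) * np.abs(u - c)

    return float(np.sum(antiderivative(b, levels) - antiderivative(a, levels)))


def mean_w1(n, trials=200, seed=0):
    """Monte Carlo estimate of E[W_1(mu_N, Uniform[0, 1])]."""
    rng = np.random.default_rng(seed)
    return float(np.mean([w1_to_uniform(rng.random(n)) for _ in range(trials)]))


if __name__ == "__main__":
    # Here p = 1 and d = 1, so p > d/2. If the N^{-1/2} rate holds,
    # est * sqrt(n) should stay roughly constant as n grows.
    for n in (100, 400, 1600):
        est = mean_w1(n)
        print(n, est, est * math.sqrt(n), rate_shape(n, p=1, d=1, q=4))
```

The simulation only probes the regime $p > d/2$ with $p = 1$, where $\mathcal{W}_1$ and $\mathcal{T}_1$ coincide. Checking the $N^{-p/d}$ regime numerically would need a multi-dimensional optimal transport solver, which this sketch leaves out.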

This review was created by AI and reviewed by human editors.