QUICK REVIEW

[Paper Review] Density Estimation for Shift-Invariant Multidimensional Distributions

Anindya De, Philip M. Long|arXiv (Cornell University)|Nov 9, 2018

Machine Learning and Algorithms|References: 46|Cited by: 2

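One-Sentence Summary

This paper introduces a novel smoothness condition, shift-invariance, for multidimensional density estimation, which allows distributions with jump discontinuities to be learned efficiently. It gives an efficient algorithm that learns d-dimensional shift-invariant distributions with exponential tail decay to total variation error ε using Õ_d(1/ε^{d+2}) samples and Õ_d(1/ε^{2d+2}) time, and the algorithm still achieves O(ε) error under Huber's contamination model. The results are close to best possible: the paper proves an information-theoretic lower bound of Ω(1/ε^d) on sample complexity.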

ABSTRACT

We study density estimation for classes of shift-invariant distributions over R^d. A multidimensional distribution is "shift-invariant" if, roughly speaking, it is close in total variation distance to a small shift of it in any direction. Shift-invariance relaxes smoothness assumptions commonly used in non-parametric density estimation to allow jump discontinuities. The different classes of distributions that we consider correspond to different rates of tail decay. For each such class we give an efficient algorithm that learns any distribution in the class from independent samples with respect to total variation distance. As a special case of our general result, we show that d-dimensional shift-invariant distributions which satisfy an exponential tail bound can be learned to total variation distance error epsilon using O~_d(1/ epsilon^{d+2}) examples and O~_d(1/ epsilon^{2d+2}) time. This implies that, for constant d, multivariate log-concave distributions can be learned in O~_d(1/epsilon^{2d+2}) time using O~_d(1/epsilon^{d+2}) samples, answering a question of [Diakonikolas et al., 2016]. All of our results extend to a model of noise-tolerant density estimation using Huber's contamination model, in which the target distribution to be learned is a (1-epsilon,epsilon) mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error O(epsilon) from the target distribution. We show that our general results are close to best possible by proving a simple Omega (1/epsilon^d) information-theoretic lower bound on sample complexity even for learning bounded distributions that are shift-invariant.
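
To make the "small shift" idea concrete, the sketch below numerically checks a one-dimensional, simplified version of it: the L1 change caused by shifting a density by κ, divided by κ. It is only in the spirit of the shift-invariance notion described in the abstract; the paper's exact definition and normalization are not reproduced here, and the function names, grid range, and step count are assumptions made for this illustration. The example uses the uniform density on [0, 1], which has jump discontinuities at both endpoints.

```python
def uniform_density(x):
    """Density of the uniform distribution on [0, 1]; it jumps at 0 and at 1."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def l1_shift_distance(density, shift, lo=-2.0, hi=3.0, steps=100_000):
    """Midpoint-rule approximation of the integral of |f(x + shift) - f(x)| over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += abs(density(x + shift) - density(x))
    return total * width


def shift_ratio(density, kappa):
    """L1 change caused by a shift of size kappa, divided by kappa."""
    return l1_shift_distance(density, kappa) / kappa


# The ratio should stay bounded as kappa shrinks, despite the jumps in the density.
ratios = {kappa: shift_ratio(uniform_density, kappa) for kappa in (0.5, 0.1, 0.01)}
for kappa, ratio in ratios.items():
    print(f"kappa={kappa}: ratio is about {ratio:.3f}")
```

For this density the shifted and unshifted copies differ only on two intervals of length κ, so the exact L1 change is 2κ and the ratio is 2 for every κ ≤ 1. A density with jumps can therefore still have a bounded ratio, which is the behavior that a derivative-based smoothness assumption would rule out.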

Motivation and Objectives

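Introduce a new smoothness condition, shift-invariance, that relaxes the smoothness assumptions common in non-parametric density estimation (such as Sobolev and Besov conditions) and allows jump discontinuities in multidimensional densities.
Design an efficient learning algorithm for shift-invariant distributions with controlled tail decay, making density estimation practical beyond smooth and parametric models.
Establish nearly tight sample and time complexity bounds for learning such distributions in total variation distance.
Extend the framework to noise-tolerant learning under Huber's contamination model, in which the target is a mixture of a distribution in the class and an arbitrary outlier distribution.
Show that the bounds are close to optimal by proving an information-theoretic lower bound of Ω(1/ε^d) on sample complexity.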

Proposed Method

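Introduce a quantitative measure of shift-invariance, SI(f, v, κ), which captures the average rate at which the density f changes under small shifts along direction v at scale κ.
Define the class CSI(c, d, g) of d-dimensional densities that are shift-invariant (SI(f, κ) ≤ c for all κ > 0) and whose tail decay is controlled by a nonincreasing function g.
Build the hypothesis distribution by applying kernel smoothing to the empirical distribution, using shift-invariance to control the bias and the variance.
For the sample complexity lower bound, discretize the domain into unit cubes and construct a family of piecewise-constant densities.
Apply information-theoretic tools, including Kullback-Leibler divergence and total variation distance, in a packing argument over the carefully constructed family of densities to derive the lower bound.
Extend the framework to Huber's contamination model by showing that the same algorithm still achieves O(ε) error when the data come from a (1−ε, ε) mixture of a distribution in the class and an arbitrary outlier distribution.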
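
The kernel-smoothing and contamination steps above can be illustrated with a toy one-dimensional experiment. This is a generic box-kernel estimator, not the authors' algorithm, whose details are not given in this review. The inlier distribution (uniform on [0, 1]), the outlier location 5.0, the bandwidth, the sample size, and all function names are assumptions chosen for illustration.

```python
import bisect
import random


def sample_contaminated(n, eps, rng):
    """Draw n points from a (1 - eps, eps) Huber mixture (illustrative choices).

    Inliers are uniform on [0, 1], a density with jump discontinuities.
    Outliers all sit at 5.0, an arbitrary stand-in for "any other distribution".
    """
    return [5.0 if rng.random() < eps else rng.random() for _ in range(n)]


def box_kernel_estimate(samples, bandwidth):
    """Return a function x -> density estimate using a box kernel of half-width bandwidth."""
    ordered = sorted(samples)
    n = len(ordered)

    def density(x):
        # Count samples in [x - bandwidth, x + bandwidth] via binary search.
        lo = bisect.bisect_left(ordered, x - bandwidth)
        hi = bisect.bisect_right(ordered, x + bandwidth)
        return (hi - lo) / (n * 2.0 * bandwidth)

    return density


def uniform_density(x):
    """Clean inlier density: uniform on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def tv_distance(f, g, lo, hi, steps=20_000):
    """Midpoint-rule approximation of (1/2) * integral of |f - g| over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += abs(f(x) - g(x))
    return 0.5 * total * width


rng = random.Random(0)  # fixed seed so the run is repeatable
eps = 0.05
samples = sample_contaminated(20_000, eps, rng)
estimate = box_kernel_estimate(samples, bandwidth=0.05)
tv_error = tv_distance(estimate, uniform_density, lo=-1.0, hi=6.0)
print(f"approximate TV error against the clean density: {tv_error:.3f}")
```

In this toy setup the error against the clean density is made up of roughly three parts: the contaminated mass (about ε), the smoothing bias near the two jumps (which shrinks with the bandwidth), and sampling noise (which shrinks with more samples). Choosing the bandwidth to balance bias against noise is the kind of trade-off that shift-invariance is used to control in the paper; the rates quoted in this review are not derived by this sketch.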

Results

Research Questions

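RQ1: Can shift-invariance serve as a smoothness condition under which distributions with jump discontinuities admit efficient density estimation, while still capturing light-tailed behavior?
RQ2: What are the optimal sample and time complexities for learning d-dimensional shift-invariant distributions with exponential tail decay?
RQ3: Can the learning framework be extended to tolerate adversarial contamination of the data, as in Huber's contamination model?
RQ4: How close are the sample complexity bounds to the information-theoretic limit for this class of distributions?
RQ5: Is the shift-invariance condition general enough to include important distributions such as isotropic log-concave distributions and multivariate normal distributions?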

Key Findings

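The paper shows that d-dimensional shift-invariant distributions with exponential tail decay can be learned to total variation error ε using Õ_d(1/ε^{d+2}) samples and Õ_d(1/ε^{2d+2}) time, which is efficient when d is constant.
As a special case, multivariate log-concave distributions can be learned in Õ_d(1/ε^{2d+2}) time using Õ_d(1/ε^{d+2}) samples, answering a question raised in DKS16b (Diakonikolas et al., 2016).
The framework is robust under Huber's contamination model: even when the target is a (1−ε, ε) mixture of a distribution in the class and an arbitrary outlier distribution, the algorithm still achieves O(ε) total variation error.
The sample complexity is close to optimal: the paper proves a lower bound of Ω(1/ε^d) on the sample complexity of learning bounded shift-invariant distributions.
The class CSI(c, d, g) is broad enough to include key distributions such as isotropic log-concave distributions and multivariate normal distributions, while still admitting efficient learning.
The lower-bound construction uses a family of piecewise-constant densities on a discretized domain whose pairwise total variation distances are Ω(ε) and whose KL divergences are O(1), which yields the Ω((1/ε)^d) sample complexity lower bound.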
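
To see how far apart the upper and lower sample bounds are, the following sketch compares only their leading polynomial terms in 1/ε. Constants that depend on d and the logarithmic factors hidden by the Õ_d notation are ignored, so the numbers indicate scaling only and are not actual sample counts. The function names are hypothetical.

```python
def bound_exponents(d):
    """Exponents of 1/eps in the bounds quoted above (constants and log factors dropped)."""
    return {"samples_upper": d + 2, "time_upper": 2 * d + 2, "samples_lower": d}


def polynomial_scale(eps, exponent):
    """Leading term (1/eps)^exponent."""
    return (1.0 / eps) ** exponent


def sample_gap(eps, d):
    """Ratio of the upper to the lower sample bound's leading term."""
    exps = bound_exponents(d)
    upper = polynomial_scale(eps, exps["samples_upper"])
    lower = polynomial_scale(eps, exps["samples_lower"])
    return upper / lower


for d in (1, 2, 3):
    exps = bound_exponents(d)
    print(f"d={d}: exponents {exps}, sample gap at eps=0.1 is about {sample_gap(0.1, d):.1f}")
```

The exponents d + 2 and d differ by 2 regardless of the dimension, so the ratio of the leading terms is (1/ε)^2 for every d. This is the sense in which the upper bound is close to, but does not exactly match, the Ω(1/ε^d) lower bound.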

This review was generated by AI and checked by a human editor.