QUICK REVIEW

[Paper Review] Optimal Bounds for Private Minimum Spanning Trees via Input Perturbation

Rasmus Pagh, Lukas Retschmeier|arXiv (Cornell University)|Dec 13, 2024

Mobile Ad Hoc Networks | Cited by 1

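One-Sentence Summary

This paper proposes an input-perturbation framework that achieves optimal error bounds for minimum spanning trees (MST) under edge-weight differential privacy. By adding calibrated noise to the input weights before running any non-private MST algorithm, the method guarantees $(\varepsilon,\delta)$-differential privacy, preserves the running time of the underlying MST algorithm, and attains the optimal error of $\tilde{O}(n^{3/2})$, resolving an open question about the efficiency-utility trade-off in private MST computation.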

ABSTRACT

We study the problem of privately releasing an approximate minimum spanning tree (MST). Given a graph $G = (V, E, \vec{W})$ where $V$ is a set of $n$ vertices, $E$ is a set of $m$ undirected edges, and $\vec{W} \in \mathbb{R}^{|E|}$ is an edge-weight vector, our goal is to publish an approximate MST under edge-weight differential privacy, as introduced by Sealfon in PODS 2016, where $V$ and $E$ are considered public and the weight vector is private. Our neighboring relation is $\ell_\infty$-distance on weights: for a sensitivity parameter $\Delta_\infty$, graphs $G = (V, E, \vec{W})$ and $G' = (V, E, \vec{W}')$ are neighboring if $\|\vec{W}-\vec{W}'\|_\infty \leq \Delta_\infty$. Existing private MST algorithms face a trade-off, sacrificing either computational efficiency or accuracy. We show that it is possible to get the best of both worlds: With a suitable random perturbation of the input that does not suffice to make the weight vector private, the result of any non-private MST algorithm will be private and achieves a state-of-the-art error guarantee. Furthermore, by establishing a connection to Private Top-k Selection [Steinke and Ullman, FOCS '17], we give the first privacy-utility trade-off lower bound for MST under approximate differential privacy, demonstrating that the error magnitude, $\tilde{O}(n^{3/2})$, is optimal up to logarithmic factors. That is, our approach matches the time complexity of any non-private MST algorithm and at the same time achieves optimal error. We complement our theoretical treatment with experiments that confirm the practicality of our approach.
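The neighboring relation in the abstract can be made concrete with a small check. The following is a minimal sketch; the function name `are_neighboring` and the list-based weight representation are illustrative assumptions, not part of the paper.

```python
def are_neighboring(w, w_prime, delta_inf):
    """Return True if two weight vectors over the same public edge set
    differ by at most delta_inf in every coordinate (l_inf distance)."""
    if len(w) != len(w_prime):
        raise ValueError("weight vectors must cover the same edge set")
    # l_inf distance is the largest per-edge difference; 0.0 for an empty graph.
    dist = max((abs(a - b) for a, b in zip(w, w_prime)), default=0.0)
    return dist <= delta_inf
```

Under this relation every edge weight may change at once, as long as each individual change stays within `delta_inf`.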

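Motivation and Objectives

Answer a long-standing open question: whether a differentially private MST algorithm can achieve both linear running time and the optimal error bound.
Close the gap between efficiency and accuracy in existing private MST methods, which sacrifice either computational efficiency (in-place noise addition) or accuracy (input privatization).
Establish the first asymptotically tight lower bound on private MST error under the $\ell_\infty$ neighboring relation, proving the proposed method optimal up to logarithmic factors.
Show that input perturbation, previously considered inferior, achieves optimal utility when carefully calibrated noise is combined with a non-private MST algorithm.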

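Proposed Method

Before running any non-private MST algorithm, add independent Laplace noise, with scale proportional to $1/\varepsilon$, to each edge weight of the input graph.
The noise is calibrated to the $\ell_\infty$ sensitivity so that the released tree satisfies $(\varepsilon,\delta)$-differential privacy. As the abstract notes, the perturbation alone does not make the weight vector private, so the guarantee applies to the tree that is output, not to the perturbed weights.
The framework is compatible with any non-private MST algorithm, including the expected linear-time Karger–Klein–Tarjan algorithm and Chazelle's deterministic near-linear $O(m\,\alpha(m,n))$ algorithm.
A connection to private top-k selection shows that the error magnitude $\tilde{O}(n^{3/2})$ achieved by the method is information-theoretically tight.
The theoretical analysis reduces from private top-k selection and uses a packing argument to prove that the error bound is optimal up to logarithmic factors.
Experiments show that the method's output distribution is comparable to that of state-of-the-art in-place methods such as PAMST, and that it outperforms input privatization on dense graphs.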
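The perturb-then-solve pipeline described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it follows this summary's description of Laplace noise, the parameter `noise_scale` is a hypothetical stand-in for the paper's calibration (which is not given here), Kruskal's algorithm is swapped in as a simple non-private solver, and the code makes no privacy claim on its own.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def kruskal(n, edges, weights):
    """Plain non-private Kruskal; returns indices of the chosen edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for i in sorted(range(len(edges)), key=lambda i: weights[i]):
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append(i)
    return tree

def perturbed_mst(n, edges, weights, noise_scale, seed=None):
    """Add independent noise to every weight, then run the non-private solver.
    `noise_scale` is a hypothetical parameter; choosing it to obtain a formal
    (epsilon, delta) guarantee is outside the scope of this sketch."""
    rng = random.Random(seed)
    noisy = [w + laplace(noise_scale, rng) for w in weights]
    return kruskal(n, edges, noisy)

def tree_weight(tree, weights):
    # Error is measured on the TRUE weights, not the noisy ones.
    return sum(weights[i] for i in tree)
```

A usage example: on a graph with `edges` and true `weights`, the additive error of a run is `tree_weight(perturbed_mst(n, edges, weights, s), weights) - tree_weight(kruskal(n, edges, weights), weights)`, which is never negative because the second term is the true MST weight. Only the solver call needs to change to use a faster MST algorithm.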

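Experimental Results

Research Questions

RQ1: Is there a differentially private MST algorithm that achieves both linear running time and the optimal error bound?
RQ2: Is the $\tilde{O}(n^{3/2})$ error bound for private MST asymptotically optimal under the $\ell_\infty$ neighboring relation?
RQ3: Can input perturbation, previously considered suboptimal, achieve optimal utility in private MST computation?
RQ4: What is the privacy-utility trade-off for private MST under approximate differential privacy, and can it be tightly characterized?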

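Key Findings

Under $(\varepsilon,\delta)$-differential privacy, the proposed input-perturbation framework achieves expected error $\tilde{O}\big(\tfrac{1}{\varepsilon}\, n^{3/2} \log n \sqrt{\log(1/\delta)}\big)$, matching the best error bound of existing in-place methods.
The method preserves the running time of whichever non-private MST algorithm it wraps, so it runs in expected linear time when paired with an expected linear-time MST algorithm.
The paper establishes an error lower bound of $\Omega\big(\tfrac{1}{\varepsilon}\, n^{3/2} \log n\big)$, showing that the upper bound is asymptotically optimal up to logarithmic factors.
The framework outperforms traditional input privatization (e.g., Sealfon 2016), whose error under the $\ell_\infty$ neighboring relation reaches $\tilde{O}(n^{2})$ and which does worst on dense graphs.
Experiments show that the method's output distribution closely matches that of PAMST, a state-of-the-art in-place method, supporting its practical utility.
By showing that the "best of both worlds" in efficiency and accuracy is attainable for private MST computation, this work resolves an open question.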
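As a rough worked comparison, take the bounds stated above at face value and ignore constants. The values $n = 10^{4}$ and $\delta = 10^{-6}$ and the use of the natural logarithm are illustrative assumptions, not figures from the paper. The gap between input privatization and the proposed method is a factor of $\sqrt{n}$, and the gap between the stated upper and lower bounds is a factor of $\sqrt{\log(1/\delta)}$:

$$
\frac{n^{2}}{n^{3/2}} = \sqrt{n}, \qquad n = 10^{4}: \quad \frac{10^{8}}{10^{6}} = 100
$$

$$
\frac{\tfrac{1}{\varepsilon}\, n^{3/2} \log n \sqrt{\log(1/\delta)}}{\tfrac{1}{\varepsilon}\, n^{3/2} \log n} = \sqrt{\log(1/\delta)}, \qquad \delta = 10^{-6}: \quad \sqrt{\ln 10^{6}} \approx 3.72
$$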

This review was generated by AI and reviewed by a human editor.