QUICK REVIEW

[Paper Review] Efficient Lipschitz Extensions for High-Dimensional Graph Statistics and Node Private Degree Distributions

Sofya Raskhodnikova, Adam Smith|arXiv (Cornell University)|Apr 29, 2015

Privacy-Preserving Technologies in Data|References: 22|Citations: 25

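One-Line Summary

This paper introduces efficiently computable Lipschitz extensions for vector-valued graph statistics, specifically the sorted degree list and the degree distribution, and uses them to build more accurate node-differentially private algorithms. It also proposes a generalized exponential mechanism and, for degree-distribution estimation on $\alpha$-decaying graphs, obtains an improved error bound of $ O\bigl(\bar{d}^{2\alpha/(\alpha+1)} / (\epsilon n)^{(\alpha-1)/(\alpha+1)}\bigr) $, which is substantially better than the bounds of prior work.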

ABSTRACT

Lipschitz extensions were recently proposed as a tool for designing node differentially private algorithms. However, efficiently computable Lipschitz extensions were known only for 1-dimensional functions (that is, functions that output a single real value). In this paper, we study efficiently computable Lipschitz extensions for multi-dimensional (that is, vector-valued) functions on graphs. We show that, unlike for 1-dimensional functions, Lipschitz extensions of higher-dimensional functions on graphs do not always exist, even with a non-unit stretch. We design Lipschitz extensions with small stretch for the sorted degree list and for the degree distribution of a graph. Crucially, our extensions are efficiently computable. We also develop new tools for employing Lipschitz extensions in the design of differentially private algorithms. Specifically, we generalize the exponential mechanism, a widely used tool in data privacy. The exponential mechanism is given a collection of score functions that map datasets to real values. It attempts to return the name of the function with nearly minimum value on the data set. Our generalized exponential mechanism provides better accuracy when the sensitivity of an optimal score function is much smaller than the maximum sensitivity of score functions. We use our Lipschitz extension and the generalized exponential mechanism to design a node-differentially private algorithm for releasing an approximation to the degree distribution of a graph. Our algorithm is much more accurate than algorithms from previous work.
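As a point of reference for the abstract's description, the standard (non-generalized) exponential mechanism can be sketched as follows. This is a minimal illustration, assuming every score function has the same sensitivity bound and that lower scores are better; the function name `exponential_mechanism` and its parameters are hypothetical and not taken from the paper.

```python
import math
import random


def exponential_mechanism(scores, sensitivity, epsilon, rng=None):
    """Return index i with probability proportional to
    exp(-epsilon * scores[i] / (2 * sensitivity)), so low scores are favored.

    scores      : score of each candidate, already evaluated on the data set
    sensitivity : assumed common bound on how much any score can change
                  between neighboring data sets
    """
    rng = rng or random.Random()
    # Shift by the minimum score for numerical stability; the sampling
    # distribution is unchanged because the shift cancels on normalization.
    best = min(scores)
    weights = [math.exp(-epsilon * (s - best) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]
```

Because a single sensitivity bound is used for all candidates, one high-sensitivity score function forces a large `sensitivity` value and flattens the sampling distribution for every candidate. That is the situation the generalized mechanism described in the abstract is meant to improve.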

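Motivation and Objectives

Design efficiently computable Lipschitz extensions for multi-dimensional graph functions, such as the sorted degree list and the degree distribution.
Close the gap in node-differentially private algorithms for high-dimensional graph statistics, particularly on sparse graphs.
Develop a generalized exponential mechanism that improves accuracy when the sensitivity of the optimal score function is small relative to the maximum sensitivity.
Achieve error bounds for degree-distribution estimation that are substantially better than those of prior node-differentially private methods.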

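Proposed Method

Design Lipschitz extensions with small stretch for the sorted degree list and the degree distribution, computable by polynomial-time algorithms.
Introduce a generalized exponential mechanism that selects among score functions of differing sensitivity and favors those with low sensitivity.
Choose the optimal degree threshold $ D $ with a threshold-selection algorithm based on smooth sensitivity and private estimation.
Combine the Lipschitz extension with the generalized exponential mechanism, adding Laplace noise scaled in proportion to the stretch and the threshold.
Normalize the output histogram into a probability distribution using a private estimate of the graph size $ n $.
Carry out the error analysis under the $\alpha$-decay assumption and derive a tight $\ell_1$ error bound.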
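The noise-addition and normalization steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: the Lipschitz extension itself is not implemented (its output is taken as the hypothetical input `extended_hist`), the sensitivity bound `l1_sensitivity` is supplied by the caller rather than derived from the stretch and $ D $, and the private size estimate `n_estimate` is assumed to be given.

```python
import numpy as np


def release_degree_distribution(extended_hist, l1_sensitivity, epsilon,
                                n_estimate, rng=None):
    """Add Laplace noise to a degree histogram and scale it to a distribution.

    extended_hist  : output of a Lipschitz extension of the degree histogram
                     (hypothetical input; computing it is not shown here)
    l1_sensitivity : assumed bound on its node-level L1 sensitivity, which the
                     review describes as growing with the stretch and D
    n_estimate     : privately estimated number of nodes (assumed given)
    """
    rng = rng or np.random.default_rng()
    hist = np.asarray(extended_hist, dtype=float)
    # Laplace mechanism: independent noise per coordinate with scale
    # l1_sensitivity / epsilon.
    noisy = hist + rng.laplace(loc=0.0, scale=l1_sensitivity / epsilon,
                               size=hist.shape)
    # Post-processing: divide by the private size estimate so the entries
    # approximate the fraction of nodes at each degree.
    return noisy / n_estimate
```

The returned vector only approximately sums to 1 and may contain small negative entries. Clipping or projecting it onto the probability simplex would be a further post-processing step that costs no additional privacy.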

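Results

Research Questions

RQ1: Do Lipschitz extensions of multi-dimensional graph functions always exist, even with non-unit stretch?
RQ2: Can efficiently computable Lipschitz extensions be constructed for the sorted degree list and the degree distribution of a graph?
RQ3: How does the generalized exponential mechanism improve accuracy when the sensitivities of the score functions differ substantially?
RQ4: What is the optimal trade-off between approximation error and privacy cost for degree-distribution estimation under node differential privacy?
RQ5: Can the proposed method achieve $ o(1) $ error in degree-distribution estimation on sparse graphs with $ \alpha > 1 $?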
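To make RQ3 concrete, here is one way a selection rule can favor low-sensitivity score functions. It is a hedged sketch, not a statement of the paper's exact mechanism: the offset `t`, the pairwise normalization, the failure parameter `beta`, and the function name are assumptions chosen for illustration. Lower scores are treated as better, and all sensitivities are assumed to be positive.

```python
import math
import random


def sensitivity_aware_select(scores, sensitivities, epsilon, beta=0.05, rng=None):
    """Pick a candidate index, penalizing candidates with high sensitivity."""
    rng = rng or random.Random()
    k = len(scores)
    # Offset t penalizes each candidate in proportion to its own sensitivity
    # (illustrative choice, not taken from the review text).
    t = 2.0 * math.log(k / beta) / epsilon
    penalized = [q + t * d for q, d in zip(scores, sensitivities)]
    # Normalized score: largest gap to any competitor, divided by the sum of
    # the two sensitivities, so each normalized score changes by at most 1
    # between neighboring data sets.
    normalized = [
        max((penalized[i] - penalized[j]) / (sensitivities[i] + sensitivities[j])
            for j in range(k))
        for i in range(k)
    ]
    # Standard exponential mechanism on the normalized scores (sensitivity 1).
    low = min(normalized)
    weights = [math.exp(-epsilon * (s - low) / 2.0) for s in normalized]
    return rng.choices(range(k), weights=weights, k=1)[0]
```

With two candidates of equal raw score but sensitivities 1 and 1000, this rule selects the low-sensitivity candidate most of the time. A standard exponential mechanism calibrated to the maximum sensitivity of 1000 would choose between them almost uniformly at moderate `epsilon`.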

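Key Findings

Unlike in the 1-dimensional case, Lipschitz extensions of multi-dimensional graph functions do not always exist, even with non-unit stretch.
Efficiently computable Lipschitz extensions with stretch bounded by a constant are constructed for the sorted degree list and the degree distribution.
The generalized exponential mechanism improves accuracy by favoring score functions with low sensitivity; the gain is largest when the sensitivity of the optimal function is much smaller than the maximum sensitivity.
On $\alpha$-decaying graphs, the method achieves $ \mathbb{E}\|\hat{p} - p_G\|_1 = O\bigl(\bar{d}^{2\alpha/(\alpha+1)} / (\epsilon n)^{(\alpha-1)/(\alpha+1)}\bigr) $.
When $ \bar{d}^{2\alpha/(\alpha-1)} = o(\epsilon n) $, the error bound is $ o(1) $ as $ n \to \infty $, so the estimate is asymptotically consistent.
The method substantially outperforms prior node-differentially private methods, particularly on sparse graphs with power-law-like degree distributions.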
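The consistency condition follows from the error bound by a short rearrangement, assuming $ \alpha > 1 $ so that the exponent $ (\alpha-1)/(\alpha+1) $ is positive:

$$
\frac{\bar{d}^{\,2\alpha/(\alpha+1)}}{(\epsilon n)^{(\alpha-1)/(\alpha+1)}}
= \left( \frac{\bar{d}^{\,2\alpha/(\alpha-1)}}{\epsilon n} \right)^{(\alpha-1)/(\alpha+1)}
\longrightarrow 0
\quad\Longleftrightarrow\quad
\bar{d}^{\,2\alpha/(\alpha-1)} = o(\epsilon n).
$$

For example, with $ \alpha = 2 $ the bound reads $ O\bigl(\bar{d}^{4/3} / (\epsilon n)^{1/3}\bigr) $, which is $ o(1) $ exactly when $ \bar{d}^{4} = o(\epsilon n) $.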

This review was generated by AI and checked by a human editor.