QUICK REVIEW

[Paper Review] Towards Measuring Membership Privacy

Yunhui Long, Vincent Bindschaedler|arXiv (Cornell University)|Dec 25, 2017

Privacy-Preserving Technologies in Data | References: 30 | Citations: 63

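One-Sentence Summary

This paper introduces Differential Training Privacy (DTP), an empirical metric that quantifies a classifier's membership inference risk when differential privacy cannot be applied, and proposes PDTP as a computationally efficient lower bound on it. It shows that DTP and PDTP predict the success of membership attacks and proposes DTP-1 as a publication guideline.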

ABSTRACT

Machine learning models are increasingly made available to the masses through public query interfaces. Recent academic work has demonstrated that malicious users who can query such models are able to infer sensitive information about records within the training data. Differential privacy can thwart such attacks, but not all models can be readily trained to achieve this guarantee or to achieve it with acceptable utility loss. As a result, if a model is trained without differential privacy guarantee, little is known or can be said about the privacy risk of releasing it. In this work, we investigate and analyze membership attacks to understand why and how they succeed. Based on this understanding, we propose Differential Training Privacy (DTP), an empirical metric to estimate the privacy risk of publishing a classifier when methods such as differential privacy cannot be applied. DTP is a measure of a classifier with respect to its training dataset, and we show that calculating DTP is efficient in many practical cases. We empirically validate DTP using state-of-the-art machine learning models such as neural networks trained on real-world datasets. Our results show that DTP is highly predictive of the success of membership attacks and therefore reducing DTP also reduces the privacy risk. We advocate for DTP to be used as part of the decision-making process when considering publishing a classifier. To this end, we also suggest adopting the DTP-1 hypothesis: if a classifier has a DTP value above 1, it should not be published.

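Motivation and Objectives

Motivate and quantify the privacy risk of classifiers exposed through public query interfaces without a differential privacy guarantee.
Develop an empirical privacy metric (DTP), specific to a classifier and its training dataset, that measures membership leakage.
Introduce PDTP as a computationally efficient proxy for DTP and connect it to direct membership attacks.
Validate DTP/PDTP on real datasets and common models to guide publication decisions for MLaaS.
Propose the DTP-1 hypothesis as a practical publication threshold.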

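Proposed Method

Define and formalize DTP as a bound on how much predictions change when a single record is removed from the training data.
Propose PDTP, a computationally efficient proxy for DTP computed through leave-one-out evaluation.
Construct a general membership attack framework (untargeted, distance-based, and frequency-based attacks) as well as shadow-model-based attacks to evaluate privacy.
Evaluate on real datasets (UCI Adult and NN-Purchase) and several models (NN, NB, LR), using prediction binning to stabilize the measurements.
Analyze training stability and provide theoretical insight into when direct attacks dominate indirect attacks.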
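
The leave-one-out idea behind PDTP can be illustrated with a minimal sketch. It rests on assumptions not stated in this review: PDTP is taken here to be the largest absolute log-ratio between the class probabilities predicted for a training record by the model trained on the full set and by the model trained without that record. The Bernoulli Naive Bayes stand-in classifier, the function names `train_naive_bayes`, `predict_log_proba`, and `pointwise_dtp`, and the toy data are all hypothetical and chosen only to keep the example self-contained. The paper's exact definition and estimation procedure (including prediction binning) may differ.

```python
import numpy as np


def train_naive_bayes(X, y, n_classes, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing.

    Hypothetical stand-in classifier; any model that outputs class
    probabilities could be substituted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_features = X.shape[1]
    log_prior = np.empty(n_classes)
    log_theta = np.empty((n_classes, n_features))      # log P(x_j = 1 | c)
    log_one_minus = np.empty((n_classes, n_features))  # log P(x_j = 0 | c)
    for c in range(n_classes):
        Xc = X[y == c]
        log_prior[c] = np.log((len(Xc) + alpha) / (len(X) + alpha * n_classes))
        theta = (Xc.sum(axis=0) + alpha) / (len(Xc) + 2.0 * alpha)
        log_theta[c] = np.log(theta)
        log_one_minus[c] = np.log1p(-theta)
    return log_prior, log_theta, log_one_minus


def predict_log_proba(model, x):
    """Return the log posterior over classes for one binary feature vector."""
    log_prior, log_theta, log_one_minus = model
    x = np.asarray(x, dtype=float)
    joint = log_prior + log_theta @ x + log_one_minus @ (1.0 - x)
    return joint - np.logaddexp.reduce(joint)  # normalize in log space


def pointwise_dtp(X, y, n_classes):
    """Assumed PDTP estimate: worst-case |log p_full(c|t) - log p_loo(c|t)|
    over every training record t and every class c."""
    X = np.asarray(X)
    y = np.asarray(y)
    full_model = train_naive_bayes(X, y, n_classes)
    worst = 0.0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i  # leave record i out
        loo_model = train_naive_bayes(X[keep], y[keep], n_classes)
        gap = np.abs(predict_log_proba(full_model, X[i])
                     - predict_log_proba(loo_model, X[i]))
        worst = max(worst, float(gap.max()))
    return worst


if __name__ == "__main__":
    # Toy data, purely illustrative.
    X_demo = np.array([[1, 0, 1], [1, 1, 0], [0, 0, 1],
                       [0, 1, 1], [1, 1, 1], [0, 0, 0]])
    y_demo = np.array([0, 0, 1, 1, 0, 1])
    print("assumed PDTP estimate:", pointwise_dtp(X_demo, y_demo, n_classes=2))
```

The loop retrains the model once per training record, so this naive version costs as many training runs as there are records. It is practical only for models that are cheap to retrain or that can be updated incrementally.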

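Experimental Results

Research Questions

RQ1: Can the membership privacy risk of a classifier be quantified with an empirical, non-DP metric such as DTP?
RQ2: Is PDTP a reliable and efficient lower bound on DTP that correlates with the success of actual membership attacks?
RQ3: Does the DTP-1 threshold (do not publish if DTP > 1) hold across datasets and models?
RQ4: How do classifier overfitting and training stability affect vulnerability to membership inference attacks?
RQ5: What is the relationship between attack type (untargeted, distance-based, frequency-based) and the PDTP/DTP metrics?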

Key Findings

Membership Attack	Attack Accuracy	Attack Precision	Attack Recall	Attack F1 Score	Correlation with PDTP	p-value
Untargeted Attack	0.6680	0.6386	0.8500	0.7294	0.4864	2.89×10⁻⁷
Frequency-Based Attack	0.6257	0.5933	0.8253	0.7174	0.5052	8.29×10⁻⁸
Distance-Based Attack	0.8533	0.8470	0.9087	0.8768	0.7653	1.85×10⁻²⁰

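DTP/PDTP values correlate with membership attack success across the experiments (e.g., on NN-Purchase, r = 0.7653 for the distance-based attack). The correlation with PDTP is positive and statistically significant for all three attacks.
When DTP was below 0.5, attacks could not infer membership with accuracy above the baseline. When DTP exceeds 4, attacks frequently achieve accuracy above 90%.
PDTP provides a lower bound on DTP and, through leave-one-out evaluation, serves as an efficient indicator of membership privacy risk.
The three direct attacks analyzed vary in performance: the distance-based attack achieves the highest accuracy (e.g., 0.8533) and the strongest correlation with PDTP.
The study supports the DTP-1 hypothesis as a practical guideline: a classifier with DTP above 1 should not be published.
Training stability is identified as an important factor: Naive Bayes, random decision trees, and linear statistical queries satisfy training stability, whereas k-NN does not.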
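
The thresholds reported above can be collected into a small decision helper. This is only a sketch of how the DTP-1 guideline might be applied in a release checklist. The function name `release_guidance`, the `ReleaseGuidance` type, and the wording of the notes are hypothetical, and the handling of values between 0.5 and 1 is an assumption, since the findings do not characterize that range.

```python
import math
from typing import NamedTuple


class ReleaseGuidance(NamedTuple):
    publish: bool  # False when the DTP-1 hypothesis advises against release
    note: str      # which reported range the value falls into


def release_guidance(dtp: float) -> ReleaseGuidance:
    """Map a measured DTP value to the ranges reported in the findings.

    Hypothetical helper: thresholds 0.5, 1 and 4 come from the review,
    everything else is illustrative.
    """
    if math.isnan(dtp) or dtp < 0:
        raise ValueError("DTP is expected to be a non-negative number")
    if dtp > 1.0:
        note = "above the DTP-1 threshold"
        if dtp > 4.0:
            note += "; attacks frequently exceeded 90% accuracy in this range"
        return ReleaseGuidance(False, note)
    if dtp < 0.5:
        return ReleaseGuidance(True, "attacks did not beat the baseline in this range")
    # Assumption: the 0.5 to 1 range is not described, so only DTP-1 applies.
    return ReleaseGuidance(True, "not above the DTP-1 threshold; range not characterized")


if __name__ == "__main__":
    for value in (0.3, 0.8, 1.5, 5.0):
        print(value, release_guidance(value))
```

A value exactly equal to 1 is treated as publishable here because the hypothesis is stated for values above 1. A stricter policy could use a non-strict comparison instead.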

This review was generated by AI and checked by a human editor.