QUICK REVIEW

[Paper Review] ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

Yugeng Liu, Rui Wen|arXiv (Cornell University)|Feb 4, 2021

Adversarial Robustness in Machine Learning|References: 67|Citations: 46

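One-Line Summary

ML-Doctor provides a holistic, modular framework that evaluates the privacy risks of membership inference, model inversion, attribute inference, and model stealing across multiple architectures and datasets, and covers defenses such as DP-SGD and Knowledge Distillation.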

ABSTRACT

Inference attacks against Machine Learning (ML) models allow adversaries to learn sensitive information about training data, model parameters, etc. While researchers have studied, in depth, several kinds of attacks, they have done so in isolation. As a result, we lack a comprehensive picture of the risks caused by the attacks, e.g., the different scenarios they can be applied to, the common factors that influence their performance, the relationship among them, or the effectiveness of possible defenses. In this paper, we fill this gap by presenting a first-of-its-kind holistic risk assessment of different inference attacks against machine learning models. We concentrate on four attacks -- namely, membership inference, model inversion, attribute inference, and model stealing -- and establish a threat model taxonomy. Our extensive experimental evaluation, run on five model architectures and four image datasets, shows that the complexity of the training dataset plays an important role with respect to the attack's performance, while the effectiveness of model stealing and membership inference attacks are negatively correlated. We also show that defenses like DP-SGD and Knowledge Distillation can only mitigate some of the inference attacks. Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models, and equally serves as a benchmark tool for researchers and practitioners.

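Motivation and Objectives

Provide a comprehensive taxonomy of threat models for inference attacks against ML models.
Quantify how dataset complexity and model overfitting affect attack performance.
Explore the relationships among different inference attacks and defenses across architectures and datasets.
Provide a modular, reusable framework (ML-Doctor) that lets researchers and model owners benchmark attacks and defenses.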

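Proposed Method

Define a two-dimensional threat model taxonomy (model access: white-box/black-box; auxiliary data: partial/shadow/none).
Formalize four inference attacks (membership inference, model inversion, attribute inference, and model stealing) under the various threat models.
Run an extensive empirical evaluation on five model architectures and four image datasets, analyzing attack performance and defense effectiveness.
Implement ML-Doctor as a modular framework with data processing, attack, defense, and evaluation modules.
Train the attack models for membership inference and related attacks using shadow models and auxiliary data.
Evaluate defenses such as DP-SGD and Knowledge Distillation across the attacks to establish their coverage and limits.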
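
The two-dimensional taxonomy described above can be sketched as a small data structure. This is only an illustration of the idea, not ML-Doctor's actual code: the class names `ModelAccess`, `AuxiliaryData`, and `ThreatModel`, and the helper `all_threat_models`, are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class ModelAccess(Enum):
    """First axis: how much of the target model the adversary can see."""
    WHITE_BOX = "white-box"
    BLACK_BOX = "black-box"


class AuxiliaryData(Enum):
    """Second axis: what auxiliary data the adversary holds."""
    PARTIAL = "partial"
    SHADOW = "shadow"
    NONE = "none"


@dataclass(frozen=True)
class ThreatModel:
    """One cell of the access x auxiliary-data grid (hypothetical type)."""
    access: ModelAccess
    aux_data: AuxiliaryData

    def label(self) -> str:
        return f"{self.access.value} / {self.aux_data.value}"


def all_threat_models():
    """Enumerate every combination of the two axes (2 x 3 = 6 cells)."""
    return [ThreatModel(a, d) for a, d in product(ModelAccess, AuxiliaryData)]


if __name__ == "__main__":
    for tm in all_threat_models():
        print(tm.label())
```

Enumerating the full grid is a convenience for illustration; not every attack necessarily applies to every cell, and the paper's own mapping of attacks to threat models should be taken as the reference.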

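Experimental Results

Research Questions

RQ1: How does dataset complexity affect the different attacks?
RQ2: How does overfitting affect the different attacks?
RQ3: What are the relationships among the different attacks?

Key Findings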

CelebA	FMNIST	STL10	UTKFace
1.000 / 0.680	1.000 / 0.884	1.000 / 0.522	1.000 / 0.792
1.000 / 0.742	1.000 / 0.909	1.000 / 0.524	1.000 / 0.852
1.000 / 0.734	1.000 / 0.905	1.000 / 0.587	1.000 / 0.834
1.000 / 0.735	1.000 / 0.916	1.000 / 0.574	1.000 / 0.846
1.000 / 0.707	1.000 / 0.903	1.000 / 0.517	1.000 / 0.818
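
Assuming each cell in the table above reads as training accuracy / test accuracy (an assumption; the table does not label the two values or its rows), the train-test gap that is commonly used as a proxy for overfitting can be computed per dataset. The sketch below copies the table values verbatim; the function names `parse_cell` and `mean_gap_per_dataset` are hypothetical.

```python
DATASETS = ["CelebA", "FMNIST", "STL10", "UTKFace"]

# Rows copied verbatim from the table; row labels are not given in the source.
ROWS = [
    "1.000 / 0.680\t1.000 / 0.884\t1.000 / 0.522\t1.000 / 0.792",
    "1.000 / 0.742\t1.000 / 0.909\t1.000 / 0.524\t1.000 / 0.852",
    "1.000 / 0.734\t1.000 / 0.905\t1.000 / 0.587\t1.000 / 0.834",
    "1.000 / 0.735\t1.000 / 0.916\t1.000 / 0.574\t1.000 / 0.846",
    "1.000 / 0.707\t1.000 / 0.903\t1.000 / 0.517\t1.000 / 0.818",
]


def parse_cell(cell):
    """Split 'a / b' into two floats (ASSUMED: train accuracy, test accuracy)."""
    first, second = (float(part) for part in cell.split("/"))
    return first, second


def mean_gap_per_dataset(rows, datasets):
    """Average (first - second) over all rows, separately for each dataset column."""
    gaps = {name: [] for name in datasets}
    for row in rows:
        for name, cell in zip(datasets, row.split("\t")):
            first, second = parse_cell(cell)
            gaps[name].append(first - second)
    return {name: sum(values) / len(values) for name, values in gaps.items()}


if __name__ == "__main__":
    for name, gap in mean_gap_per_dataset(ROWS, DATASETS).items():
        print(f"{name}: mean gap = {gap:.4f}")
```

Under that reading, STL10 has the largest mean gap and FMNIST the smallest.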

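Dataset complexity strongly affects membership inference, model inversion, and model stealing. Membership inference performs better on more complex datasets, whereas model stealing often shows the opposite trend.
The success of membership inference and the success of model stealing are negatively correlated (r = -0.821), an effect driven by overfitting.
White-box access yields stronger attack performance than black-box access across attacks.
DP-SGD can mitigate membership inference, with a limited impact on model utility. Knowledge Distillation helps, but is less effective against some attacks.
Partial auxiliary data does not substantially improve the performance of membership inference, attribute inference, or model stealing across the evaluated settings.
Because of overfitting dynamics, model stealing achieves higher agreement on simple datasets (e.g., FMNIST) than on complex ones (e.g., STL10).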
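
Two of the quantities mentioned above can be made concrete with a short sketch. It assumes that "agreement" means the fraction of inputs on which the stolen model and the target model predict the same label, and that r is a Pearson correlation coefficient; both readings are assumptions, and the function names are hypothetical. The inputs in the usage lines are toy values, not results from the paper.

```python
from math import sqrt


def agreement(target_preds, stolen_preds):
    """Fraction of samples where both models output the same label (assumed definition)."""
    if not target_preds or len(target_preds) != len(stolen_preds):
        raise ValueError("prediction lists must be non-empty and of equal length")
    matches = sum(t == s for t, s in zip(target_preds, stolen_preds))
    return matches / len(target_preds)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy labels only: three of the four predictions match.
    print(agreement([0, 1, 2, 2], [0, 1, 1, 2]))
    # Toy scores only: a perfectly decreasing relationship.
    print(pearson_r([1, 2, 3], [3, 2, 1]))
```

In an evaluation like the one summarized here, one would collect a membership inference score and a model stealing agreement score per (architecture, dataset) pair and pass the two lists to `pearson_r`.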

This review was generated by AI and checked by a human editor.