QUICK REVIEW

[Paper Review] Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

Ahmed Salem, Apratim Bhattacharyya | arXiv (Cornell University) | Apr 1, 2019

Adversarial Robustness in Machine Learning | Cited by 89

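One-line summary

This paper shows that the difference between a black-box model's outputs before and after an update can leak information about the update data, and introduces four encoder-decoder attacks that infer or reconstruct the updating set.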

ABSTRACT

Machine learning (ML) has progressed rapidly during the past decade and the major factor that drives such development is the unprecedented large-scale data. As data generation is a continuous process, this leads to ML model owners updating their models frequently with newly-collected data in an online learning scenario. In consequence, if an ML model is queried with the same set of data samples at two different points in time, it will provide different results. In this paper, we investigate whether the change in the output of a black-box ML model before and after being updated can leak information of the dataset used to perform the update, namely the updating set. This constitutes a new attack surface against black-box ML models and such information leakage may compromise the intellectual property and data privacy of the ML model owner. We propose four attacks following an encoder-decoder formulation, which allows inferring diverse information of the updating set. Our new attacks are facilitated by state-of-the-art deep learning techniques. In particular, we propose a hybrid generative model (CBM-GAN) that is based on generative adversarial networks (GANs) but includes a reconstructive loss that allows reconstructing accurate samples. Our experiments show that the proposed attacks achieve strong performance.

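Motivation and objectives

Motivate and formalize the risk of updating-set leakage in online learning under black-box access.
Propose four attacks that infer properties of the updating data, or reconstruct it, from the posterior difference.
Develop an encoder-decoder architecture that uses the posterior difference to recover different kinds of updating-set information.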

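Proposed method

Formulate a general encoder-decoder attack pipeline that takes the posterior difference as input.
Use a shadow-model approach to generate ground-truth data for training the attack models.
The single-sample attacks are label inference and sample reconstruction.
The multi-sample attacks are label distribution estimation and updating-set reconstruction.
Introduce CBM-GAN, a conditional best-of-many GAN, to reconstruct multiple updating samples.
Evaluate the attacks on MNIST, CIFAR-10, and Insta-NY using a probing set of 100 samples.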
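
The posterior difference that feeds the attack pipeline can be illustrated with a minimal sketch. It assumes the target model is reachable only through a query function that returns a posterior vector, and that the per-sample differences are concatenated over the probing set. The names `posterior_difference`, `toy_before`, and `toy_after`, as well as the sign convention (after minus before), are assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, List, Sequence

Posterior = List[float]
# Hypothetical black-box interface: one input sample in, one posterior vector out.
Model = Callable[[Sequence[float]], Posterior]


def posterior_difference(
    model_before: Model,
    model_after: Model,
    probing_set: Sequence[Sequence[float]],
) -> List[float]:
    """Concatenate (after - before) posteriors over all probing samples."""
    diff: List[float] = []
    for x in probing_set:
        p_before = model_before(x)
        p_after = model_after(x)
        # Assumed sign convention: updated output minus original output.
        diff.extend(a - b for a, b in zip(p_after, p_before))
    return diff


# Toy stand-ins for the target model before and after an update (illustrative only).
def toy_before(x: Sequence[float]) -> Posterior:
    return [0.5, 0.5]


def toy_after(x: Sequence[float]) -> Posterior:
    return [0.25, 0.75]


probes = [[0.0], [1.0], [2.0]]
delta = posterior_difference(toy_before, toy_after, probes)
print(len(delta))  # number of probing samples times number of classes
```

In this sketch the resulting vector `delta` is what an encoder would consume. With a probing set of 100 samples and a 10-class model it would have 1,000 entries.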

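Experimental results

Research questions

RQ1: Can the difference in the target model's outputs before and after an update leak information about the updating set?
RQ2: How well can a black-box adversary using an encoder-decoder construction infer the labels of the updating set or reconstruct its data?
RQ3: How much leakage occurs for single-sample versus multi-sample updating sets?
RQ4: Do shadow models enable realistic training of the attack models under black-box constraints?
RQ5: How well can advanced generative models reconstruct the updating set from the posterior difference?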

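Key findings

The single-sample label inference attack achieves an accuracy of 0.97 on Insta-NY, 0.96 on CIFAR-10, and 0.68 on MNIST.
The single-sample reconstruction attack outperforms the random baseline and approaches autoencoder performance on MNIST and CIFAR-10.
The multi-sample label distribution estimation attack achieves lower KL divergence and higher accuracy than the random baseline on all datasets.
CBM-GAN generates multiple samples of the updating set conditioned on the posterior difference and outperforms the baselines on MNIST, CIFAR-10, and Insta-NY.
The attacks remain effective with shadow-model training and probing of 100-sample updating sets, and several relaxations related to attack transferability are also examined.
Overall, the framework shows that differences in model outputs can leak substantial information about the updating set.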
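
The KL divergence used to score the label distribution estimation attack can be sketched as follows. This is a minimal illustration, assuming the metric compares the true label distribution of the updating set against the attack's estimate in the direction KL(true || estimated). The direction, the helper names `label_distribution` and `kl_divergence`, and the toy labels are assumptions, not details confirmed by the review.

```python
import math
from collections import Counter
from typing import List, Sequence


def label_distribution(labels: Sequence[int], num_classes: int) -> List[float]:
    """Empirical label distribution of an updating set."""
    counts = Counter(labels)
    n = len(labels)
    return [counts.get(c, 0) / n for c in range(num_classes)]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) in nats; terms with p_i = 0 contribute 0, and q_i is floored at eps."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Toy updating set with three classes (illustrative labels only).
true_dist = label_distribution([0, 0, 1, 2], num_classes=3)
uniform_guess = [1 / 3] * 3  # a simple uninformed baseline

kl_perfect = kl_divergence(true_dist, true_dist)
kl_uniform = kl_divergence(true_dist, uniform_guess)
print(kl_perfect, kl_uniform)
```

A lower value means the estimated distribution is closer to the true one, which is the sense in which the attack is said to improve on a random baseline.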


This review was generated by AI and checked by a human editor.