QUICK REVIEW

[Paper Review] Stealing Machine Learning Models via Prediction APIs

Florian Tramèr, Fan Zhang|arXiv (Cornell University)|Sep 9, 2016

Adversarial Robustness in Machine Learning | Cited by 733

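One-Sentence Summary

This paper demonstrates practical model extraction attacks against ML models served through prediction APIs, shows that outputs carrying confidence values, combined with queries on partial inputs, allow near-perfect recovery of target models (including logistic regression, neural networks, and decision trees), and discusses countermeasures.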

ABSTRACT

Machine learning (ML) models may be deemed confidential due to their sensitive training data, commercial value, or use in security applications. Increasingly often, confidential ML models are being deployed with publicly accessible query interfaces. ML-as-a-service ("predictive analytics") systems are an example: Some allow users to train models on potentially sensitive data and charge others for access on a pay-per-query basis. The tension between model confidentiality and public access motivates our investigation of model extraction attacks. In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model. Unlike in classical learning theory settings, ML-as-a-service offerings may accept partial feature vectors as inputs and include confidence values with predictions. Given these practices, we show simple, efficient attacks that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees. We demonstrate these attacks against the online services of BigML and Amazon Machine Learning. We further show that the natural countermeasure of omitting confidence values from model outputs still admits potentially harmful model extraction attacks. Our results highlight the need for careful ML model deployment and new model extraction countermeasures.

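Motivation and Objectives

Motivate and formalize the risk to confidential ML models exposed through prediction APIs in the MLaaS setting.
Demonstrate practical extraction attacks across common model classes (logistic regression, neural networks, decision trees).
Quantify attack efficiency on real services and identify countermeasures such as restricting outputs to class labels only.
Highlight the privacy and security implications of model extraction with respect to training data and evasion.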

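Proposed Method

Define a black-box model extraction framework that uses test and uniform error metrics to quantify how close the extracted model is to the target.
Present an equation-solving attack that recovers logistic model parameters from confidence-valued outputs using non-adaptive, batched queries.
Develop a path-finding attack that reconstructs decision trees by using confidence values as identifiers.
Evaluate the attacks against ML services (Amazon and BigML) using real prediction APIs and public data sets.
Extend the attacks to multiclass logistic regression, neural networks, and kernel logistic regression to demonstrate training-data leakage and model reconstruction capability.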
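
The equation-solving idea for binary logistic regression can be illustrated with a minimal sketch. It assumes the API returns the exact probability P(y=1 | x) for a model with d weights and one bias. The names `make_target` and `extract_logistic_regression` are hypothetical, and the "API" here is a local stand-in, not a real service. Because the logit of the returned confidence is linear in the parameters, d + 1 queries give a linear system that can be solved directly.

```python
import numpy as np

def make_target(w, b):
    """Hypothetical stand-in for a prediction API that returns P(y=1 | x)."""
    def predict_proba(X):
        return 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return predict_proba

def extract_logistic_regression(predict_proba, d, rng):
    # d weights + 1 bias = d + 1 unknowns, so send d + 1 random queries
    # in a single non-adaptive batch.
    X = rng.normal(size=(d + 1, d))
    p = predict_proba(X)
    # Invert the sigmoid: log(p / (1 - p)) = w . x + b, linear in (w, b).
    z = np.log(p / (1.0 - p))
    # Append a column of ones so the bias is solved for alongside the weights.
    A = np.hstack([X, np.ones((d + 1, 1))])
    theta = np.linalg.solve(A, z)
    return theta[:-1], theta[-1]

rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)   # secret parameters of the simulated target
b_true = 0.3
api = make_target(w_true, b_true)

w_hat, b_hat = extract_logistic_regression(api, d, rng)
print("max weight error:", np.max(np.abs(w_hat - w_true)))
print("bias error:", abs(b_hat - b_true))
```

If a service rounds its confidence values, the system would no longer be exact; a least-squares fit over more than d + 1 queries is one possible adjustment, though that variant is not shown here.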

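Experimental Results

Research Questions

RQ1: When an ML prediction API that offers only black-box access returns predictions and confidence scores, can an equivalent or exact model be recovered?
RQ2: Do confidence values and partial queries enable effective model extraction for common model classes such as LR, SVMs, neural networks, and decision trees?
RQ3: What are the practical implications and limitations for current MLaaS providers (Amazon and BigML)?
RQ4: How vulnerable do countermeasures such as outputting class labels only remain, and what additional protections are needed?
RQ5: Can model extraction leak information about training data, and in which regimes (e.g., kernel logistic regression) is this leakage pronounced?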

Key Findings

Service	Model Type	Data set	Queries	Time (s)
Amazon	Logistic Regression	Digits	650	70
Amazon	Logistic Regression	Adult	1,485	149
BigML	Decision Tree	German Credit	1,150	631
BigML	Decision Tree	Steak Survey	4,013	2,088
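
As a quick reading aid, the table's figures can be turned into an average query rate per attack. The short calculation below uses only the numbers in the table and assumes the Time column covers the whole attack; under that assumption the Amazon attacks come to roughly 9 to 10 queries per second and the BigML attacks to just under 2.

```python
# (service, data set) -> (queries, time in seconds), copied from the table above.
results = {
    ("Amazon", "Digits"): (650, 70),
    ("Amazon", "Adult"): (1485, 149),
    ("BigML", "German Credit"): (1150, 631),
    ("BigML", "Steak Survey"): (4013, 2088),
}

# Average queries per second for each attack.
rates = {key: queries / seconds for key, (queries, seconds) in results.items()}

for (service, dataset), rate in rates.items():
    print(f"{service:6s} {dataset:14s} {rate:5.2f} queries/s")
```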

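Equation-solving attacks recover the parameters of binary and multiclass logistic regression and of neural networks using non-adaptive, batched queries.
For multiclass LR and MLPs, the number of queries required is roughly equal to the number of unknown parameters (k), and extraction is near-perfect (R_test and R_unif are close to 0).
Treating confidence values as quasi-identifiers makes it possible to discover the structure of decision trees, enabling practical exact learning for some targets.
The experiments, summarized in the table above, show fast extraction against real services: Amazon Logistic Regression on Digits took 650 queries and 70 seconds; Amazon Logistic Regression on Adult took 1,485 queries and 149 seconds; BigML Decision Tree on German Credit took 1,150 queries and 631 seconds; BigML Decision Tree on Steak Survey took 4,013 queries and 2,088 seconds.
Even when confidence outputs are omitted, adaptive attacks can achieve >99% accuracy over the input space for a variety of models, although in some cases they need more queries (up to roughly 100x).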
Kernel logistic regression can leak training data through the recovered representers.
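
The R_test and R_unif figures mentioned above measure how often the extracted model disagrees with the target: R_test averages the disagreement over a test set, and R_unif averages it over points drawn uniformly from the feature space. The sketch below assumes a 0-1 distance on predicted labels. The two linear classifiers, the perturbation size, and the sampling ranges are hypothetical stand-ins chosen only to make the example runnable.

```python
import numpy as np

def extraction_error(f, f_hat, X):
    """Fraction of points in X on which the two models predict different labels."""
    return float(np.mean(f(X) != f_hat(X)))

rng = np.random.default_rng(1)
d = 4
w = rng.normal(size=d)
w_hat = w + 0.05 * rng.normal(size=d)   # hypothetical, slightly imperfect copy

def target(X):
    return (X @ w > 0).astype(int)

def extracted(X):
    return (X @ w_hat > 0).astype(int)

# Stand-in for a held-out test set D (here: Gaussian samples).
X_test = rng.normal(size=(2000, d))
# Points drawn uniformly from an assumed feature range [-1, 1]^d.
X_unif = rng.uniform(-1.0, 1.0, size=(2000, d))

r_test = extraction_error(target, extracted, X_test)
r_unif = extraction_error(target, extracted, X_unif)
print("R_test:", r_test, "R_unif:", r_unif)
```

A value of 0 on both metrics means the extracted model agreed with the target on every sampled point, which is the sense in which the findings above call extraction near-perfect.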


This review was generated by AI and checked by a human editor.