QUICK REVIEW

[Paper Review] Hardware Trojan Attacks on Neural Networks

Joseph Clements, Yingjie Lao|arXiv (Cornell University)|Jun 14, 2018

Adversarial Robustness in Machine Learning | References: 31 | Citations: 64

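One-Sentence Summary

This paper introduces hardware Trojan attacks on neural networks, outlines a framework for inserting malicious Trojans into neural network hardware, and demonstrates covert targeted misclassification on MNIST using Trojans that affect, on average, 0.03% of the neurons in the 5th hidden layer of 7-layer CNNs.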

ABSTRACT

With the rising popularity of machine learning and the ever increasing demand for computational power, there is a growing need for hardware optimized implementations of neural networks and other machine learning models. As the technology evolves, it is also plausible that machine learning or artificial intelligence will soon become consumer electronic products and military equipment, in the form of well-trained models. Unfortunately, the modern fabless business model of manufacturing hardware, while economic, leads to deficiencies in security through the supply chain. In this paper, we illuminate these security issues by introducing hardware Trojan attacks on neural networks, expanding the current taxonomy of neural network security to incorporate attacks of this nature. To aid in this, we develop a novel framework for inserting malicious hardware Trojans in the implementation of a neural network classifier. We evaluate the capabilities of the adversary in this setting by implementing the attack algorithm on convolutional neural networks while controlling a variety of parameters available to the adversary. Our experimental results show that the proposed algorithm could effectively classify a selected input trigger as a specified class on the MNIST dataset by injecting hardware Trojans into $0.03\%$, on average, of neurons in the 5th hidden layer of arbitrary 7-layer convolutional neural networks, while undetectable under the test data. Finally, we discuss the potential defenses to protect neural networks against hardware Trojan attacks.

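Motivation and Objectives

Motivate and formalize the security problem posed by hardware Trojans in neural network implementations.
Develop a framework for inserting malicious hardware Trojans into a neural network classifier.
Evaluate the adversary's capabilities by implementing the attack on convolutional neural networks.
Quantify how few neurons a Trojan must modify to induce a targeted classification on MNIST.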

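Proposed Method

Build a novel framework for inserting malicious hardware Trojans into a neural network classifier.
Implement the attack algorithm on convolutional neural networks while varying the parameters the adversary controls.
Demonstrate that a selected input trigger is classified as a specified class on MNIST.
Show that the Trojans need to modify, on average, only 0.03% of the neurons in the 5th hidden layer of a 7-layer CNN.
Assess whether the Trojan is detectable under the test data, and discuss possible defenses.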
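
To make the "few modified neurons" idea concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: it assumes a toy setting in which the layers after the attacked hidden layer reduce to a single linear layer, and it uses a simple greedy search. The names `select_trojan_neurons`, `margin`, and `h_max` are hypothetical and chosen only for illustration.

```python
def logits(weights, bias, hidden):
    """Final linear layer: one logit per class."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]


def margin(weights, bias, hidden, target):
    """Target logit minus the best competing logit (> 0 means target wins)."""
    z = logits(weights, bias, hidden)
    return z[target] - max(v for c, v in enumerate(z) if c != target)


def select_trojan_neurons(weights, bias, hidden, target, h_max=1.0):
    """Greedily pick hidden neurons to overwrite until `target` wins.

    Hypothetical helper (an assumption, not the paper's method): each chosen
    neuron is forced to 0.0 or h_max, whichever raises the margin most.
    Returns ({index: forced value}, patched activations).
    """
    patched = list(hidden)
    forced = {}
    current = margin(weights, bias, patched, target)
    while current <= 0 and len(forced) < len(patched):
        best = None
        for j in range(len(patched)):
            if j in forced:
                continue
            for value in (0.0, h_max):
                trial = patched[:j] + [value] + patched[j + 1:]
                m = margin(weights, bias, trial, target)
                if best is None or m > best[0]:
                    best = (m, j, value)
        if best[0] <= current:
            break  # no single-neuron change improves the margin any more
        current, j, value = best
        patched[j] = value
        forced[j] = value
    return forced, patched


# Toy example: 2 classes, 3 hidden neurons; the trigger initially looks like class 0.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
b = [0.0, 0.0]
trigger_hidden = [0.9, 0.1, 0.5]
forced, patched = select_trojan_neurons(W, b, trigger_hidden, target=1)
fraction = len(forced) / len(trigger_hidden)
print(forced, f"{fraction:.2%} of neurons modified")
```

In this toy case a single overwritten neuron flips the decision. The fraction printed here reflects only the three-neuron example and says nothing about the 0.03% figure reported in the paper.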

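Experimental Results

Research Questions

RQ1: Can a hardware Trojan be embedded in neural network hardware without being detected by standard testing?
RQ2: What fraction of neurons, and in which layer, must be compromised to force a targeted misclassification?
RQ3: How effective is the hardware Trojan attack on MNIST against a typical CNN architecture (e.g., a 7-layer CNN)?
RQ4: Which defenses can mitigate the threat of hardware Trojans in neural network hardware implementations?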

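Key Findings

The attack causes a selected input trigger to be classified as a specified class on MNIST.
On average, Trojans injected into 0.03% of the neurons in the 5th hidden layer of arbitrary 7-layer CNNs are enough to achieve the targeted misclassification.
The Trojan remains undetectable under the test data used in the evaluation.
The paper discusses potential defenses against hardware Trojan attacks on neural networks.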
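
The stealth finding can be illustrated with a small check that compares a clean model against a Trojaned one on held-out inputs. The sketch below is a toy under stated assumptions, not the paper's evaluation: the Trojan is modeled as firing only when the hidden activations match those produced by the trigger, and `make_trojan`, `stealth_report`, and the two-layer network are hypothetical.

```python
def relu_layer(weights, x):
    """Hidden layer: linear map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def linear_layer(weights, x):
    """Output layer: one score per class."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def predict(w1, w2, x, trojan=None):
    """Predicted class index; `trojan` optionally rewrites hidden activations."""
    hidden = relu_layer(w1, x)
    if trojan is not None:
        hidden = trojan(hidden)
    scores = linear_layer(w2, hidden)
    return max(range(len(scores)), key=scores.__getitem__)


def make_trojan(trigger_hidden, forced, tol=1e-9):
    """Assumed trigger condition: fire only on the trigger's hidden pattern."""
    def trojan(hidden):
        fires = all(abs(h - t) <= tol for h, t in zip(hidden, trigger_hidden))
        if not fires:
            return hidden  # behave exactly like the clean hardware
        return [forced.get(j, h) for j, h in enumerate(hidden)]
    return trojan


def stealth_report(w1, w2, trigger, test_inputs, forced):
    """Return (clean/Trojaned agreement on test inputs, class given to trigger)."""
    trojan = make_trojan(relu_layer(w1, trigger), forced)
    same = sum(predict(w1, w2, x) == predict(w1, w2, x, trojan)
               for x in test_inputs)
    return same / len(test_inputs), predict(w1, w2, trigger, trojan)


# Toy 2-2-2 network with identity weights; neuron 1 is forced to 5.0 on the trigger.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, 0.0], [0.0, 1.0]]
trigger = [1.0, 0.2]
test_inputs = [[0.5, 0.1], [0.1, 0.9], [0.3, 0.2]]
agreement, trigger_class = stealth_report(W1, W2, trigger, test_inputs,
                                          forced={1: 5.0})
print(agreement, trigger_class)
```

Full agreement on the test inputs combined with a changed prediction on the trigger is the pattern the finding describes: accuracy measured on test data alone would not reveal the modification in this toy model.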

This review was generated by AI and checked by a human editor.