QUICK REVIEW

[Paper Review] Deep Structured Energy Based Models for Anomaly Detection

Shuangfei Zhai, Yu Cheng|arXiv (Cornell University)|May 25, 2016

Anomaly Detection Techniques and Applications | References: 24 | Citations: 255

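One-line summary

This paper proposes Deep Structured Energy Based Models (DSEBMs), which model the data distribution with energy-based networks built from fully connected, recurrent, and convolutional architectures, and trains them with score matching so that anomaly detection stays efficient. It evaluates an energy criterion and a reconstruction-error criterion on static, sequential, and spatial data, and reports results that are competitive with or better than the baselines.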

ABSTRACT

In this paper, we attack the anomaly detection problem by directly modeling the data distribution with deep architectures. We propose deep structured energy based models (DSEBMs), where the energy function is the output of a deterministic deep neural network with structure. We develop novel model architectures to integrate EBMs with different types of data such as static data, sequential data, and spatial data, and apply appropriate model architectures to adapt to the data structure. Our training algorithm is built upon the recent development of score matching \cite{sm}, which connects an EBM with a regularized autoencoder, eliminating the need for complicated sampling method. Statistically sound decision criterion can be derived for anomaly detection purpose from the perspective of the energy landscape of the data distribution. We investigate two decision criteria for performing anomaly detection: the energy score and the reconstruction error. Extensive empirical studies on benchmark tasks demonstrate that our proposed model consistently matches or outperforms all the competing methods.

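Motivation and objectives

Frame anomaly detection as modeling the data distribution with deep energy-based models (EBMs).
Extend EBMs to static, sequential, and spatial data structures.
Adopt score matching as a training procedure that avoids complicated sampling.
Derive practical anomaly detection criteria from the energy landscape: the energy score and the reconstruction error.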

Proposed method

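Model the energy as the output of a structured deep neural network (fully connected, recurrent, or convolutional).
Train the energy function with score matching, which allows SGD-based optimization without MCMC sampling.
Derive the reconstruction function f(x;θ) = x − ∇xE(x;θ), which links the EBM to the behavior of a denoising autoencoder.
For sequential data, factorize p(x) over time steps and introduce step-specific energies whose parameters are adapted by an RNN.
For convolutional EBMs, replace hL with the output of a CNN and propagate gradients through the convolutional layers.
Provide two anomaly decision criteria: an energy threshold (E(x;θ) > Eth) and a reconstruction-error threshold (||∇xE(x;θ)||² > Errorth).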
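The relationship between the energy, the reconstruction function, and the two decision criteria can be illustrated with a minimal sketch. It assumes a single-layer energy of the form E(x) = ½||x − b'||² − Σ_j softplus(w_j·x + b_j), which is a common choice for this family of models and is not necessarily the exact architecture used in the paper. All function names, parameter names, and thresholds here are hypothetical, and the denoising loss assumes the noise scale has been absorbed into the objective.

```python
import math

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    # Numerically stable logistic function (derivative of softplus).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def energy(x, W, b, b_prime):
    # Assumed form: E(x) = 0.5 * ||x - b'||^2 - sum_j softplus(w_j . x + b_j).
    quad = 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b_prime))
    hidden = sum(softplus(dot(wj, x) + bj) for wj, bj in zip(W, b))
    return quad - hidden

def energy_grad(x, W, b, b_prime):
    # Analytic gradient: (x - b') - W^T sigmoid(W x + b).
    acts = [sigmoid(dot(wj, x) + bj) for wj, bj in zip(W, b)]
    return [
        (x[i] - b_prime[i]) - sum(a * wj[i] for a, wj in zip(acts, W))
        for i in range(len(x))
    ]

def reconstruct(x, W, b, b_prime):
    # f(x) = x - grad_x E(x); for this energy it is a tied-weight autoencoder.
    g = energy_grad(x, W, b, b_prime)
    return [xi - gi for xi, gi in zip(x, g)]

def reconstruction_error(x, W, b, b_prime):
    # ||grad_x E(x)||^2, which equals ||x - f(x)||^2.
    return sum(gi ** 2 for gi in energy_grad(x, W, b, b_prime))

def denoising_loss(x, noise, W, b, b_prime):
    # Denoising-style score matching term (assumed): ||x - f(x + noise)||^2.
    x_noisy = [xi + ni for xi, ni in zip(x, noise)]
    r = reconstruct(x_noisy, W, b, b_prime)
    return sum((xi - ri) ** 2 for xi, ri in zip(x, r))

def is_anomaly_energy(x, W, b, b_prime, e_th):
    # Energy criterion: flag x when E(x) exceeds the threshold.
    return energy(x, W, b, b_prime) > e_th

def is_anomaly_recon(x, W, b, b_prime, err_th):
    # Reconstruction criterion: flag x when ||grad_x E(x)||^2 exceeds the threshold.
    return reconstruction_error(x, W, b, b_prime) > err_th
```

In practice the gradient would come from automatic differentiation through a deep network, and the thresholds would be chosen on held-out data. The analytic form is used here only to keep the sketch self-contained.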

Experimental results

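Research questions

RQ1: Across static, sequential, and spatial data, can deep energy-based models capture complex data distributions for anomaly detection?
RQ2: How can score matching be used to train deep EBMs efficiently without intensive sampling?
RQ3: What effective anomaly decision criteria can be derived from the energy landscape and the reconstruction error?
RQ4: Do DSEBMs outperform the baselines on static, sequential, and image datasets?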

Key findings

Method	KDD99 Precision	KDD99 Recall	KDD99 F1	Thyroid Precision	Thyroid Recall	Thyroid F1	Usenet Precision	Usenet Recall	Usenet F1
DSEBM-r	0.8521	0.6472	0.7328	0.9527	0.7479	0.8386	0.7205	0.7837	0.7314
DSEBM-e	0.8619	0.6446	0.7399	0.9558	0.7642	0.8375	0.7129	0.8081	0.7475

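Using energy-based scoring (DSEBM-e) and reconstruction-based scoring (DSEBM-r), DSEBMs achieve competitive or better performance on the static datasets (KDD99, Thyroid, Usenet).
On high-dimensional static data, DSEBM-e often yields the best F1 score (e.g., Usenet, KDD99).
On sequential data, DSEBM-e achieves nearly the best average precision and F1 on datasets such as CUAVE, NATOPS, and FITNESS.
On spatial data (Caltech-101, MNIST, CIFAR-10), DSEBM-e attains the top recall and F1, with notable gains on MNIST and CIFAR-10.
The energy-based decision criterion outperforms the reconstruction criterion on most benchmarks, which suggests that the energy landscape serves as a robust anomaly indicator.
The reconstruction-error criterion remains meaningful in high dimensions and is especially useful when outliers are unlikely to coincide with energy maxima.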

This review was generated by AI and checked by a human editor.