QUICK REVIEW

[Paper Review] MAGIC: Detecting Advanced Persistent Threats via Masked Graph Representation Learning

Zian Jia, Yun Xiong|arXiv (Cornell University)|Oct 15, 2023

Anomaly Detection Techniques and Applications|Cited by 15

One-Sentence Summary

MAGIC is a self-supervised approach based on masked graph representation learning that detects APTs from provenance graphs, supports multi-granularity detection at low overhead, and stays robust to concept drift.

ABSTRACT

Advanced Persistent Threats (APTs), adopted by most delicate attackers, are becoming increasingly common and pose a great threat to various enterprises and institutions. Data provenance analysis on provenance graphs has emerged as a common approach in APT detection. However, previous works have exhibited several shortcomings: (1) requiring attack-containing data and a priori knowledge of APTs, (2) failing to extract the rich contextual information buried within provenance graphs and (3) becoming impracticable due to their prohibitive computation overhead and memory consumption. In this paper, we introduce MAGIC, a novel and flexible self-supervised APT detection approach capable of performing multi-granularity detection under different levels of supervision. MAGIC leverages masked graph representation learning to model benign system entities and behaviors, performing efficient deep feature extraction and structure abstraction on provenance graphs. By ferreting out anomalous system behaviors via outlier detection methods, MAGIC is able to perform both system entity level and batched log level APT detection. MAGIC is specially designed to handle concept drift with a model adaptation mechanism and successfully applies to universal conditions and detection scenarios. We evaluate MAGIC on three widely-used datasets, including both real-world and simulated attacks. Evaluation results indicate that MAGIC achieves promising detection results in all scenarios and shows an enormous advantage over state-of-the-art APT detection approaches in performance overhead.

Motivation and Objectives

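Propose a self-supervised detection method to address the lack of attack-containing data and of a priori knowledge of APTs.
Model the rich contextual information in provenance graphs to improve detection accuracy while reducing false positives.
Enable multi-granularity APT detection (batched log level and system entity level) with scalable computation.
Provide an optional model adaptation mechanism that handles concept drift and incorporates analyst feedback.
Demonstrate practicality and effectiveness at low overhead on real-world and simulated datasets.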

Proposed Method

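Construct provenance graphs from audit logs, applying noise reduction and label-based initial embeddings.
Use a graph masked autoencoder (encoder-decoder) to learn embeddings of benign behavior through masked feature reconstruction and sample-based structure reconstruction.
Train the graph representation module in a self-supervised manner with a two-part objective, L = L_fr + L_sr, where L_fr is the feature reconstruction loss and L_sr is the structure reconstruction loss.
Compute node (system entity) embeddings and aggregate them into system state embeddings, which are used for batched log level detection.
Run outlier detection on the embeddings using a k-d tree to separate benign from anomalous patterns in batched log level or entity level tasks.
Optionally apply the model adaptation mechanism to handle concept drift, incorporating analyst feedback and memory discounting.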
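
The outlier detection step can be illustrated with a minimal sketch. It is not the authors' implementation: the review only states that a k-d tree is used for outlier detection on embeddings. The scoring rule (mean distance to the k nearest benign embeddings), the fixed threshold, the toy 2-D embeddings, and the names `build_kdtree`, `knn_distances`, and `anomaly_score` are all assumptions made for illustration.

```python
import heapq
import math


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def knn_distances(tree, query, k):
    """Return the sorted distances from query to its k nearest tree points."""
    heap = []  # max-heap (negated distances) holding the best k found so far

    def visit(node, depth):
        if node is None:
            return
        point, left, right = node
        dist = math.dist(point, query)
        if len(heap) < k:
            heapq.heappush(heap, -dist)
        elif dist < -heap[0]:
            heapq.heapreplace(heap, -dist)
        axis = depth % len(query)
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near, depth + 1)
        # Cross the splitting plane only if a closer point could lie beyond it.
        if len(heap) < k or abs(diff) < -heap[0]:
            visit(far, depth + 1)

    visit(tree, 0)
    return sorted(-d for d in heap)


def anomaly_score(tree, embedding, k=3):
    """Assumed scoring rule: mean distance to the k nearest benign embeddings."""
    dists = knn_distances(tree, embedding, k)
    return sum(dists) / len(dists)


# Toy benign embeddings (hypothetical values, 2-D for readability).
benign = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.2, 0.1)]
tree = build_kdtree(benign)

threshold = 1.0  # hypothetical cut-off; in practice it would be tuned on benign data
normal_score = anomaly_score(tree, (0.05, 0.0))
outlier_score = anomaly_score(tree, (3.0, 3.0))

print("normal flagged:", normal_score > threshold)
print("outlier flagged:", outlier_score > threshold)
```

Because the tree is built only from benign embeddings, no attack data is needed at training time, which matches the self-supervised setting described above. The same scoring could be applied to node embeddings for entity level detection or to aggregated system state embeddings for batched log level detection.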

Experimental Results

Research Questions

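RQ1: Can self-supervised masked graph representation learning effectively model benign provenance graphs and detect APTs without attack-containing data?
RQ2: Which granularity, batched log level or system entity level, offers the best trade-off between detection accuracy and computational efficiency?
RQ3: How does MAGIC compare with state-of-the-art methods in detection precision/recall and overhead on real-world and simulated datasets?
RQ4: Does the model adaptation mechanism improve robustness to concept drift and help reduce false positives over time?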

Key Findings

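MAGIC achieves 97.26% precision and 99.91% recall in entity level APT detection on the evaluated datasets.
MAGIC has the lowest overhead among the compared state-of-the-art methods and is markedly faster, for example 51 times faster than ShadeWatcher.
MAGIC maintains its performance on both the real-world dataset (DARPA E3) and the simulated datasets (StreamSpot, Unicorn Wget).
The approach supports multi-granularity detection (batched log level and entity level) under flexible levels of supervision.
The optional model adaptation mechanism helps mitigate concept drift and incorporate analyst feedback, improving robustness.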

This review was generated by AI and reviewed by a human editor.