QUICK REVIEW

[Paper Review] Graph-based Anomaly Detection and Description: A Survey

Leman Akoglu, Hanghang Tong|arXiv (Cornell University)|Apr 18, 2014

Anomaly Detection Techniques and Applications|References 158|Cited by 76

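One-Sentence Summary

This survey presents a comprehensive, structured framework for graph-based anomaly detection and description, categorizing methods by supervision type, graph dynamics, and attribute richness. It emphasizes anomaly attribution, which explains why an anomaly occurred, and offers a unified view of state-of-the-art techniques and real-world applications such as fraud detection, security, and health care.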

ABSTRACT

Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we provide a comprehensive exploration of both data mining and machine learning algorithms for these detection tasks. We give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the 'why', of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.

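Research Motivation and Objectives

Provide a systematic, comprehensive, and structured overview of state-of-the-art graph-based anomaly detection techniques.
Propose a unified framework that categorizes methods by supervision (unsupervised vs. (semi-)supervised), graph dynamics (static vs. dynamic), and attribute type (attributed vs. plain).
Emphasize the importance of anomaly attribution, that is, explaining the root cause of detected anomalies, to enable actionable insight and sense-making.
Highlight real-world applications in domains such as finance, social networks, network security, and health care.
Identify open theoretical and practical challenges in scalability, robustness, evaluation methodology, and multi-source graph fusion.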

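Proposed Approach

Present a general algorithmic framework that categorizes anomaly detection methods along three dimensions: supervision type, graph type (static vs. dynamic), and data type (attributed vs. plain graphs).
Survey detection techniques that exploit long-range correlations in graphs and identify anomalies from structural, topological, and attribute features.
Emphasize anomaly attribution methods, which explain why an anomaly was detected by identifying the subgraphs, nodes, or edges responsible.
Integrate visual analytics and rule-based reasoning to improve interpretability and support human-in-the-loop analysis.
Discuss real-time and streaming graph anomaly detection methods, with a focus on sublinear and scalable algorithms.
Propose evaluation strategies such as anomaly injection and qualitative analysis to cope with the scarcity of ground-truth labels in real-world data.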
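
The anomaly-injection strategy in the last item can be illustrated with a toy experiment. The following is a minimal sketch, not a method taken from the survey: it assumes a plain, static, undirected graph, uses a ring lattice as a stand-in for normal background structure, plants a clique as the injected anomaly, and ranks nodes by a crude egonet score (edges in the egonet divided by the number of neighbors). The function names `ring_lattice`, `inject_clique`, `egonet_score`, and `precision_at_k`, and the score itself, are illustrative assumptions.

```python
from itertools import combinations


def ring_lattice(n, k=2):
    """Background graph (assumption): each node links to its k nearest
    neighbors on each side of a ring. Returned as a dict of neighbor sets."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, k + 1):
            u = (v + d) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj


def inject_clique(adj, nodes):
    """Anomaly injection: fully connect the chosen nodes in place."""
    for u, v in combinations(nodes, 2):
        adj[u].add(v)
        adj[v].add(u)


def egonet_score(adj, v):
    """Toy score: edges in the egonet of v divided by the degree of v.
    A star-like egonet scores 1.0; denser egonets score higher."""
    nbrs = adj[v]
    if not nbrs:
        return 0.0
    among = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return (len(nbrs) + among) / len(nbrs)


def precision_at_k(adj, injected, k):
    """Fraction of the k highest-scoring nodes that were injected."""
    ranked = sorted(adj, key=lambda v: egonet_score(adj, v), reverse=True)
    return len(set(ranked[:k]) & set(injected)) / k


graph = ring_lattice(100)
planted = [0, 10, 20, 30, 40, 50]
inject_clique(graph, planted)
print(precision_at_k(graph, planted, k=len(planted)))
```

Because the injected nodes are known, precision at k can be computed without any real labels. The limitation noted in the survey still applies: a detector that finds planted cliques in a synthetic background is not thereby shown to find real anomalies.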

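Experimental Results

Research Questions

RQ1: How can graph-based anomaly detection methods be categorized systematically by supervision, graph dynamics, and attribute richness?
RQ2: Which techniques are most effective at attributing detected anomalies to specific subgraphs or node/edge patterns in order to explain their root cause?
RQ3: How do graph-based anomaly detection methods handle dynamic and evolving networks in real-time settings?
RQ4: What are the key challenges in evaluating graph anomaly detection algorithms when ground-truth labels are unavailable or expensive to obtain?
RQ5: How can multiple graphs that represent different types of relationships be fused to improve anomaly detection performance?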

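Key Findings

Graph-based anomaly detection outperforms point-based methods at capturing long-range correlations and dependencies in data.
Anomaly attribution is essential for actionable insight, and a range of techniques has been developed to identify the subgraphs or patterns responsible for an anomaly.
Scalable, sublinear algorithms are essential for real-time detection on streaming or large-scale dynamic graphs.
Adversarial robustness remains a major open challenge, because most methods do not account for attackers who actively try to evade detection.
Evaluation of anomaly detection methods lacks standardization; anomaly injection and qualitative analysis are common but imperfect substitutes.
Fusing multiple graphs (for example, a social network and a communication network) can improve detection, but effective fusion strategies remain an open research problem.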
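
For the dynamic-graph setting mentioned above, one simple way to picture change detection is to compare consecutive graph snapshots and flag time steps where the edge set shifts sharply. The sketch below is an illustrative assumption rather than an algorithm from the survey: it uses Jaccard distance between undirected edge sets, it is not sublinear, and the names `edge_set`, `jaccard_distance`, and `change_points` as well as the threshold value are hypothetical.

```python
def edge_set(edges):
    """Normalize undirected edges so (u, v) and (v, u) compare equal."""
    return {frozenset(e) for e in edges}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def change_points(snapshots, threshold):
    """Return indices t where snapshot t differs from snapshot t-1
    by more than the given Jaccard distance threshold."""
    sets = [edge_set(s) for s in snapshots]
    flagged = []
    for t in range(1, len(sets)):
        if jaccard_distance(sets[t - 1], sets[t]) > threshold:
            flagged.append(t)
    return flagged


base = [(1, 2), (2, 3), (3, 4), (4, 5)]
rewired = [(1, 2), (7, 8), (8, 9), (9, 10), (7, 10)]
snapshots = [
    base,             # t = 0
    base + [(5, 6)],  # t = 1: one new edge, small change
    rewired,          # t = 2: most edges replaced, large change
    rewired,          # t = 3: no change
]
print(change_points(snapshots, threshold=0.5))
```

A distance threshold on whole snapshots only says when something changed. Pointing to the nodes or edges responsible, which is the attribution step the survey stresses, would need an additional per-node or per-edge comparison.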

This review was generated by AI and reviewed by human editors.