QUICK REVIEW

[Paper Review] Learning Memory-guided Normality for Anomaly Detection

Hyunjong Park, Jongyoun Noh | arXiv (Cornell University) | Mar 30, 2020

Anomaly Detection Techniques and Applications | References: 50 | Cited by: 46

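ONE-SENTENCE SUMMARY

An unsupervised video anomaly detection method that stores multiple prototypical normal patterns in a memory module, trains it with feature compactness and separateness losses, and reports state-of-the-art results. The memory is updated through a weighting mechanism that prevents it from learning from anomalies.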

ABSTRACT

We address the problem of anomaly detection, that is, detecting anomalous events in a video sequence. Anomaly detection methods based on convolutional neural networks (CNNs) typically leverage proxy tasks, such as reconstructing input video frames, to learn models describing normality without seeing anomalous samples at training time, and quantify the extent of abnormalities using the reconstruction error at test time. The main drawbacks of these approaches are that they do not consider the diversity of normal patterns explicitly, and the powerful representation capacity of CNNs allows to reconstruct abnormal video frames. To address this problem, we present an unsupervised learning approach to anomaly detection that considers the diversity of normal patterns explicitly, while lessening the representation capacity of CNNs. To this end, we propose to use a memory module with a new update scheme where items in the memory record prototypical patterns of normal data. We also present novel feature compactness and separateness losses to train the memory, boosting the discriminative power of both memory items and deeply learned features from normal data. Experimental results on standard benchmarks demonstrate the effectiveness and efficiency of our approach, which outperforms the state of the art.

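MOTIVATION AND GOALS

Detect anomalies in video without any anomalous training data.
Model the diversity of normal patterns explicitly, using a memory of prototypical features.
Use the memory to limit the reconstruction/prediction capacity of the CNN so that it focuses on normal patterns.
Propose a stable memory update rule that avoids learning from anomalies.
Demonstrate state-of-the-art performance on standard benchmarks.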

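PROPOSED METHOD

Introduce a memory module with M items, each recording one prototypical normal pattern.
Use an encoder (U-Net) to produce per-pixel queries, and read from the memory via cosine similarity to obtain updated features for reconstruction/prediction.
In the read step, form \hat{p}_t^k as a weighted sum of the memory items; concatenate it with the query q_t^k for decoding.
Update each memory item using the queries assigned to it, with the weights v_t^{k,m} steering a weighted update; apply a normalization so that updates concentrate on normal frames.
Train with a reconstruction loss, a feature compactness loss (each q_t^k is pulled toward its nearest memory item), and a feature separateness loss (the second-nearest item is pushed away by a margin).
At test time, compute a weighted regular score E_t to prevent memory updates on abnormal frames, and derive the anomaly score S_t as a combination of PSNR-based reconstruction quality and a memory-based distance.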
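The read, update, and loss steps listed above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the flattening of a feature map into K query vectors of dimension C, the use of plain (unsquared) Euclidean distances in the losses, the rescaling of update weights by their per-item maximum, and the default margin are all assumptions made for the example.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def softmax(x, axis):
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

def cosine_sim(queries, memory):
    """Cosine similarity between K queries (K, C) and M items (M, C) -> (K, M)."""
    return l2_normalize(queries) @ l2_normalize(memory).T

def read_memory(queries, memory):
    """Read: weighted sum of items per query, concatenated with the query."""
    w = softmax(cosine_sim(queries, memory), axis=1)  # weights over items
    p_hat = w @ memory                                # (K, C)
    return np.concatenate([queries, p_hat], axis=1), w

def compactness_loss(queries, memory):
    """Pull each query toward its nearest memory item."""
    nearest = cosine_sim(queries, memory).argmax(axis=1)
    return float(np.sum(np.linalg.norm(queries - memory[nearest], axis=1)))

def separateness_loss(queries, memory, margin=1.0):
    """Triplet-style hinge: nearest item is the positive, second nearest the negative."""
    order = np.argsort(-cosine_sim(queries, memory), axis=1)
    d_pos = np.linalg.norm(queries - memory[order[:, 0]], axis=1)
    d_neg = np.linalg.norm(queries - memory[order[:, 1]], axis=1)
    return float(np.sum(np.maximum(d_pos - d_neg + margin, 0.0)))

def update_memory(queries, memory):
    """Update each item with the queries assigned to it (assumed weighting scheme)."""
    sim = cosine_sim(queries, memory)
    v = softmax(sim, axis=0)            # weights over queries, per item
    nearest = sim.argmax(axis=1)        # item each query is assigned to
    updated = memory.copy()
    for m in range(memory.shape[0]):
        idx = np.flatnonzero(nearest == m)
        if idx.size == 0:
            continue                    # no query assigned: leave item as is
        v_m = v[idx, m] / v[idx, m].max()   # rescale so the largest weight is 1
        updated[m] = memory[m] + v_m @ queries[idx]
    return l2_normalize(updated)

# Toy usage: 6 queries of dimension 4, memory with 3 items.
rng = np.random.default_rng(0)
queries = rng.normal(size=(6, 4))
memory = l2_normalize(rng.normal(size=(3, 4)))
features, weights = read_memory(queries, memory)
memory = update_memory(queries, memory)
```

The separateness term needs at least two memory items, since it compares the nearest item with the second nearest.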

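EXPERIMENTAL RESULTS

Research questions

RQ1: Can a memory-based representation capture the diversity of normal patterns in video frames and thereby improve anomaly detection?
RQ2: Does enforcing both compactness and separateness between memory items and queries yield more discriminative prototypes of normal patterns?
RQ3: Can memory updates be made conditional so that abnormal frames are not absorbed at test time, while performance is maintained?
RQ4: How does memory-based anomaly detection compare with state-of-the-art methods in terms of AUC on standard benchmarks (Ped2, Avenue, ShanghaiTech)?
RQ5: What is the trade-off between reconstruction-based and memory-based cues in anomaly scoring?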

Key findings

Method	Ped2 [21]	Avenue [24]	Shanghai [26]
Ours-R w/o Mem.	86.4	80.6	65.8
Ours-R w/ Mem.	90.2	82.8	69.8
Frame-Pred (Pred.)	95.4	85.1	72.8
Ours-P w/o Mem.	94.3	84.5	66.8
Ours-P w/ Mem.	97.0	88.5	70.5

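With memory, the prediction-based model (Ours-P w/ Mem.) achieves the best AUC in the table on Ped2 (97.0) and Avenue (88.5), ahead of the Frame-Pred baseline (95.4 and 85.1); on ShanghaiTech, Frame-Pred (72.8) stays ahead of Ours-P w/ Mem. (70.5).
Adding memory improves AUC on Ped2, Avenue, and ShanghaiTech for both the reconstruction and prediction variants (e.g. Ours-P rises from 94.3, 84.5, and 66.8 to 97.0, 88.5, and 70.5, respectively).
The feature separateness loss improves performance markedly (e.g. the ablation shows a gain of 3.8 AUC percentage points when separateness is added).
Updating the memory only with normal frames (via the weighted regular score) improves anomaly detection performance.
The method runs fast (about 67 frames per second) and offers a better accuracy/runtime trade-off than flow-based or adversarial methods.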
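The test-time scoring and update gating mentioned in the findings can be illustrated with a short NumPy sketch. It assumes that S_t is a convex combination of a min-max-normalized PSNR term and a min-max-normalized memory distance, and that the weighted regular score weights each pixel by its own error. The function names, the balance weight `lam`, the threshold `gamma`, and the exact form of the weighting are hypothetical choices for this example rather than values or formulas taken from the source.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Standard PSNR in dB between a predicted and a ground-truth frame."""
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(max_val ** 2 / (mse + 1e-12))

def min_max(x):
    """Min-max normalize a per-frame sequence to [0, 1] over one video."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def anomaly_scores(psnr_per_frame, mem_dist_per_frame, lam=0.5):
    """Higher score = more anomalous. lam balances the two cues (assumed value)."""
    recon_term = 1.0 - min_max(psnr_per_frame)   # low PSNR -> high score
    memory_term = min_max(mem_dist_per_frame)    # far from memory -> high score
    return lam * recon_term + (1.0 - lam) * memory_term

def weighted_regular_score(pred, target, eps=1e-12):
    """Error-weighted pixel error: large local errors dominate the score (assumed form)."""
    err = (pred - target) ** 2
    w = 1.0 - np.exp(-err)
    w = w / (w.sum() + eps)
    return float(np.sum(w * err))

def should_update_memory(pred, target, gamma=0.01):
    """Gate: only update the memory when the frame looks normal (gamma is hypothetical)."""
    return weighted_regular_score(pred, target) <= gamma

# Toy usage: three frames with per-frame PSNR and query-to-memory distance.
scores = anomaly_scores([30.0, 20.0, 25.0], [0.1, 0.5, 0.3], lam=0.5)
```

Setting `lam` closer to 1 makes the score rely on reconstruction quality, and closer to 0 on the distance to the memory items, which is the trade-off RQ5 asks about.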

This review was generated by AI and checked by a human editor.