QUICK REVIEW

[Paper Review] Soft + Hardwired Attention: An LSTM Framework for Human Trajectory Prediction and Abnormal Event Detection

Tharindu Fernando, Simon Denman|arXiv (Cornell University)|Feb 18, 2017

Video Surveillance and Tracking Methods | References: 11 | Cited by: 19

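One-Sentence Summary

This paper proposes a novel LSTM-based framework that predicts human trajectories and detects abnormal events in surveillance video. By combining learnable soft attention with hand-designed ("hard-wired") spatial attention weights, the model improves trajectory prediction accuracy in dense, complex environments, enables end-to-end abnormal event detection without hand-crafted feature engineering, and outperforms state-of-the-art methods on two public datasets.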

ABSTRACT

As humans we possess an intuitive ability for navigation which we master through years of practice; however existing approaches to model this trait for diverse tasks including monitoring pedestrian flow and detecting abnormal events have been limited by using a variety of hand-crafted features. Recent research in the area of deep-learning has demonstrated the power of learning features directly from the data; and related research in recurrent neural networks has shown exemplary results in sequence-to-sequence problems such as neural machine translation and neural image caption generation. Motivated by these approaches, we propose a novel method to predict the future motion of a pedestrian given a short history of their, and their neighbours, past behaviour. The novelty of the proposed method is the combined attention model which utilises both "soft attention" as well as "hard-wired" attention in order to map the trajectory information from the local neighbourhood to the future positions of the pedestrian of interest. We illustrate how a simple approximation of attention weights (i.e hard-wired) can be merged together with soft attention weights in order to make our model applicable for challenging real world scenarios with hundreds of neighbours. The navigational capability of the proposed method is tested on two challenging publicly available surveillance databases where our model outperforms the current-state-of-the-art methods. Additionally, we illustrate how the proposed architecture can be directly applied for the task of abnormal event detection without handcrafting the features.

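Motivation and Objectives

To address the challenge of accurately predicting pedestrian trajectories in complex, densely crowded environments.
To improve trajectory prediction by modelling the influence of neighbouring pedestrians with a combination of soft attention (learned) and hard-wired attention (spatially structured).
To enable end-to-end abnormal event detection from LSTM hidden states, without relying on hand-crafted features.
To demonstrate the model's robustness and generalisation on real-world surveillance datasets with diverse crowd dynamics.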

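Proposed Method

Uses an encoder-decoder LSTM architecture to model pedestrian trajectories as they evolve over time.
Applies a soft attention mechanism, with a learned attention function, to encode the target pedestrian's own trajectory.
Introduces hard-wired attention weights that model the influence of neighbouring pedestrians based on spatial distance and relative position.
Fuses the context vectors from soft attention and hard-wired attention into a single representation used to predict the future trajectory.
Uses the hidden states of the LSTM encoder and decoder, clustered with DBSCAN, to detect abnormal events.
Trains end to end on observed trajectories to predict future paths and detect deviations from normal behaviour.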
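
The attention fusion described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the summary does not give the formulas, so the bilinear scoring function, the inverse-distance form of the hard-wired weights, the tanh projection used for fusion, and all names (`W_a`, `W_c`, `soft_attention_context`, `hardwired_context`, `combined_context`) are assumptions made here for illustration. Random arrays stand in for LSTM encoder hidden states.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def soft_attention_context(target_states, decoder_state, W_a):
    """Soft attention over the target pedestrian's own encoder states.

    target_states: (T, H) encoder hidden states of the pedestrian of interest
    decoder_state: (H,)   current decoder state
    W_a:           (H, H) score matrix (assumed bilinear form; learned in practice)
    """
    scores = target_states @ W_a @ decoder_state  # (T,)
    alpha = softmax(scores)                       # weights sum to 1
    return alpha @ target_states, alpha           # context (H,), weights (T,)


def hardwired_context(neighbour_states, distances, eps=1e-6):
    """Hard-wired attention over neighbours: fixed weights, nothing learned.

    neighbour_states: (N, T, H) encoder hidden states of N neighbours
    distances:        (N, T)    distance from each neighbour to the target
    Assumption: weight = 1 / distance, so closer neighbours count more.
    """
    weights = 1.0 / (distances + eps)
    context = np.einsum("nt,nth->h", weights, neighbour_states)
    return context, weights


def combined_context(c_soft, c_hard, W_c):
    """Fuse both contexts: tanh of a linear map of their concatenation (assumed)."""
    return np.tanh(W_c @ np.concatenate([c_soft, c_hard]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, H, N = 8, 16, 3  # observed steps, hidden size, number of neighbours
    target = rng.normal(size=(T, H))
    neighbours = rng.normal(size=(N, T, H))
    dists = rng.uniform(0.5, 5.0, size=(N, T))

    c_s, alpha = soft_attention_context(target, rng.normal(size=H), rng.normal(size=(H, H)))
    c_h, _ = hardwired_context(neighbours, dists)
    c = combined_context(c_s, c_h, rng.normal(size=(H, 2 * H)))
    print(c.shape)
```

Because the hard-wired weights are computed directly from distances rather than learned, adding more neighbours adds no parameters, which is one plausible reading of why the approach is described as suitable for scenes with hundreds of neighbours.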

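Experimental Results

Research Questions

RQ1: Does a hybrid attention mechanism combining soft attention with hard-wired attention significantly improve trajectory prediction in dense, complex crowd scenes?
RQ2: How well does the model generalise to real-world surveillance data with high pedestrian density and dynamic interactions?
RQ3: To what extent can LSTM hidden states be used to detect abnormal behaviour without hand-crafted features?
RQ4: How does the model compare with state-of-the-art methods in trajectory prediction accuracy and abnormal event detection performance?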

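Key Findings

The model achieves state-of-the-art performance on two public surveillance datasets, with trajectory prediction accuracy exceeding that of existing methods.
The hybrid attention mechanism significantly improves performance in scenes with hundreds of neighbouring pedestrians, indicating that the approach can scale to dense real-world environments.
The model detected 47 of 55 ground-truth abnormal events (85.5% recall), substantially better than the baseline, which detected only 29 (52.7% recall).
False positives stem mainly from rare but non-abnormal behaviour, such as suddenly turning to buy a ticket, indicating that the model is highly sensitive to low-frequency patterns.
The method successfully detected abnormal events involving sudden changes of direction, circular motion, and unusual speed, even when the predicted path was very close to the ground-truth path.
By clustering LSTM hidden states, the framework detects abnormal events without feature engineering and shows strong generalisation.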
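
To make the clustering-based detection concrete, here is a minimal sketch of flagging outliers among hidden-state vectors with DBSCAN. It is written under several assumptions that the summary does not confirm: that points DBSCAN labels as noise are the ones treated as abnormal, that Euclidean distance is used, and that `eps` and `min_samples` take the illustrative values shown. The `dbscan` and `flag_abnormal` functions are hypothetical helpers written for this example (a compact NumPy version of the algorithm, kept dependency-free; in practice a library implementation such as scikit-learn's `DBSCAN` would normally be used), and synthetic vectors stand in for real LSTM hidden states.

```python
import numpy as np


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN. Returns one integer label per point; -1 means noise."""
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))           # pairwise distances
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    # A core point has at least min_samples points (itself included) within eps.
    is_core = np.array([len(nb) >= min_samples for nb in neighbours])

    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue                                # border point: joins, does not expand
            for q in neighbours[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


def flag_abnormal(hidden_states, eps=1.0, min_samples=5):
    """Assumption: a trajectory is 'abnormal' if its hidden state is DBSCAN noise."""
    return dbscan(hidden_states, eps, min_samples) == -1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(0.0, 0.1, size=(50, 4))        # tight cluster of typical states
    odd = np.array([[10.0] * 4, [-10.0] * 4])          # two isolated states
    flags = flag_abnormal(np.vstack([normal, odd]))
    print(int(flags.sum()), "trajectories flagged")
```

A density-based method fits the reported false positives: rare but legitimate behaviour also lands in sparse regions of the hidden-state space, so it can be flagged alongside genuinely abnormal events.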


This review was generated by AI and reviewed by human editors.