QUICK REVIEW

[Paper Digest] History Repeats Itself: Human Motion Prediction via Motion Attention

Wei Mao, Miaomiao Liu|arXiv (Cornell University)|Jul 23, 2020

Human Pose and Action Recognition | References: 31 | Cited by: 30

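One-Sentence Summary

This paper proposes an attention-based feed-forward model that applies motion attention over historical motion sub-sequences encoded with the discrete cosine transform (DCT), then passes the aggregated result to a GCN-based predictor to forecast future human poses. It achieves state-of-the-art results on Human3.6M, AMASS, and 3DPW.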

ABSTRACT

Human motion prediction aims to forecast future human poses given a past motion. Whether based on recurrent or feed-forward neural networks, existing methods fail to model the observation that human motion tends to repeat itself, even for complex sports actions and cooking activities. Here, we introduce an attention-based feed-forward network that explicitly leverages this observation. In particular, instead of modeling frame-wise attention via pose similarity, we propose to extract motion attention to capture the similarity between the current motion context and the historical motion sub-sequences. Aggregating the relevant past motions and processing the result with a graph convolutional network allows us to effectively exploit motion patterns from the long-term history to predict the future poses. Our experiments on Human3.6M, AMASS and 3DPW evidence the benefits of our approach for both periodical and non-periodical actions. Thanks to our attention model, it yields state-of-the-art results on all three datasets. Our code is available at https://github.com/wei-mao-2019/HisRepItself.

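Research Motivation and Goals

Address a limitation of existing methods: they do not model the tendency of human motion to repeat itself over long time spans.
Develop an attention mechanism that operates on motion sub-sequences rather than on static frames.
Exploit long-term historical motion patterns to improve both short-term and long-term prediction.
Combine motion attention with a graph convolutional network (GCN) to model spatial dependencies between joints.
Demonstrate generalization across multiple datasets and action types.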

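Proposed Method

Represent the past motion as a sequence of sub-sequences and encode each sub-sequence with the discrete cosine transform (DCT).
Define the query as the most recently observed sub-sequence, and the keys and values as the historical sub-sequences and their DCT-encoded futures.
Compute attention by normalizing the dot products between the query and the keys, then aggregate the corresponding DCT values into a motion context vector.
Combine the motion context vector with the most recently observed motion and feed the result to a GCN-based predictor that models spatio-temporal dependencies.
Predict future poses by predicting residuals in the DCT domain and applying the inverse DCT to recover coordinates or angles.
Use a compact two-module pipeline (motion attention plus predictor) with about 3.4M parameters.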
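
The attention steps listed above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names `dct_matrix`, `motion_attention`, `query_len`, and `future_len` are hypothetical, raw flattened poses stand in for whatever learned query/key features the model uses, and negative dot products are clamped to zero so that dividing by the sum yields valid weights. The GCN predictor is omitted.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th cosine frequency."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)  # rescale the DC row so the matrix is orthonormal
    return basis


def motion_attention(history, query_len, future_len):
    """Aggregate DCT-encoded past sub-sequences by similarity to the latest motion.

    history: (N, D) array with N observed frames and D pose features.
    Returns (context, weights), where context holds (query_len + future_len, D)
    DCT coefficients and weights has one entry per historical sub-sequence.
    """
    n_frames = history.shape[0]
    sub_len = query_len + future_len
    basis = dct_matrix(sub_len)

    # Query: the most recently observed frames (assumption: raw poses, no learned mapping).
    query = history[-query_len:].ravel()

    n_sub = n_frames - sub_len + 1
    scores = np.empty(n_sub)
    values = np.empty((n_sub, sub_len, history.shape[1]))
    for i in range(n_sub):
        sub = history[i:i + sub_len]
        key = sub[:query_len].ravel()
        # Assumption: clamp at zero so sum normalization gives non-negative weights.
        scores[i] = max(float(query @ key), 0.0)
        # Value: DCT over time of the key frames plus the frames that followed them.
        values[i] = basis @ sub

    # Normalize by the sum of scores instead of applying a softmax.
    weights = scores / (scores.sum() + 1e-8)
    # Weighted sum of the DCT-encoded values forms the motion context.
    context = np.tensordot(weights, values, axes=1)
    return context, weights


if __name__ == "__main__":
    t = np.arange(40)
    # Toy periodic "motion" with two features and a period of 10 frames.
    history = np.stack([np.sin(2 * np.pi * t / 10), np.cos(2 * np.pi * t / 10)], axis=1)
    context, weights = motion_attention(history, query_len=5, future_len=5)
    # The inverse DCT (transpose of the orthonormal basis) maps coefficients back to poses.
    poses = dct_matrix(10).T @ context
    print(context.shape, weights.shape, poses.shape)
```

On the toy periodic signal, the largest weights fall on sub-sequences that start a whole number of periods before the query, which is the repetition the method is designed to exploit. In the full pipeline, the context would be combined with the latest observed motion and refined by the predictor before the inverse DCT is applied.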

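Experimental Results

Research Questions

RQ1: Compared with frame-based attention or existing methods, does motion attention over historical motion sub-sequences improve both short-term and long-term human motion prediction?
RQ2: Does exploiting long-term repeated motion patterns through motion attention generalize across datasets and action types (H3.6M, AMASS, 3DPW)?
RQ3: How does combining DCT-encoded motion history with a GCN predictor affect prediction quality and stability across prediction horizons?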

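Key Findings

The proposed motion attention model achieves state-of-the-art short-term and long-term prediction results on all three datasets, for both 3D coordinates and joint angles.
Motion attention aggregates relevant past motion sub-sequences, allowing the model to exploit long-term repeated patterns beyond the short-term history.
A single unified model handles both short-term and long-term prediction, with no need for separate models per horizon.
The method generalizes well across datasets (H3.6M, AMASS, 3DPW) and is especially effective on actions with clearly repetitive history.
The model stays compact (about 3.4M parameters) and uses a simple attention mechanism that avoids softmax in order to mitigate gradient problems.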
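
To make the softmax-free normalization concrete, one plausible way to write it is shown next to the usual softmax for contrast. The symbols $q$, $k_i$, $V_i$, $a_i$, $U$, and $n$ are notation assumed for this digest, and the sum form requires the scores $q^\top k_j$ to be non-negative.

```latex
% Sum normalization (assumed form of the "normalized dot products")
a_i = \frac{q^\top k_i}{\sum_{j=1}^{n} q^\top k_j},
\qquad
U = \sum_{i=1}^{n} a_i V_i

% Standard softmax attention, shown only for contrast
a_i^{\mathrm{softmax}} = \frac{\exp\left(q^\top k_i\right)}{\sum_{j=1}^{n} \exp\left(q^\top k_j\right)}
```

Here $U$ is the motion context formed from the DCT-encoded values $V_i$. The sum form keeps each weight linear in its score, whereas the exponential in softmax can saturate when scores differ widely.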


This digest was generated by AI and reviewed by a human editor.