QUICK REVIEW

[Paper Review] TNT: Target-driveN Trajectory Prediction

Hang Zhao, Jiyang Gao|arXiv (Cornell University)|Aug 19, 2020

Autonomous Vehicle Technology and Safety | References: 38 | Cited by: 210

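One-Sentence Summary

TNT proposes a three-stage, target-driven framework for multimodal trajectory prediction: it discretizes the space of future targets, generates agent motion conditioned on those targets, and scores the resulting trajectories to produce a compact set of likely futures. It achieves state-of-the-art results on several benchmark datasets.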

ABSTRACT

Predicting the future behavior of moving agents is essential for real world applications. It is challenging as the intent of the agent and the corresponding behavior is unknown and intrinsically multimodal. Our key insight is that for prediction within a moderate time horizon, the future modes can be effectively captured by a set of target states. This leads to our target-driven trajectory prediction (TNT) framework. TNT has three stages which are trained end-to-end. It first predicts an agent's potential target states $T$ steps into the future, by encoding its interactions with the environment and the other agents. TNT then generates trajectory state sequences conditioned on targets. A final stage estimates trajectory likelihoods and a final compact set of trajectory predictions is selected. This is in contrast to previous work which models agent intents as latent variables, and relies on test-time sampling to generate diverse trajectories. We benchmark TNT on trajectory prediction of vehicles and pedestrians, where we outperform state-of-the-art on Argoverse Forecasting, INTERACTION, Stanford Drone and an in-house Pedestrian-at-Intersection dataset.

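Motivation and Objectives

Advance robust multimodal trajectory prediction by explicitly modeling a discrete set of plausible future targets.
Introduce a three-stage, end-to-end trainable framework that separates target prediction, target-conditioned motion estimation, and trajectory scoring.
Show that targets capture most of the long-horizon uncertainty and that, given a target, the motion is largely unimodal.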
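The last objective can be written as a factorization of the predictive distribution. The notation here is an assumption made for illustration: $x$ denotes the observed context (agent history and scene), $\tau$ a candidate target, and $s_F$ the future trajectory.

$$
p(s_F \mid x) = \int_{\tau} p(\tau \mid x)\, p(s_F \mid \tau, x)\, d\tau
$$

Under this reading, the multimodality lives in the target distribution $p(\tau \mid x)$, while $p(s_F \mid \tau, x)$ is treated as approximately unimodal. When the targets are discretized, the integral becomes a sum over the candidate targets.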

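Proposed Method

Stage 1 (target prediction): predict a distribution over discrete future target locations from the scene context; oversample target candidates (e.g., N ≈ 1000) and output the top M targets together with their associated offsets. Target probabilities are trained with a cross-entropy loss and offsets with a Huber loss.
Stage 2 (target-conditioned motion estimation): for each target, predict a single unimodal trajectory conditioned on the target and the context, using a two-layer MLP with teacher forcing during training.
Stage 3 (trajectory scoring and selection): score the trajectories with a maximum entropy model, learning to rank the predictions and select a diverse, compact set of K of them. Training uses a cross-entropy loss against target scores that reflect each trajectory's closeness to the ground truth, and non-maximum suppression removes near-duplicate trajectories.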
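The selection logic in stages 1 and 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `select_top_targets` and `nms_select` are hypothetical, the stage 2 MLP is replaced by straight-line interpolation toward each target, the stage 1 probabilities stand in for learned stage 3 scores, and the trajectory distance (maximum pointwise Euclidean distance) and threshold are illustrative choices.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def select_top_targets(candidates, logits, offsets, m):
    """Stage 1 sketch: keep the m most probable candidates, refined by offsets."""
    probs = softmax(logits)
    top = np.argsort(-probs)[:m]  # indices sorted by descending probability
    return candidates[top] + offsets[top], probs[top]


def nms_select(trajectories, scores, k, threshold):
    """Stage 3 sketch: greedy pick by score, skipping near-duplicate trajectories."""
    kept = []
    for i in np.argsort(-scores):
        # Assumed distance: largest pointwise gap between two trajectories.
        far_enough = all(
            np.max(np.linalg.norm(trajectories[i] - trajectories[j], axis=-1))
            >= threshold
            for j in kept
        )
        if far_enough:
            kept.append(int(i))
        if len(kept) == k:
            break
    return kept


# Toy inputs: four candidate targets, two of them nearly identical.
candidates = np.array([[0.0, 0.0], [10.0, 0.0], [10.2, 0.0], [0.0, 10.0]])
logits = np.array([0.1, 2.0, 1.9, 1.0])
offsets = np.zeros_like(candidates)  # a trained model would regress these

targets, probs = select_top_targets(candidates, logits, offsets, m=3)

# Placeholder for stage 2: straight lines from the origin to each target.
steps = np.linspace(0.0, 1.0, 5)[None, :, None]
trajectories = steps * targets[:, None, :]  # shape (3, 5, 2)

# Placeholder for stage 3 scores: reuse the stage 1 probabilities.
kept = nms_select(trajectories, probs, k=2, threshold=1.0)
print(kept)
```

In this toy run the two trajectories ending near (10, 0) are closer than the threshold, so only the higher-scoring one survives and the second slot goes to the trajectory ending at (0, 10).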

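Experimental Results

Research Questions

RQ1: Does discretizing future targets capture most of the multimodal uncertainty in trajectory prediction?
RQ2: Can trajectory generation conditioned on discrete targets produce accurate, diverse, and compact predictions without relying on latent-variable sampling at test time?
RQ3: How does the TNT pipeline compare with state-of-the-art methods on driving and pedestrian datasets?
RQ4: How does target sampling density affect prediction accuracy and diversity?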

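Key Findings

TNT achieves state-of-the-art results on four benchmark datasets: Argoverse, INTERACTION, PAID, and SDD.
Target-driven conditioning yields strong recall and accuracy with a compact prediction set (e.g., the top M targets and K final trajectories).
Denser target sampling improves performance up to a saturation point; for pedestrians, grid targets work well.
Target offset regression and the per-target unimodal motion model significantly improve accuracy over variants without these components.
Compared with latent-variable or anchor-based methods, TNT provides interpretable intermediate outputs (targets) along with competitive or better prediction performance.
A single TNT model matches or exceeds the challenge winners on several datasets.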

This review was generated by AI and reviewed by human editors.