QUICK REVIEW

[Paper Review] Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans

Nachiket Deo, Mohan M. Trivedi | arXiv (Cornell University) | Jan 3, 2020

Autonomous Vehicle Technology and Safety | References: 47 | Cited by: 114

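One-Sentence Summary

P2T (Plans-to-Trajectories) forecasts multimodal pedestrian and vehicle trajectories in unknown environments by inferring grid-based plans with MaxEnt IRL and then generating continuous trajectories with an attention-based decoder conditioned on those plans.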

ABSTRACT

We address the problem of forecasting pedestrian and vehicle trajectories in unknown environments, conditioned on their past motion and scene structure. Trajectory forecasting is a challenging problem due to the large variation in scene structure and the multimodal distribution of future trajectories. Unlike prior approaches that directly learn one-to-many mappings from observed context to multiple future trajectories, we propose to condition trajectory forecasts on plans sampled from a grid based policy learned using maximum entropy inverse reinforcement learning (MaxEnt IRL). We reformulate MaxEnt IRL to allow the policy to jointly infer plausible agent goals, and paths to those goals on a coarse 2-D grid defined over the scene. We propose an attention based trajectory generator that generates continuous valued future trajectories conditioned on state sequences sampled from the MaxEnt policy. Quantitative and qualitative evaluation on the publicly available Stanford drone and NuScenes datasets shows that our model generates trajectories that are diverse, representing the multimodal predictive distribution, and precise, conforming to the underlying scene structure over long prediction horizons.
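The grid-based policy described in the abstract can be illustrated with a small soft value iteration. The following is a minimal sketch under stated assumptions, not the authors' implementation: the rewards are hand-set instead of learned, the four-move action set and the `END` action (which stops at the current cell and collects its goal reward) are hypothetical, and a fixed number of backups stands in for a convergence check. The names `soft_value_iteration` and `sample_plan` are illustrative only.

```python
import math
import random

# Four grid moves plus a hypothetical "end" action that stops at the current cell.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]
END = len(MOVES)


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf entries."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_value_iteration(r_path, r_goal, n_iters=50):
    """Return a stochastic policy pi[i][j][a] over the moves and END.

    r_path[i][j] is paid for being at cell (i, j) on the way;
    r_goal[i][j] is the terminal reward for ending at cell (i, j).
    """
    h, w = len(r_path), len(r_path[0])
    v = [[-math.inf] * w for _ in range(h)]
    q = [[[] for _ in range(w)] for _ in range(h)]
    for _ in range(n_iters):
        for i in range(h):
            for j in range(w):
                qs = []
                for di, dj in MOVES:
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < h and 0 <= nj < w
                    # Moves that leave the grid are given -inf (probability 0).
                    qs.append(r_path[i][j] + v[ni][nj] if inside else -math.inf)
                # Ending here collects the goal reward of this cell.
                qs.append(r_path[i][j] + r_goal[i][j])
                q[i][j] = qs
        v = [[logsumexp(q[i][j]) for j in range(w)] for i in range(h)]
    # MaxEnt policy: pi(a | s) = exp(Q(s, a) - V(s)).
    return [[[math.exp(x - v[i][j]) for x in q[i][j]] for j in range(w)]
            for i in range(h)]


def sample_plan(policy, start, rng, max_steps=30):
    """Sample a sequence of grid cells until END is drawn or max_steps is hit."""
    i, j = start
    plan = [(i, j)]
    for _ in range(max_steps):
        a = rng.choices(range(len(MOVES) + 1), weights=policy[i][j])[0]
        if a == END:
            break
        i, j = i + MOVES[a][0], j + MOVES[a][1]
        plan.append((i, j))
    return plan


# Toy 5x5 scene with hand-set rewards (a learned reward model would supply these).
size = 5
r_path = [[-2.0] * size for _ in range(size)]
r_goal = [[-10.0] * size for _ in range(size)]
r_goal[0][0] = r_goal[4][4] = 0.0  # two plausible goals

policy = soft_value_iteration(r_path, r_goal)
rng = random.Random(0)
for _ in range(3):
    print(sample_plan(policy, start=(2, 2), rng=rng))
```

Because the policy is not conditioned on a single goal, repeated sampling from the same start cell can yield plans that end at different cells, which is where the multimodality comes from in this toy setting.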

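Motivation and Objectives

Forecast pedestrian and vehicle trajectories in unknown environments from past motion and scene layout.
Infer plausible goals and paths on a coarse 2-D grid without predefined goals.
Generate continuous trajectories conditioned on sampled grid plans, using the plans as an interpretable intermediate representation.
Produce trajectories that are diverse yet consistent with the scene, and provide a compact set of predictions for downstream planning.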

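Proposed Method

Reformulate MaxEnt IRL to jointly infer per-step path rewards and terminal goal rewards.
Learn a CNN-based reward model that maps local scene patches to path and goal rewards at each grid cell.
Sample multimodal grid-based plans toward potential goals from a MaxEnt policy that is not conditioned on a goal.
Build an attention-based trajectory generator that maps a sampled plan and the motion history to a continuous future trajectory.
Train the trajectory generator as an encoder-decoder (GRU and BiGRU) with soft attention, so that it produces plan-conditioned trajectories.
Cluster the sampled trajectories into K representative futures for downstream planning.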
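The soft attention step in the trajectory generator can be sketched in a few lines. This is an illustrative sketch only: the scoring function used in the paper is not given in this review, so dot-product scoring is assumed here, and `soft_attention`, `plan_encodings`, and `decoder_state` are hypothetical names with made-up values standing in for the encoder outputs and the decoder state.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(query, keys, values):
    """Dot-product soft attention over a sequence of encoded plan states.

    query  -- current decoder state (list of floats)
    keys   -- one encoding per plan state, used for scoring
    values -- one encoding per plan state, averaged into the context
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


# Made-up encodings for a three-step plan and one decoder state.
plan_encodings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = soft_attention(decoder_state, plan_encodings, plan_encodings)
print(weights, context)
```

At each decoding step the context vector is a weighted average of the plan encodings, so the decoder can focus on the part of the plan that is relevant to the trajectory point it is about to emit.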

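Experimental Results

Research Questions

RQ1: Can grid-based MaxEnt IRL infer plausible, multimodal goals and paths in unknown scenes without pre-specified endpoints?
RQ2: Compared with prior multimodal methods, do trajectories conditioned on sampled grid plans conform better to scene structure and remain accurate over long horizons?
RQ3: Can a plan-conditioned, attention-based trajectory generator produce futures that are both diverse and precise enough for downstream planning in autonomous systems?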

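Key Findings

The model generates diverse trajectories that conform to the underlying scene structure over long prediction horizons.
On the Stanford Drone and NuScenes datasets, P2T achieves strong sample-quality metrics and competitive or state-of-the-art results on several evaluation metrics.
The method improves precision while preserving diversity, addressing the recall-precision trade-off common in multimodal forecasting.
Providing K clustered trajectories yields a compact, planner-friendly representation without retraining the model for different values of K.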
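The last finding can be illustrated by clustering one fixed set of sampled trajectories for several values of K. This is a sketch under assumptions: the review only says the samples are clustered, so plain K-means with a deterministic farthest-point initialization is assumed here, and `cluster_trajectories` and the toy samples are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two flattened trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_trajectories(trajs, k, n_iters=20):
    """K-means over trajectories given as equal-length lists of (x, y) points.

    Returns (representatives, labels): k mean trajectories and the cluster
    index of each input trajectory.
    """
    flat = [[c for point in t for c in point] for t in trajs]

    # Farthest-point initialization: deterministic and spreads the centers out.
    centers = [list(flat[0])]
    while len(centers) < k:
        far = max(flat, key=lambda f: min(sq_dist(f, c) for c in centers))
        centers.append(list(far))

    def assign():
        return [min(range(k), key=lambda c: sq_dist(f, centers[c])) for f in flat]

    for _ in range(n_iters):
        labels = assign()
        for c in range(k):
            members = [f for f, lab in zip(flat, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]

    labels = assign()
    reps = [[(c[i], c[i + 1]) for i in range(0, len(c), 2)] for c in centers]
    return reps, labels


# Toy samples: three trajectories heading right, three heading left.
samples = [[(s * t, e) for t in (1.0, 2.0, 3.0)]
           for s in (1.0, -1.0) for e in (0.0, 0.1, -0.1)]

# The same samples are reused for every K; nothing is retrained.
for k in (1, 2, 3):
    reps, labels = cluster_trajectories(samples, k)
    print(k, labels)
```

Since clustering runs on trajectories that have already been sampled, changing K only changes this post-processing step.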

This review was generated by AI and reviewed by human editors.