QUICK REVIEW

[Paper Review] Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks

Henggang Cui, Vladan Radosavljević|arXiv (Cornell University)|Sep 18, 2018

Autonomous Vehicle Technology and Safety | References: 46 | Cited by: 27

One-Sentence Summary

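This paper proposes a multimodal trajectory prediction method based on deep convolutional neural networks (CNNs). Using a rasterized bird's-eye-view representation of the traffic scene, it predicts multiple future trajectories together with an estimated probability for each. The method substantially outperforms unimodal baselines on long-horizon prediction, performs best with M=3 modes, and produces well-calibrated probability estimates.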

ABSTRACT

Autonomous driving presents one of the largest problems that the robotics and artificial intelligence communities are facing at the moment, both in terms of difficulty and potential societal impact. Self-driving vehicles (SDVs) are expected to prevent road accidents and save millions of lives while improving the livelihood and life quality of many more. However, despite large interest and a number of industry players working in the autonomous domain, there still remains more to be done in order to develop a system capable of operating at a level comparable to best human drivers. One reason for this is high uncertainty of traffic behavior and large number of situations that an SDV may encounter on the roads, making it very difficult to create a fully generalizable system. To ensure safe and efficient operations, an autonomous vehicle is required to account for this uncertainty and to anticipate a multitude of possible behaviors of traffic actors in its surrounding. We address this critical problem and present a method to predict multiple possible trajectories of actors while also estimating their probabilities. The method encodes each actor's surrounding context into a raster image, used as input by deep convolutional networks to automatically derive relevant features for the task. Following extensive offline evaluation and comparison to state-of-the-art baselines, the method was successfully tested on SDVs in closed-course tests.

Research Motivation and Objectives

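To cope with the high uncertainty of traffic behavior, predict multiple plausible future trajectories for surrounding traffic actors instead of a single mean trajectory.
Improve the safety and decision-making of self-driving vehicles by modeling the multimodal nature of human driving behavior.
Develop a deep learning framework that takes rasterized scene context (high-definition map and surrounding actors) as input and predicts trajectories end to end.
Evaluate and compare several multimodal prediction architectures, including mixture density networks (MDN), mixture of experts (ME), and a new Multimodal Trajectory Prediction (MTP) model.
Validate the method in real-world closed-course tests to show that it is practical to deploy in a self-driving system.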

Proposed Method

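The method encodes the surrounding traffic scene, including the high-definition map and the positions of other actors, into a bird's-eye-view (BEV) raster image that serves as input to a deep convolutional neural network.
It proposes a new Multimodal Trajectory Prediction (MTP) model that uses a learned mode-selection strategy to output multiple future trajectories and their predicted probabilities.
During training, the MTP model uses a distance-based criterion to match the predicted modes against the ground-truth trajectory. Two variants are considered: one based on displacement, the other on angular difference.
The model is trained by minimizing the distance between the matched predicted mode and the ground-truth trajectory; the calibration of the mode probabilities is then checked with a binning analysis.
Multiple hypotheses are evaluated by selecting the best-matching mode by displacement or by angle; angle-based matching performs better at intersections.
The model is evaluated offline on real-world driving data and validated in closed-course tests on actual self-driving vehicles.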
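
The mode matching and training objective described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `alpha` weight, the cross-entropy term on the mode probabilities, and the 5-degree tolerance used to fall back on displacement are all assumptions made for the example. Trajectories are lists of (x, y) waypoints in an actor-centered frame.

```python
import math

def avg_displacement(traj_a, traj_b):
    """Mean L2 distance between matching waypoints of two trajectories."""
    return sum(math.dist(p, q) for p, q in zip(traj_a, traj_b)) / len(traj_a)

def endpoint_angle_diff(traj_a, traj_b):
    """Absolute angle (radians) between the final waypoints, seen from the origin."""
    a = math.atan2(traj_a[-1][1], traj_a[-1][0])
    b = math.atan2(traj_b[-1][1], traj_b[-1][0])
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)  # wrap into [0, pi]

def select_mode(pred_trajs, gt_traj, use_angle=True, tol_deg=5.0):
    """Index of the predicted mode that best matches the ground truth."""
    disp = [avg_displacement(t, gt_traj) for t in pred_trajs]
    candidates = list(range(len(pred_trajs)))
    if use_angle:
        ang = [endpoint_angle_diff(t, gt_traj) for t in pred_trajs]
        # Assumed rule: modes within the tolerance (or the single closest
        # mode, if none is within it) count as equally good in angle.
        limit = max(min(ang), math.radians(tol_deg))
        candidates = [m for m in candidates if ang[m] <= limit]
    # Remaining ties are broken by displacement.
    return min(candidates, key=lambda m: disp[m])

def mtp_loss(pred_trajs, mode_logits, gt_traj, alpha=1.0, use_angle=True):
    """Classification loss on the matched mode plus its regression loss."""
    best = select_mode(pred_trajs, gt_traj, use_angle)
    # Numerically stable log-softmax over the mode logits.
    top = max(mode_logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in mode_logits))
    class_loss = -(mode_logits[best] - log_norm)
    # Mean squared displacement of the matched mode only.
    reg_loss = sum(
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(pred_trajs[best], gt_traj)
    ) / len(gt_traj)
    return class_loss + alpha * reg_loss, best
```

Only the matched mode receives a regression penalty, so the other modes are free to specialize on different maneuvers. With a slow left-turning ground truth, a slow straight mode can be closer in displacement than a fast left-turn mode, while angle-based matching still picks the left turn.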

Experimental Results

Research Questions

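RQ1: Can a deep learning model effectively predict multiple plausible future trajectories for traffic actors, rather than a single mean trajectory, to better reflect the uncertainty of real-world driving?
RQ2: How does the choice of trajectory distance measure (displacement vs. angle) affect multimodal prediction performance, particularly in complex maneuvers such as turns?
RQ3: What number of predicted modes (M) gives the best trade-off between prediction accuracy and model complexity?
RQ4: How well calibrated are the predicted mode probabilities, and what does that imply for the reliability of trajectory predictions in autonomous driving decision-making?
RQ5: How does the proposed MTP model compare with state-of-the-art baselines in short-term and long-term prediction accuracy across diverse driving scenarios?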

Key Findings

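With M=3 modes, the MTP model achieves the best overall performance across all evaluation metrics, outperforming the unimodal baseline and the other multimodal baselines.
With angle-based mode matching and M=3, the prediction error at 6 seconds falls to 2.18 m for going straight and 5.17 m for right turns.
Matching modes by angular distance rather than displacement improves performance by 0.4–0.5 m on turning maneuvers, at the cost of only a 0.03 m degradation when going straight.
The predicted probabilities are well calibrated: the mode-probability calibration plot closely follows the y=x reference line.
The method captures multimodal behavior. For example, with M=4 the "going straight" mode splits into "fast" and "slow" variants, reflecting the real-world variety of longitudinal speeds.
The model was validated in closed-course tests on real self-driving vehicles, confirming its practical feasibility and robustness in dynamic environments.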
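
A calibration check of the kind mentioned above can be sketched with a simple binning analysis. The following is an illustrative example only: the function name `calibration_bins`, the equal-width binning, and the input layout are assumptions, not details taken from the paper. Each predicted mode probability is placed in a bin, and the mean predicted probability in that bin is compared with how often those modes actually turned out to be the best match.

```python
def calibration_bins(probs, hits, n_bins=10):
    """Group predicted mode probabilities into equal-width bins.

    probs: predicted probability of each (actor, mode) pair, in [0, 1].
    hits:  1 if that mode was the one matched to the ground truth, else 0.
    Returns (mean predicted probability, empirical frequency, count)
    for every non-empty bin, in order of increasing probability.
    """
    sums = [0.0] * n_bins
    matched = [0] * n_bins
    counts = [0] * n_bins
    for p, h in zip(probs, hits):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[b] += p
        matched[b] += h
        counts[b] += 1
    return [
        (sums[b] / counts[b], matched[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b]
    ]
```

For a well-calibrated model, the first two values of each returned tuple are close to each other, which is what a calibration plot hugging the y=x line expresses.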


This review was generated by AI and reviewed by human editors.