QUICK REVIEW

[Paper Review] Learning Lane Graph Representations for Motion Forecasting

Ming Liang, Bin Yang|arXiv (Cornell University)|Jul 27, 2020

Autonomous Vehicle Technology and Safety|References: 31|Cited by: 37

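One-Sentence Summary

This paper introduces LaneGCN, a lane-graph-based model for multi-modal motion forecasting. Combined with actor-map fusion, it learns a structured map representation and models the interactions between actors and the HD map, outperforming prior methods on Argoverse.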

ABSTRACT

We propose a motion forecasting model that exploits a novel structured map representation as well as actor-map interactions. Instead of encoding vectorized maps as raster images, we construct a lane graph from raw map data to explicitly preserve the map structure. To capture the complex topology and long range dependencies of the lane graph, we propose LaneGCN which extends graph convolutions with multiple adjacency matrices and along-lane dilation. To capture the complex interactions between actors and maps, we exploit a fusion network consisting of four types of interactions, actor-to-lane, lane-to-lane, lane-to-actor and actor-to-actor. Powered by LaneGCN and actor-map interactions, our model is able to predict accurate and realistic multi-modal trajectories. Our approach significantly outperforms the state-of-the-art on the large scale Argoverse motion forecasting benchmark.

Motivation and Objectives

Motivate leveraging high-definition map topology for accurate motion forecasting
Propose a lane-graph representation learned by LaneGCN to capture complex lane topology
Model comprehensive interactions between traffic actors and lane graphs via a fusion network
Demonstrate end-to-end trainability and superior performance on Argoverse against raster-based methods

Proposed Method

Construct a lane graph from vectorized HD map data to preserve map topology without rasterization
Develop LaneConv with multi-type adjacency (predecessor, successor, left, right) and dilations to capture long-range lane dependencies
Represent actors and lanes as nodes; extract actor features with 1D CNNs (ActorNet) and lane features with LaneGCN (MapNet)
Fuse actor and lane features with FusionNet through four interaction types: actor-to-lane (A2L), lane-to-lane (L2L), lane-to-actor (L2A), and actor-to-actor (A2A), using spatial attention for A2L, L2A, and A2A and LaneGCN for L2L
Predict multi-modal future trajectories via a two-branch prediction header (regression for trajectories and classification for mode confidences)
Train end-to-end with a combined classification and regression loss, including a max-margin term for modality ranking
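
The LaneConv idea in the list above (one weight matrix per connection type, plus dilated steps along the lane) can be sketched in plain Python. This is a minimal illustration under assumptions, not the authors' implementation: the helper names (`matmul`, `matadd`, `matpow`, `lane_conv`), the all-ones toy weights, and the four-node lane are hypothetical, and the sketch omits the normalization, nonlinearity, residual connections, and learned weights a trained model would use. Dilation is modeled here by raising the predecessor and successor adjacency matrices to the k-th power, so a node reads features from k steps along the lane in a single layer.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matadd(a, b):
    """Add two matrices of the same shape element-wise."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matpow(a, k):
    """Raise a square matrix to the k-th power (k >= 0)."""
    n = len(a)
    out = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        out = matmul(out, a)
    return out

def lane_conv(x, adj, weights, dilations=(1, 2)):
    """Sketch of a dilated, multi-adjacency graph convolution.

    x:       N x F node features (one row per lane node)
    adj:     dict of N x N matrices keyed 'pre', 'suc', 'left', 'right'
    weights: dict keyed 'self', 'left', 'right', ('pre', k), ('suc', k)
    """
    y = matmul(x, weights["self"])  # each node's own features
    for side in ("left", "right"):  # lateral neighbors, no dilation
        y = matadd(y, matmul(matmul(adj[side], x), weights[side]))
    for k in dilations:  # k-step neighbors along the lane
        for direction in ("pre", "suc"):
            a_k = matpow(adj[direction], k)
            y = matadd(y, matmul(matmul(a_k, x), weights[(direction, k)]))
    return y

# Toy lane: four nodes in a chain 0 -> 1 -> 2 -> 3, no left/right neighbors.
n = 4
zeros = [[0.0] * n for _ in range(n)]
suc = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
pre = [[1.0 if j == i - 1 else 0.0 for j in range(n)] for i in range(n)]
adj = {"pre": pre, "suc": suc, "left": zeros, "right": zeros}

x = [[1.0], [2.0], [3.0], [4.0]]  # one scalar feature per node
ones = [[1.0]]
weights = {"self": ones, "left": ones, "right": ones,
           ("pre", 1): ones, ("suc", 1): ones,
           ("pre", 2): ones, ("suc", 2): ones}

y = lane_conv(x, adj, weights, dilations=(1, 2))
print(y)
```

With dilations of 1 and 2, node 0 aggregates its own feature plus those of nodes 1 and 2 in one layer; without the dilation of 2 it would need a second layer to reach node 2.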

Experimental Results

Research Questions

RQ1: Does lane-graph based representation capture map topology more effectively than rasterized maps for motion forecasting?
RQ2: Can LaneConv and LaneGCN effectively model long-range dependencies in lane topology?
RQ3: Do actor-map interactions (A2L, L2L, L2A, A2A) improve forecasting accuracy over actor-only or map-only baselines?
RQ4: What is the impact of ablations on map/actor fusion and lane graph operators on predictive performance?

Key Findings

Model	minADE (K=1)	minFDE (K=1)	MR (K=1)	minADE (K=6)	minFDE (K=6)	MR (K=6)
Argoverse Baseline	2.96	6.81	0.81	2.34	5.44	0.69
Argoverse Baseline (NN)	3.45	7.88	0.87	1.71	3.29	0.54
Holmes (7th)	2.91	6.54	0.82	1.38	2.66	0.42
cxx (3rd)	1.91	4.31	0.66	0.99	1.71	0.19
uulm-mrm (2nd)	1.90	4.19	0.63	0.94	1.55	0.22
Jean (1st)	1.86	4.18	0.63	0.93	1.49	0.19
Our Model	1.71	3.78	0.59	0.87	1.36	0.16
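
For readers unfamiliar with the column headers, the sketch below shows one common way to compute these metrics for a single scenario with K candidate trajectories. It rests on assumptions not stated in this review: the best candidate is taken to be the one with the smallest endpoint error, a scenario counts as a miss when that endpoint error exceeds 2.0 (assumed to be meters), and the function names and toy trajectories are hypothetical.

```python
import math

def displacement(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def scenario_metrics(predictions, ground_truth, miss_threshold=2.0):
    """Metrics for one scenario.

    predictions:  list of K trajectories, each a list of (x, y) points
    ground_truth: list of (x, y) points with the same number of steps
    """
    # Endpoint error of each candidate trajectory.
    endpoint_errors = [displacement(t[-1], ground_truth[-1]) for t in predictions]
    # Assumption: the "best" candidate is the one with the smallest endpoint error.
    best = min(range(len(predictions)), key=lambda k: endpoint_errors[k])
    ade = sum(displacement(p, q)
              for p, q in zip(predictions[best], ground_truth)) / len(ground_truth)
    return {
        "minADE": ade,
        "minFDE": endpoint_errors[best],
        "missed": endpoint_errors[best] > miss_threshold,
    }

def miss_rate(per_scenario):
    """Fraction of scenarios in which even the best candidate missed."""
    return sum(m["missed"] for m in per_scenario) / len(per_scenario)

# Toy scenario with K=2 candidates over three future steps.
ground_truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
predictions = [
    [(1.0, 0.0), (2.0, 1.0), (3.0, 3.0)],  # ends 3.0 away from the truth
    [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0)],  # ends 1.0 away from the truth
]
metrics = scenario_metrics(predictions, ground_truth)
print(metrics, miss_rate([metrics]))
```

The K=1 columns correspond to passing a single candidate; the benchmark numbers in the table are averages of these per-scenario values over the whole test set.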

Best results among all listed Argoverse entries on minADE, minFDE, and MR at both K=1 and K=6; relative to the previous top entry (Jean), minFDE drops from 4.18 to 3.78 at K=1 and from 1.49 to 1.36 at K=6
LaneGCN with multi-type and dilated LaneConv better captures lane topology than vanilla GCNs
Incorporating A2L, L2L, L2A, and A2A interactions materially improves performance, with map-informed flows enhancing actor interactions
Ablation studies show that each component (LaneConv, residual blocks, dilation, and fusion blocks) contributes to performance gains
Qualitative results illustrate improved handling of hard cases such as missing history, left/right turns, and abrupt maneuvers

This review was generated by AI and reviewed by a human editor.