QUICK REVIEW

[Paper Review] Graph-based Topology Reasoning for Driving Scenes

Tianyu Li, Li Chen|arXiv (Cornell University)|Apr 11, 2023

Advanced Image and Video Retrieval Techniques | Cited by 11

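One-Sentence Summary

TopoNet is an end-to-end framework that unifies perception and topology reasoning for driving scenes. It models lane connectivity and lane-to-traffic-element relationships on a scene graph, using a scene graph neural network and a scene knowledge graph.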

ABSTRACT

Understanding the road genome is essential to realize autonomous driving. This highly intelligent problem contains two aspects - the connection relationship of lanes, and the assignment relationship between lanes and traffic elements, where a comprehensive topology reasoning method is vacant. On one hand, previous map learning techniques struggle in deriving lane connectivity with segmentation or laneline paradigms; or prior lane topology-oriented approaches focus on centerline detection and neglect the interaction modeling. On the other hand, the traffic element to lane assignment problem is limited in the image domain, leaving how to construct the correspondence from two views an unexplored challenge. To address these issues, we present TopoNet, the first end-to-end framework capable of abstracting traffic knowledge beyond conventional perception tasks. To capture the driving scene topology, we introduce three key designs: (1) an embedding module to incorporate semantic knowledge from 2D elements into a unified feature space; (2) a curated scene graph neural network to model relationships and enable feature interaction inside the network; (3) instead of transmitting messages arbitrarily, a scene knowledge graph is devised to differentiate prior knowledge from various types of the road genome. We evaluate TopoNet on the challenging scene understanding benchmark, OpenLane-V2, where our approach outperforms all previous works by a great margin on all perceptual and topological metrics. The code is released at https://github.com/OpenDriveLab/TopoNet

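Motivation and Objectives

Understand driving-scene topology beyond perception by jointly learning lane connectivity and the assignment of traffic elements (TE) to lane centerlines (LC).
Integrate the semantic knowledge of traffic elements with centerlines in a unified feature space.
Enable explicit message passing on the scene graph to refine the representations of centerlines and traffic elements.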

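Proposed Method

A two-branch architecture with a shared feature extractor processes traffic elements (TE) and lane centerlines (LC).
Both branches use Transformer-based decoding: TE and LC instances are decoded from instance queries with deformable attention.
A scene graph neural network (SGNN) propagates information between LC and TE by applying GCNs over two directed graphs, G_ll (lane to lane) and G_lt (lane to traffic element).
An embedding network maps TE semantics into the unified feature space so that they can interact with LC queries.
A scene knowledge graph injects type-specific prior topology knowledge, using separate learnable weights for each TE class and for lane predecessor/successor relations.
The loss combines detection losses for TE and LC with topology losses for the LC-LC and LC-TE relationships; Hungarian matching is used for supervision.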
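
To make the interplay between the SGNN and the scene knowledge graph more concrete, the following is a minimal pure-Python sketch of one typed message-passing step that updates LC features. It is not the authors' implementation. The function name `sgnn_layer`, the helper names, the mean aggregation, the ReLU, and the residual connection are all assumptions made for illustration. The only ideas taken from the summary above are the two directed graphs (G_ll and G_lt) and the use of a separate weight matrix per relation type (self, successor, predecessor) and per TE class.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def identity(d):
    """d x d identity matrix, used here as a stand-in for learned weights."""
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]


def add_scaled(acc, vec, s):
    """Return acc + s * vec, element-wise."""
    return [a + s * v for a, v in zip(acc, vec)]


def sgnn_layer(lane_feats, te_feats, te_classes, A_ll, A_lt, W_ll, W_lt):
    """One illustrative message-passing step for lane (LC) features.

    A_ll[i][j] > 0: directed edge lane i -> lane j (j is a successor of i).
    A_lt[i][k] > 0: lane i is associated with traffic element k.
    W_ll: one matrix per relation type ("self", "successor", "predecessor").
    W_lt: one matrix per traffic-element class (hypothetical class names).
    """
    updated = []
    for i, lane in enumerate(lane_feats):
        msg = matvec(W_ll["self"], lane)
        count = 1  # number of aggregated messages, including the self term
        for j, other in enumerate(lane_feats):
            if j == i:
                continue
            if A_ll[i][j] > 0:  # lane j follows lane i
                msg = add_scaled(msg, matvec(W_ll["successor"], other), A_ll[i][j])
                count += 1
            if A_ll[j][i] > 0:  # lane j precedes lane i
                msg = add_scaled(msg, matvec(W_ll["predecessor"], other), A_ll[j][i])
                count += 1
        for k, te in enumerate(te_feats):
            if A_lt[i][k] > 0:  # the weight is chosen by the TE class
                msg = add_scaled(msg, matvec(W_lt[te_classes[k]], te), A_lt[i][k])
                count += 1
        # Assumed design: mean aggregation, ReLU, then a residual connection.
        activated = [max(0.0, m / count) for m in msg]
        updated.append([x + a for x, a in zip(lane, activated)])
    return updated


# Toy scene: lane 0 -> lane 1, and lane 1 is governed by one traffic light.
lanes = [[1.0, 0.0], [0.0, 1.0]]
tes = [[2.0, 2.0]]
te_classes = ["traffic_light"]
A_ll = [[0.0, 1.0], [0.0, 0.0]]
A_lt = [[0.0], [1.0]]
W_ll = {r: identity(2) for r in ("self", "successor", "predecessor")}
W_lt = {"traffic_light": identity(2)}

print(sgnn_layer(lanes, tes, te_classes, A_ll, A_lt, W_ll, W_lt))
# [[1.5, 0.5], [1.0, 2.0]]
```

With identity weights the toy output only shows which neighbours contribute to each lane. In a trained model the per-type matrices would differ, which is what allows a message from a predecessor lane to be treated differently from a message from a successor lane or from a traffic element of a given class.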

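Experimental Results

Research Questions

RQ1: How can driving-scene topology be inferred accurately from multi-view images, going beyond conventional perception tasks?
RQ2: Can a graph-based network with explicit topology priors reason jointly about lane connectivity and lane-to-traffic-element assignment?
RQ3: Does introducing an explicit scene knowledge graph improve topology reasoning and perception accuracy in driving scenes?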

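Key Findings

On OpenLane-V2, TopoNet outperforms existing methods on both perception and topology metrics, with notable gains in directed centerline perception and topology reasoning.
On this challenging topology reasoning benchmark, centerline perception improves by 15%-84% over existing methods.
BEV segmentation and centerline-related metrics also improve once the SGNN and the scene knowledge graph are incorporated.
Ablation studies confirm that the SGNN design and the knowledge graph strengthen feature interaction and topology prediction.
The method achieves strong performance with a ResNet-50 backbone, deformable attention, and a BEV transformation.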

This review was generated by AI and reviewed by a human editor.