QUICK REVIEW

[Paper Review] Graph-based Topology Reasoning for Driving Scenes

Tianyu Li, Li Chen|arXiv (Cornell University)|Apr 11, 2023

Advanced Image and Video Retrieval Techniques | Cited by 11

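One-line summary

TopoNet is an end-to-end framework that unifies perception and topology reasoning by modeling lane connectivity and lane-to-traffic-element relationships on a scene graph; it handles driving scenes using a scene graph neural network and a scene knowledge graph.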

ABSTRACT

Understanding the road genome is essential to realize autonomous driving. This highly intelligent problem contains two aspects - the connection relationship of lanes, and the assignment relationship between lanes and traffic elements, where a comprehensive topology reasoning method is vacant. On one hand, previous map learning techniques struggle in deriving lane connectivity with segmentation or laneline paradigms; or prior lane topology-oriented approaches focus on centerline detection and neglect the interaction modeling. On the other hand, the traffic element to lane assignment problem is limited in the image domain, leaving how to construct the correspondence from two views an unexplored challenge. To address these issues, we present TopoNet, the first end-to-end framework capable of abstracting traffic knowledge beyond conventional perception tasks. To capture the driving scene topology, we introduce three key designs: (1) an embedding module to incorporate semantic knowledge from 2D elements into a unified feature space; (2) a curated scene graph neural network to model relationships and enable feature interaction inside the network; (3) instead of transmitting messages arbitrarily, a scene knowledge graph is devised to differentiate prior knowledge from various types of the road genome. We evaluate TopoNet on the challenging scene understanding benchmark, OpenLane-V2, where our approach outperforms all previous works by a great margin on all perceptual and topological metrics. The code is released at https://github.com/OpenDriveLab/TopoNet

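Motivation and objectives

Understand driving-scene topology beyond perception by jointly learning lane connectivity and the assignment of traffic elements (TE) to lane centerlines (LC).
Integrate the semantic knowledge of traffic elements into a feature space shared with lane centerlines.
Enable explicit message passing on the scene graph to refine the lane and traffic-element representations.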

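Proposed method

A two-branch architecture that processes traffic elements (TE) and lane centerlines (LC) on top of a shared feature extractor.
Instance queries and decoders with deformable attention for both TE and LC; shared Transformer-based decoding.
A scene graph neural network (SGNN) that propagates messages between LC and TE via GCNs over two directed graphs (G_ll and G_lt).
An embedding network that maps TE semantics into a unified feature space where they can interact with LC queries.
A scene knowledge graph that injects type-specific prior topology knowledge, using learnable weights across TE classes and across lane predecessor/successor relations.
A loss that combines detection losses for TE and LC with topology losses for the LC-LC and LC-TE relationships; Hungarian matching assigns predictions to ground truth for supervision.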
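
To make the SGNN and scene knowledge graph items above more concrete, here is a minimal NumPy sketch of one message-passing step that refines lane-centerline features. It is not the authors' implementation: the function names (`row_normalize`, `sgnn_layer`), the weight dictionary keys, the mean aggregation, and the ReLU are all assumptions chosen for illustration. What it does take from the description above is the use of two directed graphs (G_ll for lane-to-lane, G_lt for lane-to-TE) and separate learnable weights for predecessor lanes, successor lanes, and each TE class.

```python
import numpy as np


def row_normalize(adj):
    """Scale each row to sum to 1; all-zero rows (no neighbors) stay zero."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    return adj / np.where(deg > 0, deg, 1.0)


def sgnn_layer(lane_feats, te_feats, adj_ll, adj_lt, te_classes, weights):
    """One simplified message-passing step over G_ll and G_lt (hypothetical).

    lane_feats: (L, D) lane-centerline query features
    te_feats:   (T, D) traffic-element features, assumed to be already
                embedded into the same D-dimensional space as the lanes
    adj_ll:     (L, L) directed graph, adj_ll[i, j] = 1 if lane i leads into lane j
    adj_lt:     (L, T) adj_lt[i, k] = 1 if traffic element k is assigned to lane i
    te_classes: (T,) integer class id of each traffic element
    weights:    dict with "self", "pred", "succ" of shape (D, D)
                and "te" of shape (C, D, D), one matrix per TE class
    """
    lane_feats = np.asarray(lane_feats, dtype=float)
    te_feats = np.asarray(te_feats, dtype=float)
    adj_ll = np.asarray(adj_ll, dtype=float)
    te_classes = np.asarray(te_classes, dtype=int)

    # Messages from successor lanes: lane i receives from every j with i -> j.
    msg_succ = row_normalize(adj_ll) @ lane_feats @ weights["succ"]
    # Messages from predecessor lanes: transpose the graph to follow j -> i.
    msg_pred = row_normalize(adj_ll.T) @ lane_feats @ weights["pred"]
    # Each traffic element is transformed by the matrix of its own class,
    # which is where type-specific prior knowledge enters in this sketch.
    te_transformed = np.einsum("td,tde->te", te_feats, weights["te"][te_classes])
    msg_te = row_normalize(adj_lt) @ te_transformed

    updated = lane_feats @ weights["self"] + msg_succ + msg_pred + msg_te
    return np.maximum(updated, 0.0)  # ReLU


# Toy usage: 3 lanes in a chain (0 -> 1 -> 2), 2 traffic elements, 3 TE classes.
rng = np.random.default_rng(0)
D, C = 4, 3
toy_weights = {
    "self": rng.normal(size=(D, D)),
    "pred": rng.normal(size=(D, D)),
    "succ": rng.normal(size=(D, D)),
    "te": rng.normal(size=(C, D, D)),
}
refined = sgnn_layer(
    lane_feats=rng.normal(size=(3, D)),
    te_feats=rng.normal(size=(2, D)),
    adj_ll=[[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    adj_lt=[[1, 0], [0, 1], [0, 0]],
    te_classes=[0, 2],
    weights=toy_weights,
)
print(refined.shape)  # (3, 4)
```

In a trained model the adjacency matrices would come from predicted (soft) relationships rather than fixed 0/1 values, and the weights would be learned jointly with the detection and topology losses; both are left out here to keep the sketch short.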

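Experimental results

Research questions

RQ1: How can driving-scene topology be estimated accurately from multi-view images, beyond conventional perception tasks?
RQ2: Can a graph-based network jointly reason about lane connectivity and lane-to-traffic-element assignment using explicit topology priors?
RQ3: Does incorporating an explicit scene knowledge graph improve topology reasoning and perception accuracy in driving scenes?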

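Key findings

TopoNet outperforms prior methods on OpenLane-V2 on both perception and topology metrics, with especially notable gains in directed centerline perception and topology reasoning.
On the challenging topology reasoning benchmark, centerline perception improves by 15-84% over prior methods.
BEV segmentation and centerline-related metrics also improve when the SGNN and the scene knowledge graph are incorporated.
Ablation studies support the effectiveness of the SGNN design and the knowledge graph in strengthening feature interaction and topology prediction.
The method achieves strong performance with a ResNet-50 backbone, deformable attention, and a BEV transformation.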


This review was generated by AI and checked by a human editor.