QUICK REVIEW

[Paper Review] PointRGCN: Graph Convolution Networks for 3D Vehicles Detection Refinement

Jesús Zarzar, Silvio Giancola|arXiv (Cornell University)|Nov 27, 2019

Advanced Neural Network Applications|References: 38|Citations: 53

One-Line Summary

PointRGCN introduces two GCN-based refinement modules (R-GCN and C-GCN) to improve 3D vehicle detection from LiDAR, achieving competitive results on KITTI and a notable BEV AP gain on the easy difficulty.

ABSTRACT

In autonomous driving pipelines, perception modules provide a visual understanding of the surrounding road scene. Among the perception tasks, vehicle detection is of paramount importance for a safe driving as it identifies the position of other agents sharing the road. In our work, we propose PointRGCN: a graph-based 3D object detection pipeline based on graph convolutional networks (GCNs) which operates exclusively on 3D LiDAR point clouds. To perform more accurate 3D object detection, we leverage a graph representation that performs proposal feature and context aggregation. We integrate residual GCNs in a two-stage 3D object detection pipeline, where 3D object proposals are refined using a novel graph representation. In particular, R-GCN is a residual GCN that classifies and regresses 3D proposals, and C-GCN is a contextual GCN that further refines proposals by sharing contextual information between multiple proposals. We integrate our refinement modules into a novel 3D detection pipeline, PointRGCN, and achieve state-of-the-art performance on the easy difficulty for the bird eye view detection task.

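Motivation and Objectives

Motivates refining 3D vehicle detections using a graph representation of LiDAR point clouds.
Introduces a per-proposal graph module (R-GCN) and a contextual graph module (C-GCN), which shares information between the proposals within a frame, to refine detection proposals.
Integrates R-GCN and C-GCN into a two-stage detection pipeline built on PointRCNN proposals.
Demonstrates accuracy improvements over baselines on KITTI 3D vehicle detection.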

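Proposed Method

Adopts a two-stage detection pipeline in which proposals from PointRCNN are refined by graph-based modules.
Introduces R-GCN, which extracts per-proposal features by processing the points inside each proposal in its canonical frame.
Introduces C-GCN, which aggregates contextual information across the proposals in the same frame through EdgeConv layers.
Uses residual connections, dilation, and dynamic graph updates to enlarge the receptive field.
Combines R-GCN and C-GCN features for the final detection prediction: classification plus two forms of regression (bin-based and residual).
Trains with a multi-task loss covering proposal classification and proposal regression (including bin and residual targets).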
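The graph operations listed above (EdgeConv aggregation, residual connections, dilation, and dynamic graph updates) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `dilated_knn`, `edge_conv`, and `residual_gcn` are hypothetical, neighbours are chosen by distance in feature space, and the weights are random rather than learned.

```python
import numpy as np


def dilated_knn(feats, k, dilation=1):
    """Return (N, k) neighbour indices: every `dilation`-th of the k*dilation nearest nodes."""
    diff = feats[:, None, :] - feats[None, :, :]
    dist = (diff ** 2).sum(axis=-1)          # pairwise squared distances, (N, N)
    np.fill_diagonal(dist, np.inf)           # a node is never its own neighbour
    order = np.argsort(dist, axis=1)         # nearest first
    return order[:, : k * dilation : dilation]


def edge_conv(feats, nbrs, weight):
    """EdgeConv-style update: max over neighbours j of ReLU(W [x_i, x_j - x_i])."""
    center = np.repeat(feats[:, None, :], nbrs.shape[1], axis=1)  # (N, k, C)
    neigh = feats[nbrs]                                            # (N, k, C)
    edge = np.concatenate([center, neigh - center], axis=-1)       # (N, k, 2C)
    msg = np.maximum(edge @ weight, 0.0)                           # (N, k, C)
    return msg.max(axis=1)                                         # (N, C)


def residual_gcn(feats, weights, k=4, dilations=None):
    """Stack of graph layers with residual connections.

    Assumes each weight has shape (2C, C) so the residual sum is well defined,
    and that N > k * max(dilations).
    """
    dilations = dilations or [1] * len(weights)
    h = feats
    for w, d in zip(weights, dilations):
        nbrs = dilated_knn(h, k, d)      # dynamic graph: rebuilt from current features
        h = h + edge_conv(h, nbrs, w)    # residual connection
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(32, 8))                      # 32 nodes, 8 features each
    layers = [rng.normal(scale=0.1, size=(16, 8)) for _ in range(3)]
    out = residual_gcn(points, layers, k=4, dilations=[1, 2, 3])
    print(out.shape)  # (32, 8)
```

Increasing the dilation per layer lets each node reach farther neighbours without raising `k`, which is one way to read the "enlarged receptive field" item above. In this sketch the nodes could stand for the points inside one proposal (as in R-GCN) or for the proposals of one frame (as in C-GCN); that mapping is an interpretation of the summary, not a detail it states.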

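Experimental Results

Research Questions

RQ1: Can graph-based refinement modules improve the classification and regression of 3D vehicle proposals beyond PointRCNN?
RQ2: Does combining per-proposal feature aggregation (R-GCN) with inter-proposal context (C-GCN) improve BEV localization and 3D box accuracy?
RQ3: How do the choice of GCN (MRGCN vs EdgeConv), depth, residual connections, and dilation affect performance on KITTI?
RQ4: How does the proposed pipeline compare with state-of-the-art LiDAR-only detectors on the KITTI Easy/Moderate/Hard subsets?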

Key Findings

Method	Modality	3D Easy	3D Moderate	3D Hard	BEV Easy	BEV Moderate	BEV Hard	Time (ms)
MV3D [5]	L+I	66.77	52.73	51.31	85.82	77.00	68.94	240
AVOD [10]	L+I	73.59	65.78	58.38	86.80	85.44	77.73	100
AVOD-FPN [10]	L+I	81.94	71.88	66.38	88.53	83.79	77.90	100
F-PointNet [21]	L+I	81.20	70.39	62.19	88.70	84.00	75.33	170
UberATG-MMF [16]	L+I	86.81	76.75	68.41	89.49	87.47	79.10	80
VoxelNet [42]	L	77.49	65.11	57.73	89.35	79.26	77.39	220
PIXOR [39]	L	-	-	-	84.44	80.04	74.31	100
SECOND [38]	L	83.13	73.66	66.20	88.07	79.37	77.95	50
PointPillars [11]	L	79.05	74.99	68.30	88.35	86.10	79.83	16
PointRCNN [26]	L	85.94	75.76	68.32	89.47	85.68	79.10	100
Fast Point R-CNN [6]	L	84.28	75.73	67.39	88.03	86.10	78.17	65
STD [40]	L	86.61	77.63	76.06	89.66	87.76	86.89	80
R-GCN only (ours)	L	83.42	75.26	68.73	91.91	86.05	81.05	239
PointRGCN (ours)	L	85.97	75.73	70.60	91.63	87.49	80.73	262

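PointRGCN achieves competitive results on KITTI 3D vehicle detection, ranking roughly second among the LiDAR-only methods in the reported table.
The full PointRGCN pipeline improves BEV AP on the KITTI Easy subset by about 2% over several baselines (91.63 versus 89.47 for PointRCNN and 89.66 for STD).
R-GCN alone outperforms PointRCNN on the Hard setting (3D: 68.73 vs 68.32; BEV: 81.05 vs 79.10), and combining R-GCN with C-GCN brings gains in the Easy and Moderate categories (3D: 85.97 and 75.73 vs 83.42 and 75.26 for R-GCN alone).
R-GCN and C-GCN provide complementary gains: R-GCN focuses on local per-proposal features, while C-GCN captures context between proposals.
Ablations show that residual connections and dilation strongly affect performance, and that the choice between MRGCN and EdgeConv involves a trade-off in speed and memory.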
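The "about 2%" BEV Easy gain can be checked directly against the table. The short sketch below copies the BEV Easy column for four LiDAR-only rows and computes the difference to PointRGCN; the variable names are illustrative, and no numbers beyond those in the table are used.

```python
# BEV Easy AP values copied from the table above (LiDAR-only rows).
bev_easy = {
    "PointRCNN": 89.47,
    "STD": 89.66,
    "R-GCN only": 91.91,
    "PointRGCN": 91.63,
}

# Gain of the full PointRGCN pipeline over each other method, in AP points.
gains = {
    name: round(bev_easy["PointRGCN"] - ap, 2)
    for name, ap in bev_easy.items()
    if name != "PointRGCN"
}

for name, gain in gains.items():
    print(f"PointRGCN vs {name}: {gain:+.2f}")
```

The differences come out to +2.16 over PointRCNN and +1.97 over STD, consistent with the roughly 2% figure, while R-GCN alone is 0.28 points higher than the full pipeline on this particular column.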


This review was generated by AI and checked by a human editor.