[Paper Review] Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
The paper improves stereo-based depth estimation and combines it with sparse LiDAR through graph-based depth correction, achieving state-of-the-art pseudo-LiDAR performance on KITTI, especially for faraway objects. It reports substantial gains over the prior pseudo-LiDAR approach and, with sparse LiDAR added, results competitive with some LiDAR-based detectors.
Detecting objects such as cars and pedestrians in 3D plays an indispensable role in autonomous driving. Existing approaches largely rely on expensive LiDAR sensors for accurate depth information. While recently pseudo-LiDAR has been introduced as a promising alternative, at a much lower cost based solely on stereo images, there is still a notable performance gap. In this paper we provide substantial advances to the pseudo-LiDAR framework through improvements in stereo depth estimation. Concretely, we adapt the stereo network architecture and loss function to be more aligned with accurate depth estimation of faraway objects --- currently the primary weakness of pseudo-LiDAR. Further, we explore the idea to leverage cheaper but extremely sparse LiDAR sensors, which alone provide insufficient information for 3D detection, to de-bias our depth estimation. We propose a depth-propagation algorithm, guided by the initial depth estimates, to diffuse these few exact measurements across the entire depth map. We show on the KITTI object detection benchmark that our combined approach yields substantial improvements in depth estimation and stereo-based 3D object detection --- outperforming the previous state-of-the-art detection accuracy for faraway objects by 40%. Our code is available at https://github.com/mileyan/Pseudo_Lidar_V2.
Research Motivation and Objectives
- Enable accurate 3D object detection without expensive LiDAR by improving depth estimation quality in stereo-based methods.
- Adapt stereo networks to optimize direct depth estimation rather than disparity.
- Leverage extremely sparse LiDAR (e.g., 4 beams) to de-bias and refine depth estimates via graph-based propagation.
- Demonstrate that combining SDN and GDC yields substantial improvements on KITTI, narrowing the gap to LiDAR-based methods.
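To see why optimizing depth directly matters for distant objects, the following sketch applies the pinhole stereo relation Z = f * b / D and measures how a fixed disparity error turns into depth error at different ranges. The focal length and baseline are illustrative, roughly KITTI-like values assumed for this example, not numbers taken from this review, and the function names are hypothetical.

```python
# Illustrative stereo constants (assumed, roughly KITTI-like; not from the review).
FOCAL_PX = 721.5     # horizontal focal length in pixels
BASELINE_M = 0.54    # distance between the two cameras in metres


def disparity_to_depth(disparity_px: float) -> float:
    """Pinhole stereo relation: Z = f * b / D."""
    return FOCAL_PX * BASELINE_M / disparity_px


def depth_error(depth_m: float, disparity_error_px: float) -> float:
    """Absolute depth error caused by over-estimating disparity by the given amount."""
    true_disparity = FOCAL_PX * BASELINE_M / depth_m
    return abs(disparity_to_depth(true_disparity + disparity_error_px) - depth_m)


if __name__ == "__main__":
    for z in (10.0, 40.0, 70.0):
        err = depth_error(z, 0.5)
        print(f"depth {z:5.1f} m -> {err:.2f} m error for a 0.5 px disparity error")
```

Because depth is inversely proportional to disparity, the same half-pixel error costs far more depth accuracy at 70 m than at 10 m. A loss defined on disparity treats both cases alike, which is the far-range weakness a depth loss is meant to address.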
Proposed Method
- Introduce SDN, a stereo depth network that constructs a depth-focused cost volume and optimizes a depth loss instead of disparity loss.
- Replace the traditional disparity cost volume with a depth cost volume so convolutions act on depth space rather than disparity.
- Build the depth cost volume by resampling the disparity cost volume through the depth-to-disparity transform, using bilinear interpolation, so that it fits into the existing stereo pipeline.
- Propose a graph-based depth correction (GDC) that uses sparse 4-beam LiDAR measurements as landmarks to guide diffusion of depths across a KNN graph.
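- Formulate two optimizations: (i) solve for weights W that reconstruct each point's stereo-estimated depth Z from the depths of its graph neighbors, and (ii) solve for corrected depths Z' that match the LiDAR measurements at the landmark points while preserving the local structure encoded by W.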
- Combine SDN with GDC to produce pseudo-LiDAR++ (PL++), and evaluate with stereo-only, sparse-LiDAR, and LiDAR-assisted settings on KITTI.
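The two GDC optimizations listed above can be sketched as follows. This is a minimal dense NumPy illustration under assumptions, not the authors' implementation: the function names, the regularizer `reg`, the brute-force neighbor search, and the least-squares solver are all choices made for this example.

```python
import numpy as np


def knn_indices(points, k):
    """Brute-force k nearest neighbours of every 3D point (the point itself excluded)."""
    points = np.asarray(points, dtype=float)
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dist, np.inf)
    return np.argsort(sq_dist, axis=1)[:, :k]


def reconstruction_weights(depth, neighbours, reg=1e-3):
    """Step (i): per-point weights, summing to 1, that rebuild its depth from neighbours."""
    n, k = neighbours.shape
    weights = np.zeros((n, n))
    for i in range(n):
        diff = depth[neighbours[i]] - depth[i]
        # Regularised local Gram matrix (assumed form) keeps the system solvable.
        gram = np.outer(diff, diff) + reg * np.eye(k)
        w = np.linalg.solve(gram, np.ones(k))
        weights[i, neighbours[i]] = w / w.sum()
    return weights


def correct_depth(points, stereo_depth, landmark_idx, landmark_depth, k=2):
    """Step (ii): fix landmark depths, then solve for the rest so W still reconstructs them."""
    stereo_depth = np.asarray(stereo_depth, dtype=float)
    n = len(stereo_depth)
    weights = reconstruction_weights(stereo_depth, knn_indices(points, k))
    m = np.eye(n) - weights                      # residual operator (I - W)
    is_landmark = np.zeros(n, dtype=bool)
    is_landmark[landmark_idx] = True
    corrected = np.zeros(n)
    corrected[landmark_idx] = landmark_depth     # hard constraint at landmarks
    rhs = -m[:, is_landmark] @ corrected[is_landmark]
    free_depth, *_ = np.linalg.lstsq(m[:, ~is_landmark], rhs, rcond=None)
    corrected[~is_landmark] = free_depth
    return corrected


if __name__ == "__main__":
    true_depth = 10.0 + np.arange(21.0)          # toy ramp of depths
    stereo_depth = true_depth + 2.0              # stereo estimate with a constant bias
    points = np.stack([np.arange(21.0), np.zeros(21), stereo_depth], axis=1)
    landmarks = np.arange(0, 21, 5)              # a few "LiDAR" points with exact depth
    fixed = correct_depth(points, stereo_depth, landmarks, true_depth[landmarks])
    print("mean abs error before:", np.abs(stereo_depth - true_depth).mean())
    print("mean abs error after: ", np.abs(fixed - true_depth).mean())
```

Since each row of W sums to one, a constant offset in the stereo depths leaves the reconstruction residual unchanged, so a handful of exact landmarks is enough to pull the whole toy ramp back toward the true depths. A practical version would presumably use sparse matrices and a faster neighbor search rather than the dense arrays used here.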
Experimental Results
Research Questions
- RQ1: Can direct depth optimization (instead of disparity) improve far-field depth accuracy for stereo-based 3D detection?
- RQ2: Does a depth-focused cost volume outperform traditional disparity-based volumes for 3D object detection on KITTI?
- RQ3: Can extremely sparse LiDAR be used to correct and propagate depths to produce accurate dense depth maps for 3D detection?
- RQ4: How much does graph-based depth propagation (GDC) improve detection accuracy when fused with stereo-derived pseudo-LiDAR?
- RQ5: How close can stereo-plus-sparse-LiDAR approaches get to full LiDAR performance on KITTI?
Key Findings
- SDN significantly reduces depth estimation errors in the far range compared to disparity-based methods.
- Replacing disparity cost with a depth cost volume yields measurable gains in BEV and 3D AP on KITTI.
- GDC effectively propagates a few exact LiDAR depths to dense depth maps, improving object localization especially for faraway objects.
- PL++ (SDN + GDC) consistently outperforms the original pseudo-LiDAR (PL) across settings, and with 4-beam LiDAR can rival some LiDAR-based detectors.
- On the KITTI validation set, PL++ substantially improves BEV and 3D AP (for example, with the P-RCNN detector) and narrows the gap to 64-beam LiDAR baselines.
- Qualitative results illustrate better alignment of pseudo-LiDAR++ with ground-truth boxes for distant objects compared to prior pseudo-LiDAR.
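The depth cost volume behind the SDN findings above can be pictured as a resampling of a disparity cost volume onto a uniform depth grid. The sketch below does this with linear interpolation along the disparity axis only; the exact interpolation scheme, tensor layout, and grid used in the paper may differ, and the function name, argument names, and camera constants are assumptions made for illustration.

```python
import numpy as np


def disparity_to_depth_volume(cost_disp, depth_grid, focal_px, baseline_m):
    """Resample a (D, H, W) cost volume indexed by integer disparity onto depth_grid.

    Returns an array of shape (len(depth_grid), H, W).
    """
    num_disp = cost_disp.shape[0]
    # Disparity that corresponds to each requested depth: D = f * b / Z.
    disp = focal_px * baseline_m / np.asarray(depth_grid, dtype=float)
    disp = np.clip(disp, 0.0, num_disp - 1.0)
    lower = np.floor(disp).astype(int)
    upper = np.minimum(lower + 1, num_disp - 1)
    frac = (disp - lower)[:, None, None]
    # Linear blend of the two nearest disparity slices.
    return (1.0 - frac) * cost_disp[lower] + frac * cost_disp[upper]


if __name__ == "__main__":
    # Toy volume whose cost equals the disparity index, so results are easy to read.
    cost_disp = np.arange(64.0)[:, None, None] * np.ones((1, 2, 3))
    depth_grid = np.array([2.0, 4.0, 8.0, 40.0])   # uniform spacing is not required here
    cost_depth = disparity_to_depth_volume(cost_disp, depth_grid, focal_px=200.0, baseline_m=0.5)
    print(cost_depth.shape)
    print(cost_depth[:, 0, 0])
```

With a uniform depth grid, neighboring slices of the resampled volume are a fixed number of metres apart, so convolutions along that axis treat near and far ranges evenly instead of compressing distant depths into a few disparity bins.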
This review was generated by AI and checked by a human editor.