QUICK REVIEW

[Paper Review] Conf-Net: Predicting Depth Completion Error-Map For High-Confidence Dense 3D Point-Cloud.

Hamid Hekmatian, Samir Al-Stouhi | arXiv (Cornell University) | Jul 23, 2019

Advanced Vision and Imaging | Cited by 3

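Summary

Conf-Net is an end-to-end deep learning method that predicts an error map for its own depth completion output and uses it to turn sparse LiDAR data into a high-confidence, almost dense 3D point cloud. Up-filling from 18,000 to 285,000 points (versus 300,000 for full depth) lowers RMSE from 1004 to 399, about 60% below the state of the art and 50% below the state of the art with RGB guidance. Removing only 0.3% of the predicted points is enough to match the RGB-guided state of the art, and the method runs in real time on real-world autonomous driving data.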

ABSTRACT

This work proposes a new method for depth completion of sparse LiDAR data using a convolutional neural network which learns to generate almost full 3D point-clouds with significantly lower root mean squared error (RMSE) over state-of-the-art methods. An almost dense high-confidence/low-variance point-cloud is more valuable for safety-critical applications specifically real-world autonomous driving than a dense point-cloud with high error rate and high variance. We examine the error of the standard depth completion methods and demonstrate that the error exhibits a long tail distribution which can be significantly reduced if a small portion of the generated depth points can be identified and removed. We add a purging step to our neural network and present a novel end-to-end algorithm that learns to predict a high-quality error-map of its prediction. Using our predicted error map, we demonstrate that by up-filling a LiDAR point cloud from 18,000 points to 285,000 points, versus 300,000 points for full depth, we can reduce the RMSE error from 1004 to 399. This error is approximately 60% less than the state-of-the-art and 50% less than the state-of-the-art with RGB guidance. We only need to remove 0.3% of the predicted points to get comparable results with the state-of-the-art which has RGB guidance. Our post-processing step takes the output of a standard encoder-decoder network, to generate high resolution 360 degrees dense point-cloud. In addition to analyzing our results on Kitti depth completion dataset, we demonstrate the real-world performance of our algorithm using data gathered with a Velodyne VLP-32C LiDAR mounted on our vehicle to verify the effectiveness and real-time performance of our algorithm for autonomous driving. Codes and demo videos are available at http://github.com/hekmak/Conf-net.
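The point counts and RMSE values quoted in the abstract can be turned into ratios with simple arithmetic. The worked figures below use only numbers from the abstract; the labels (input density, retained density) are descriptive names chosen here, not terms from the paper. The last line is the reduction relative to the 1004 figure and is shown only as a sanity check: the abstract states its 60% and 50% comparisons against state-of-the-art methods whose RMSE values are not given here.

```latex
\begin{align*}
\text{input density}           &= \frac{18{,}000}{300{,}000}  = 6\%  \\
\text{retained density}        &= \frac{285{,}000}{300{,}000} = 95\% \\
\text{relative RMSE reduction} &= \frac{1004 - 399}{1004} \approx 60.3\%
\end{align*}
```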

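Motivation and objectives

Address the high error and high variance of dense depth completion output in safety-critical autonomous driving applications.
Shrink the long tail of the depth completion error distribution by identifying and removing low-confidence predictions.
Develop an end-to-end method that jointly learns depth completion and error-map prediction to improve reliability.
Produce a high-confidence, almost dense 3D point cloud with low error, without RGB guidance.
Validate the method on real LiDAR data in real-world autonomous driving scenarios.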

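Proposed method

Train a novel end-to-end convolutional neural network that predicts a high-resolution error map of its own depth completion output.
Use the error map to identify and remove the lowest-confidence depth points; removing only 0.3% of the predicted points is enough to match the RGB-guided state of the art.
Depth is predicted by a standard encoder-decoder network, followed by a post-processing purging step guided by the predicted error map.
The error map is trained jointly with the depth completion network, end to end, using a differentiable loss that penalizes high error in low-confidence regions.
The method is evaluated on the KITTI depth completion benchmark and validated on real-world data collected with a Velodyne VLP-32C LiDAR.
The output is a high-confidence, low-variance, almost dense point cloud (285,000 points in the reported experiment, versus 300,000 for full depth); the post-processing step also produces high-resolution 360-degree point clouds.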
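The purging step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the quantile-based threshold, the use of 0 as the "no depth" value, and the synthetic arrays standing in for network outputs are all hypothetical choices made for this example.

```python
import numpy as np


def purge_low_confidence(depth, predicted_error, keep_fraction=0.95):
    """Zero out the pixels whose predicted error is highest.

    depth, predicted_error: hypothetical H x W float arrays (assumed names).
    keep_fraction: share of pixels to keep; 0.95 mirrors 285,000 / 300,000.
    Returns the purged depth map and the boolean keep mask.
    """
    if depth.shape != predicted_error.shape:
        raise ValueError("depth and predicted_error must have the same shape")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Predicted-error value below which keep_fraction of the pixels fall.
    threshold = np.quantile(predicted_error, keep_fraction)
    keep = predicted_error <= threshold
    # Assumption: 0 marks "no depth value" in the purged map.
    return np.where(keep, depth, 0.0), keep


def rmse(pred, target, mask):
    """Root mean squared error over the pixels selected by mask."""
    diff = pred[mask] - target[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic stand-ins for network outputs (not data from the paper).
rng = np.random.default_rng(0)
target = rng.uniform(1.0, 80.0, size=(64, 64))
noise = rng.standard_t(df=2, size=target.shape)  # heavy-tailed depth error
depth = target + noise
# An imperfect error predictor: true absolute error times a random factor.
predicted_error = np.abs(noise) * rng.uniform(0.8, 1.2, size=target.shape)

purged, keep = purge_low_confidence(depth, predicted_error, keep_fraction=0.95)
everything = np.ones_like(keep)  # boolean mask selecting every pixel
print("kept fraction:", keep.mean())
print("RMSE before purge:", rmse(depth, target, everything))
print("RMSE after purge:", rmse(purged, target, keep))
```

The RMSE after purging is computed only over the kept pixels, which matches the idea of reporting error on an almost dense cloud. How well this works in practice depends entirely on how closely the predicted error map tracks the true error.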

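Experimental results

Research questions

RQ1: Does predicting an error map during depth completion of sparse LiDAR data significantly reduce overall RMSE?
RQ2: How much does removing only a small fraction of low-confidence points improve depth completion accuracy?
RQ3: How does the method compare with the state of the art, especially methods that use RGB guidance?
RQ4: Can the method deliver real-time performance and high reliability in real-world autonomous driving scenarios?
RQ5: Is the long tail of the depth completion error distribution related to a confidence estimate that a neural network can learn?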

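Key findings

When up-filling the LiDAR point cloud from 18,000 to 285,000 points, Conf-Net reduces RMSE from 1004 to 399, approximately 60% less than the state of the art.
Removing only 0.3% of the predicted points yields results comparable with the RGB-guided state of the art, without using RGB input.
Error-map prediction enables high-confidence, low-variance depth completion, which is more valuable for safety-critical applications than a dense but error-prone output.
The method runs in real time and was validated on real-world data gathered with a vehicle-mounted Velodyne VLP-32C LiDAR.
In RMSE, the method outperforms the state of the art both without RGB guidance (about 60% lower) and with it (about 50% lower).
The purging step driven by the predicted error map effectively mitigates the long-tailed error distribution common in depth completion.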
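To see why a long-tailed error distribution makes purging so effective, it helps to compute RMSE before and after dropping the worst points. The sketch below is a toy illustration with made-up numbers, not results from the paper; `rmse_after_removing_worst` is a hypothetical helper, and it acts as an oracle that ranks points by their true error, which a deployed system would have to approximate with a predicted error map.

```python
import numpy as np


def rmse_after_removing_worst(errors, fraction):
    """RMSE of the errors left after dropping the largest `fraction` of them.

    This is an oracle: it ranks points by true error, which a real system
    cannot see and would have to approximate with a predicted error map.
    """
    errors = np.abs(np.asarray(errors, dtype=float))
    n_remove = int(round(fraction * errors.size))
    if n_remove >= errors.size:
        raise ValueError("fraction would remove every point")
    # Sort ascending and keep everything except the n_remove largest errors.
    kept = np.sort(errors)[: errors.size - n_remove]
    return float(np.sqrt(np.mean(kept ** 2)))


# Made-up long-tailed errors: 99 points off by 1 unit, one point off by 100.
errors = np.array([1.0] * 99 + [100.0])
full = rmse_after_removing_worst(errors, 0.0)
trimmed = rmse_after_removing_worst(errors, 0.01)
print("RMSE with all points:", full)
print("RMSE after dropping the worst 1%:", trimmed)
```

In this toy case a single point in a hundred carries nearly all of the squared error, so dropping it cuts RMSE by roughly a factor of ten. Because RMSE squares each error, a small number of large outliers dominates the metric, which is the effect the purging step exploits.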


This review was generated by AI and reviewed by human editors.