QUICK REVIEW

[Paper Review] Radar-Camera Sensor Fusion for Joint Object Detection and Distance Estimation in Autonomous Vehicles

Ramin Nabati, Hairong Qi | arXiv (Cornell University) | Sep 17, 2020

Advanced Neural Network Applications | References: 29 | Cited by: 32

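ONE-SENTENCE SUMMARY

The paper proposes a middle-fusion radar-camera network that generates radar-based 3D proposals, refines them with image features, and merges them with image-based proposals to perform joint object detection and distance estimation on nuScenes.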

ABSTRACT

In this paper we present a novel radar-camera sensor fusion framework for accurate object detection and distance estimation in autonomous driving scenarios. The proposed architecture uses a middle-fusion approach to fuse the radar point clouds and RGB images. Our radar object proposal network uses radar point clouds to generate 3D proposals from a set of 3D prior boxes. These proposals are mapped to the image and fed into a Radar Proposal Refinement (RPR) network for objectness score prediction and box refinement. The RPR network utilizes both radar information and image feature maps to generate accurate object proposals and distance estimations. The radar-based proposals are combined with image-based proposals generated by a modified Region Proposal Network (RPN). The RPN has a distance regression layer for estimating distance for every generated proposal. The radar-based and image-based proposals are merged and used in the next stage for object classification. Experiments on the challenging nuScenes dataset show our method outperforms other existing radar-camera fusion methods in the 2D object detection task while at the same time accurately estimates objects' distances.
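The step in which 3D proposals are "mapped to the image" can be illustrated with a standard pinhole projection. The following is a minimal sketch, not the authors' implementation: the function names `box3d_corners` and `project_to_2d_box`, the camera-frame axis convention (x right, y down, z forward), and the intrinsic matrix `K` are all assumptions made for illustration.

```python
import numpy as np

def box3d_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in the camera frame.

    Assumed convention: x right, y down, z forward.
    size = (width, height, length); yaw rotates about the vertical (y) axis.
    """
    w, h, l = size
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([h, h, h, h, -h, -h, -h, -h]) / 2.0
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (rot @ np.vstack([x, y, z])).T + np.asarray(center, dtype=float)

def project_to_2d_box(corners, K):
    """Project 3D corners with pinhole intrinsics K; return [x1, y1, x2, y2]."""
    if np.any(corners[:, 2] <= 0):
        return None  # box reaches behind the camera; skipped in this sketch
    uvw = (K @ corners.T).T          # homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]    # perspective divide
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])

# Hypothetical intrinsics and a car-sized box 20 m ahead of the camera.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 300.0],
              [0.0, 0.0, 1.0]])
box2d = project_to_2d_box(box3d_corners((0.0, 0.0, 20.0), (2.0, 1.5, 4.0), 0.0), K)
```

The enclosing rectangle of the projected corners serves as the 2D proposal, and the depth of the 3D box can be carried along as its distance estimate.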

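MOTIVATION AND OBJECTIVES

Develop a radar-camera fusion framework that improves 2D object detection and distance estimation for autonomous driving.
Use radar point clouds to generate 3D proposals, and refine them with image features for accurate localization.
Combine radar-based proposals with image-based proposals to improve detection in challenging scenes.
Provide a distance estimate for every detection, in addition to object classification.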

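PROPOSED METHOD

Adopts a middle-fusion architecture in which radar detections generate 3D anchor boxes that are projected into 2D image proposals and then refined by the RPR network using image backbone features.
Generates radar-based proposals from 3D anchor boxes anchored at radar points, with two orientations per class, and maps them onto the image to produce 2D proposals and depth.
Refines the radar proposals with the RPR network, which uses RoI pooling and outputs an objectness score and box refinements.
Uses an image-based Region Proposal Network (RPN) to generate complementary proposals, with a distance regression layer added to estimate depth for the image proposals.
Merges radar and image proposals through IoU-based matching, overriding the image distance with the radar distance wherever proposals match, followed by a second-stage Fast R-CNN-style classification.
Trains with a multi-task loss that combines the classification and regression losses of both proposal streams, following a Faster R-CNN-style formulation.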
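The IoU-based merging step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the `iou_thresh` value of 0.5, and the choice to keep every proposal from both streams (rather than suppress duplicates) are assumptions that this summary does not confirm.

```python
import numpy as np

def iou_matrix(a, b):
    """Pairwise IoU between boxes a (N, 4) and b (M, 4) in [x1, y1, x2, y2] format."""
    a = np.asarray(a, dtype=float).reshape(-1, 4)
    b = np.asarray(b, dtype=float).reshape(-1, 4)
    x1 = np.maximum(a[:, None, 0], b[None, :, 0])
    y1 = np.maximum(a[:, None, 1], b[None, :, 1])
    x2 = np.minimum(a[:, None, 2], b[None, :, 2])
    y2 = np.minimum(a[:, None, 3], b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)

def merge_proposals(img_boxes, img_dist, radar_boxes, radar_dist, iou_thresh=0.5):
    """Merge both proposal sets; matched image proposals take the radar distance.

    Assumption: iou_thresh=0.5 is a placeholder, and all proposals are kept.
    """
    img_boxes = np.asarray(img_boxes, dtype=float).reshape(-1, 4)
    radar_boxes = np.asarray(radar_boxes, dtype=float).reshape(-1, 4)
    img_dist = np.asarray(img_dist, dtype=float).copy()
    radar_dist = np.asarray(radar_dist, dtype=float)
    if len(img_boxes) and len(radar_boxes):
        iou = iou_matrix(img_boxes, radar_boxes)
        best = iou.argmax(axis=1)  # best-overlapping radar proposal per image proposal
        matched = iou[np.arange(len(img_boxes)), best] >= iou_thresh
        img_dist[matched] = radar_dist[best[matched]]  # radar overrides image distance
    boxes = np.concatenate([img_boxes, radar_boxes], axis=0)
    dists = np.concatenate([img_dist, radar_dist])
    return boxes, dists

# Hypothetical inputs: two image proposals, one radar proposal.
boxes, dists = merge_proposals(
    [[0, 0, 10, 10], [100, 100, 120, 120]], [20.0, 35.0],
    [[1, 1, 10, 10]], [18.5],
)
```

In this example the first image proposal overlaps the radar proposal with an IoU of 0.81, so its distance is replaced by the radar value, while the second image proposal keeps its regressed distance.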

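EXPERIMENTAL RESULTS

Research Questions

RQ1: Can radar point clouds be effectively converted into 3D object proposals aligned with image data for joint detection and distance estimation?
RQ2: Does fusing radar-derived proposals with image-derived proposals improve 2D detection performance and depth accuracy on autonomous driving data?
RQ3: How can distance be estimated more accurately for each detected object when both the radar and image modalities are used?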

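Key Findings

The proposed method outperforms RRPN and CRF-Net in 2D object detection on the nuScenes validation set.
The method achieves a mean absolute error (MAE) of 2.65 m for distance estimation across all images.
Per-class MAE results show higher distance errors for larger objects such as cars, trucks, and buses, because these objects produce multiple radar detections and the distance to an object's edge differs from the distance to its center.
Combining radar proposals with image proposals yields higher AP and AP50/AP75 than the baselines.
Using the radar and image streams together provides complementary strengths and improves overall detection performance.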
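For readers who want to reproduce the distance metric reported above, the following minimal sketch computes overall and per-class MAE. The function name `mae_by_class`, the record layout, and the sample values are hypothetical; they are not taken from the paper or from nuScenes.

```python
from collections import defaultdict

def mae_by_class(records):
    """Compute overall and per-class mean absolute distance error.

    records: iterable of (class_name, predicted_m, ground_truth_m) tuples.
    """
    abs_err = defaultdict(list)
    for cls, pred, gt in records:
        abs_err[cls].append(abs(pred - gt))
    per_class = {c: sum(v) / len(v) for c, v in abs_err.items()}
    all_err = [e for v in abs_err.values() for e in v]
    overall = sum(all_err) / len(all_err) if all_err else float("nan")
    return overall, per_class

# Hypothetical detections matched to ground truth (distances in metres).
overall, per_class = mae_by_class([
    ("car", 22.0, 20.0),
    ("car", 19.0, 20.0),
    ("pedestrian", 10.5, 10.0),
])
```

Note that the overall value averages over detections, not over classes, so frequent classes weigh more heavily in it.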

This review was generated by AI and reviewed by a human editor.