QUICK REVIEW

[Paper Review] A2D2: Audi Autonomous Driving Dataset

Jakob Geyer, Yohannes Kassahun|arXiv (Cornell University)|Apr 14, 2020

Remote Sensing and LiDAR Applications|References: 28|Cited by: 265

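One-Sentence Summary

A2D2 is a commercially usable, fully synchronized multimodal dataset (6 cameras, 5 LiDAR units) with 360-degree coverage. It includes semantic/instance segmentation labels, 3D bounding boxes, and extensive vehicle bus data for autonomous driving research, released under CC BY-ND 4.0. It contains labeled and unlabeled sequences recorded in Germany, along with a getting-started tutorial.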

ABSTRACT

Research in machine learning, mobile robotics, and autonomous driving is accelerated by the availability of high quality annotated data. To this end, we release the Audi Autonomous Driving Dataset (A2D2). Our dataset consists of simultaneously recorded images and 3D point clouds, together with 3D bounding boxes, semantic segmentation, instance segmentation, and data extracted from the automotive bus. Our sensor suite consists of six cameras and five LiDAR units, providing full 360 degree coverage. The recorded data is time synchronized and mutually registered. Annotations are for non-sequential frames: 41,277 frames with semantic segmentation image and point cloud labels, of which 12,497 frames also have 3D bounding box annotations for objects within the field of view of the front camera. In addition, we provide 392,556 sequential frames of unannotated sensor data for recordings in three cities in the south of Germany. These sequences contain several loops. Faces and vehicle number plates are blurred due to GDPR legislation and to preserve anonymity. A2D2 is made available under the CC BY-ND 4.0 license, permitting commercial use subject to the terms of the license. Data and further information are available at http://www.a2d2.audi.

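Motivation and Objectives

Advance autonomous driving research by providing a commercially usable, richly annotated dataset.
Provide surround-view camera and LiDAR data that is time synchronized and registered to a global reference coordinate frame.
Include extensive vehicle bus data to support end-to-end and reinforcement learning research.
Provide de-identified data and a tutorial to ease adoption by the community.
Enable benchmarks and challenges that compare perception algorithms across modalities.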

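Proposed Method

Data is collected on an Audi Q7 e-tron equipped with six cameras and five Velodyne VLP-16 LiDAR units.
The sensors are rigorously calibrated and registered to a common global reference coordinate frame.
41,277 frames are annotated with semantic and instance segmentation across 38 classes.
12,497 of these frames also have 3D bounding boxes for objects within the field of view of the front camera.
392,556 sequential frames of unannotated sensor data are released for self-supervised or SLAM research.
Baseline semantic segmentation experiments use a ResNet-101 encoder with a PSP-Net decoder.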
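
Registering sensors to a common reference frame amounts to applying a rigid transform to each sensor's points. The sketch below illustrates the idea in plain Python. The helper names (`make_transform`, `transform_points`) and the mounting values are hypothetical and chosen only for illustration; they are not taken from the A2D2 calibration files or tooling.

```python
import math

def make_transform(yaw_rad, translation):
    """Build a 4x4 homogeneous transform: rotation about z by yaw, then translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_points(T, points):
    """Apply the 4x4 transform T to a list of (x, y, z) points."""
    out = []
    for x, y, z in points:
        out.append(tuple(
            T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
            for r in range(3)
        ))
    return out

# Hypothetical extrinsics (assumption): a LiDAR rotated 90 degrees about z,
# mounted 1.0 m forward and 1.5 m above the reference frame origin.
lidar_to_vehicle = make_transform(math.pi / 2, (1.0, 0.0, 1.5))

# Two made-up points in the LiDAR's own frame.
points_lidar = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
points_vehicle = transform_points(lidar_to_vehicle, points_lidar)
print(points_vehicle)
```

Once every sensor's points are expressed in the same frame, point clouds from the five LiDAR units can be concatenated directly. A real calibration would use a full 3D rotation rather than the yaw-only rotation assumed here.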

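Experimental Results

Research Questions

RQ1: How can a multimodal, surround-view automotive dataset support the development of perception and SLAM algorithms?
RQ2: How do pretrained weights and de-identification affect semantic segmentation performance on A2D2?
RQ3: Can the vehicle bus data in the dataset drive broader research beyond object detection, such as end-to-end or reinforcement learning?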

Key Findings

Architecture / training	Mean IoU
Baseline (ResNet-101 + PSP-Net)	71.01%
With pretrained weights (ResNet-50 + PSP-Net)	68.40%
Without pretrained weights (ResNet-50 + PSP-Net)	65.31%
With anonymized images (ResNet-101 + PSP-Net)	70.94%
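
The two comparisons the table supports can be made explicit with a little arithmetic. The snippet below is only a worked calculation on the four values listed above; the dictionary keys are labels invented for this example.

```python
# Mean IoU values in percent, copied from the table above.
miou = {
    "resnet101_baseline": 71.01,
    "resnet50_pretrained": 68.40,
    "resnet50_no_pretraining": 65.31,
    "resnet101_anonymized": 70.94,
}

# Effect of pretrained weights, same architecture (ResNet-50 + PSP-Net).
pretraining_gain = round(
    miou["resnet50_pretrained"] - miou["resnet50_no_pretraining"], 2
)

# Effect of anonymization, same architecture (ResNet-101 + PSP-Net).
anonymization_drop = round(
    miou["resnet101_baseline"] - miou["resnet101_anonymized"], 2
)

print(f"Pretraining gain: {pretraining_gain} percentage points")
print(f"Anonymization drop: {anonymization_drop} percentage points")
```

Pretraining is worth about 3.09 percentage points on ResNet-50, while anonymization costs about 0.07 points on ResNet-101.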

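The dataset provides 41,277 frames with semantic/instance annotations, of which 12,497 also have 3D bounding boxes for objects within the field of view of the front camera.
Five LiDAR units and six cameras provide full 360-degree coverage, and the data is time synchronized and mutually registered.
A semantic segmentation model reaches a mean IoU of 71.01% on 18 foreground classes (baseline ResNet-101 + PSP-Net).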
With ResNet-50 + PSP-Net, ImageNet-pretrained weights raise mean IoU from 65.31% (no pretraining) to 68.40%; the ResNet-101 + PSP-Net baseline reaches 71.01%.
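Anonymization (blurring of faces and number plates) has little effect on mean IoU: 70.94% versus 71.01% for the non-anonymized baseline.
The unannotated sequences and vehicle bus data allow the dataset to support end-to-end and self-supervised learning.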
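
For readers unfamiliar with the metric, mean IoU can be sketched as follows. This is the standard per-class intersection-over-union averaged over classes, written in plain Python on flat label lists. The exact evaluation protocol used for the A2D2 baseline (class merging, ignored labels) is not described in this review, so treat the function and the toy labels as assumptions.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union across classes present in labels or predictions."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            # Correct pixel: counts toward both intersection and union of its class.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: a false negative for class t and a false positive for class p.
            union[t] += 1
            union[p] += 1
    # Skip classes that never occur, so they do not distort the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)

# Toy per-pixel labels for three classes (made-up data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(y_true, y_pred, 3)
print(f"mean IoU = {score:.2f}")  # per-class IoUs are 1/3, 2/3, 1/2, so the mean is 0.50
```

In practice the counts are accumulated over all pixels of all evaluation images before the per-class ratios are taken.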


This review was generated by AI and reviewed by human editors.