QUICK REVIEW

[论文解读] Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving

Mina Alibeigi, William Ljungbergh|arXiv (Cornell University)|May 3, 2023

Advanced Neural Network Applications被引用 11

一句话总结

Zenseact Open Dataset (ZOD) 是欧洲规模庞大的多模态自动驾驶数据集，具备高分辨率传感器、用于远距离感知的丰富注释，以及宽松的 CC BY-SA 4.0 许可证，包括 Frames、Sequences 与 Drives，以支持多样化任务。

ABSTRACT

Existing datasets for autonomous driving (AD) often lack diversity and long-range capabilities, focusing instead on 360° perception and temporal reasoning. To address this gap, we introduce Zenseact Open Dataset (ZOD), a large-scale and diverse multimodal dataset collected over two years in various European countries, covering an area 9x that of existing datasets. ZOD boasts the highest range and resolution sensors among comparable datasets, coupled with detailed keyframe annotations for 2D and 3D objects (up to 245m), road instance/semantic segmentation, traffic sign recognition, and road classification. We believe that this unique combination will facilitate breakthroughs in long-range perception and multi-task learning. The dataset is composed of Frames, Sequences, and Drives, designed to encompass both data diversity and support for spatio-temporal learning, sensor fusion, localization, and mapping. Frames consist of 100k curated camera images with two seconds of other supporting sensor data, while the 1473 Sequences and 29 Drives include the entire sensor suite for 20 seconds and a few minutes, respectively. ZOD is the only large-scale AD dataset released under a permissive license, allowing for both research and commercial use. More information, and an extensive devkit, can be found at https://zod.zenseact.com

研究动机与目标

满足除全景 360 度感知数据集之外对多样且远距离的自动驾驶数据的需求。
提供高质量、多模态传感数据并附有大量注释，支持多种感知任务。
通过 Frames、Sequences 与 Drives 子划分，推动 AD 的多任务学习和领域自适应。

提出的方法

描述在欧洲覆盖两年的传感器套件与数据采集设置。
定义三种数据类别（Frames、Sequences、Drives），并分别具有不同的预期任务。
提供对语义/实例分割、2D/3D 边界框，以及道路/交通标志标签的人工、分层注释。
对人脸和车牌进行两种去识别处理（模糊与 DNAT），并同时发布两种版本，以研究去识识别的影响。
以 CC BY-SA 4.0 发布数据集，并附带广泛的开发工具包以促进快速实验。

Figure 1 : Geographical coverage comparison with other AD datasets using the diversity area metric defined in [ 27 ] (top left), and geographical distribution of ZOD Frames overlaid on the map. The numbers in the quantized regions represent the amount of annotated frames in that geographical region.

实验结果

研究问题

RQ1与现有自动驾驶数据集相比，ZOD 的多样性和地理覆盖程度有多大？
RQ2高分辨率传感器和远距离注释是否能够实现强健的远距离感知与多任务学习？
RQ3去识别技术对在 ZOD 上训练的下游视觉模型有何影响？
RQ4Frame、Sequence 和 Drive 子集如何支持感知、定位、制图和规划等不同的自动驾驶任务？

主要发现

ZOD Frames 覆盖 14 个欧洲国家，使用 75 m 自我姿态度量时的多样性面积为 705,000 m^2 (7.05e5 km^2)，表明具有强烈的地理多样性。
对象注释距离达到 245 米，前置摄像头 8MP 与顶棚 LiDAR 提供的高远距离感知，在同等规模数据集中不常见。
交通标志分类包含 156 类，标注实例数达 44.6 万，是在可比数据集中标志实例数量最多的之一。
去识别实验表明，当在去识别数据上训练后，在原始图像上评估的 Faster-RCNN 和 YOLOv7 检测器的性能并无统计显著下降。
3D 目标检测显示距离相关的性能，CenterPoint 在 0–150 m 的 mAP 为 0.25，在 150–250 m 降至 0.01，凸显远距离感知挑战（表 4）。
交通标志识别在常见标志上表现强劲，但在罕见标志上存在显著的长尾挑战（宏观 F1 低于微观 F1）。

Figure 2 : Placement of sensors used by data collection vehicles in ZOD and their corresponding coordinate systems.

更好的研究，从现在开始

从论文设计到论文写作，大幅缩短您的研究时间。

无需绑定信用卡

本解读由 AI 生成，并经人工编辑审核。