QUICK REVIEW

[Paper Review] One Million Scenes for Autonomous Driving: ONCE Dataset

Jiageng Mao, Minzhe Niu | arXiv (Cornell University) | Jun 21, 2021

Advanced Neural Network Applications | References: 56 | Cited by: 133

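ONE-SENTENCE SUMMARY

This paper introduces the ONCE dataset, which contains 1 million LiDAR scenes and 7 million images for 3D object detection, together with a benchmark on ONCE that evaluates self-supervised, semi-supervised, and unsupervised methods for 3D detection. It also analyzes data quality, diversity, and domain adaptation potential relative to existing datasets.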

ABSTRACT

Current perception models in autonomous driving have become notorious for greatly relying on a mass of annotated data to cover unseen cases and address the long-tail problem. On the other hand, learning from unlabeled large-scale collected data and incrementally self-training powerful recognition models have received increasing attention and may become the solutions of next-generation industry-level powerful and robust perception models in autonomous driving. However, the research community generally suffered from data inadequacy of those essential real-world scene data, which hampers the future exploration of fully/semi/self-supervised methods for 3D perception. In this paper, we introduce the ONCE (One millioN sCenEs) dataset for 3D object detection in the autonomous driving scenario. The ONCE dataset consists of 1 million LiDAR scenes and 7 million corresponding camera images. The data is selected from 144 driving hours, which is 20x longer than the largest 3D autonomous driving dataset available (e.g. nuScenes and Waymo), and it is collected across a range of different areas, periods and weather conditions. To facilitate future research on exploiting unlabeled data for 3D detection, we additionally provide a benchmark in which we reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset. We conduct extensive analyses on those methods and provide valuable observations on their performance related to the scale of used data. Data, code, and more information are available at https://once-for-auto-driving.github.io/index.html.
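
The headline figures in the abstract can be related with a little arithmetic. The sketch below is a back-of-envelope calculation only: it assumes the 1 million scenes are spread evenly over all 144 driving hours, which the abstract does not state, and the variable names are chosen here for illustration.

```python
# Figures quoted in the abstract.
SCENES = 1_000_000
IMAGES = 7_000_000
DRIVING_HOURS = 144

# Camera images associated with each LiDAR scene.
images_per_scene = IMAGES / SCENES

# Average scene rate, ASSUMING scenes are sampled uniformly over all 144 hours.
scenes_per_second = SCENES / (DRIVING_HOURS * 3600)

print(f"images per scene: {images_per_scene:.0f}")
print(f"average scenes per second: {scenes_per_second:.2f}")
```

Under that uniformity assumption the dataset works out to 7 images per scene and a little under 2 scenes per second of driving.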

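RESEARCH MOTIVATION AND OBJECTIVES

Address the shortage of autonomous driving data by providing a large-scale, diverse dataset of 3D scenes.
Encourage the use of unlabeled data for 3D detection through a benchmark of self-supervised, semi-supervised, and unsupervised learning methods.
Support cross-domain research by analyzing data quality, diversity, and generalization in 3D perception.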

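PROPOSED METHOD

Collect and downsample LiDAR and camera data to form 1M 3D scenes and 7M images covering 144 hours of driving.
Annotate 16k scenes with 3D boxes covering 5 categories, and project them to 2D boxes in the images.
Provide weather, time-of-day, and area labels for all scenes, and split the data into train/val/test sets alongside a large unlabeled data pool.
Benchmark 3D detectors (single-modality and multi-modality) on ONCE under a unified setting.
Reproduce and evaluate self-supervised, semi-supervised, and unsupervised domain adaptation methods for 3D detection.
Analyze data quality and diversity through pretraining effects and distribution comparisons.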
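
The step of projecting 3D box annotations to 2D image boxes can be illustrated with a minimal pinhole-camera sketch. It is not the ONCE toolkit's implementation: it assumes the box is already expressed in the camera frame (x right, y down, z forward), that yaw is a rotation about the camera's vertical axis, and that the intrinsics fx, fy, u0, v0 are known. The function names are hypothetical. A real pipeline would also apply the LiDAR-to-camera extrinsics and clip the result to the image bounds.

```python
import math

def box_corners(center, size, yaw):
    """Eight corners of a 3D box in the camera frame (x right, y down, z forward).

    yaw rotates the box about the camera's vertical (y) axis.
    """
    cx, cy, cz = center
    length, height, width = size  # extents along the box's local x, y, z axes
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-height / 2, height / 2):
            for dz in (-width / 2, width / 2):
                # Rotate the local offset about y, then translate to the center.
                x = cx + c * dx + s * dz
                z = cz - s * dx + c * dz
                corners.append((x, cy + dy, z))
    return corners

def project_box_to_2d(center, size, yaw, fx, fy, u0, v0):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) around the projected corners.

    Returns None when every corner lies behind the camera.
    """
    pixels = []
    for x, y, z in box_corners(center, size, yaw):
        if z <= 0:  # corner is behind the image plane; skip it
            continue
        pixels.append((fx * x / z + u0, fy * y / z + v0))
    if not pixels:
        return None
    us = [p[0] for p in pixels]
    vs = [p[1] for p in pixels]
    return (min(us), min(vs), max(us), max(vs))

# Example: a 2 m cube centered 10 m in front of a hypothetical camera.
box_2d = project_box_to_2d((0.0, 0.0, 10.0), (2.0, 2.0, 2.0), 0.0,
                           fx=1000.0, fy=1000.0, u0=500.0, v0=500.0)
```

Taking the minimum and maximum over all eight projected corners gives the tightest axis-aligned 2D box that encloses the visible part of the 3D box.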

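EXPERIMENTAL RESULTS

Research Questions

RQ1: How does pretraining on ONCE affect downstream 3D detection performance compared with pretraining on nuScenes and Waymo?
RQ2: How does using unlabeled data through self-supervised and semi-supervised methods affect 3D object detection on ONCE?
RQ3: How do different self-supervised and semi-supervised strategies perform in 3D detection at different data scales?
RQ4: Can unsupervised domain adaptation improve cross-dataset 3D detection involving ONCE?
RQ5: What role does data diversity (weather, time of day, area) play in detection performance in autonomous driving scenes?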

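Key Findings

ONCE provides stronger pretraining gains: when fine-tuned on KITTI, models pretrained on ONCE achieve higher 3D mAP than models pretrained on nuScenes or Waymo.
Self-supervised and semi-supervised methods improve 3D detection when they use unlabeled ONCE data, and the gains grow as the amount of unlabeled data increases.
Clustering-based self-supervised methods (SwAV, DeepCluster) generally outperform contrastive methods (BYOL, PointContrast) in ONCE's large-scale setting.
Semi-supervised methods (Mean Teacher, SESS, 3DIoUMatch) yield significant gains; with large-scale unlabeled data, Mean Teacher reaches an mAP of up to 59.99%.
Unsupervised domain adaptation from or to ONCE shows meaningful improvement over source-only baselines, but a gap to Oracle performance remains.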
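
For readers unfamiliar with Mean Teacher, the core mechanism can be sketched as follows: a teacher model whose weights are an exponential moving average (EMA) of the student's weights, plus a consistency loss that pulls student predictions on unlabeled data toward the teacher's. This is a generic, framework-free sketch with weights stored in a plain dictionary. The function names and the decay value are assumptions made for illustration, and the details of how the ONCE benchmark adapts the method to 3D detection are not covered here.

```python
def ema_update(teacher, student, decay=0.999):
    """Update teacher weights in place as an EMA of the student weights."""
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value
    return teacher

def consistency_loss(student_preds, teacher_preds):
    """Mean squared difference between student and teacher predictions."""
    if len(student_preds) != len(teacher_preds):
        raise ValueError("prediction lists must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_preds, teacher_preds)]
    return sum(squared) / len(squared)

# Toy example: one scalar weight, with a large step size so the effect is visible.
teacher_weights = {"w": 1.0}
student_weights = {"w": 0.0}
ema_update(teacher_weights, student_weights, decay=0.9)

# Predictions on an unlabeled sample from each model (made-up numbers).
loss = consistency_loss([0.2, 0.8], [0.3, 0.7])
```

In training, only the student receives gradient updates; the teacher is refreshed with `ema_update` after each step, so its predictions act as smoother targets for the unlabeled data.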

This review was generated by AI and checked by human editors.