QUICK REVIEW

[Paper Review] PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding

Saining Xie, Jiatao Gu | arXiv (Cornell University) | Jul 21, 2020

Advanced Neural Network Applications | References: 85 | Cited by: 95

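One-sentence summary

The paper shows that PointContrast, an unsupervised pre-training approach applied to large-scale 3D scenes, improves transfer to high-level 3D tasks across diverse datasets, with gains close to those of supervised pre-training.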

ABSTRACT

Arguably one of the top success stories of deep learning is transfer learning. The finding that pre-training a network on a rich source set (e.g., ImageNet) can help boost performance once fine-tuned on a usually much smaller target set, has been instrumental to many applications in language and vision. Yet, very little is known about its usefulness in 3D point cloud understanding. We see this as an opportunity considering the effort required for annotating data in 3D. In this work, we aim at facilitating research on 3D representation learning. Different from previous works, we focus on high-level scene understanding tasks. To this end, we select a suite of diverse datasets and tasks to measure the effect of unsupervised pre-training on a large source set of 3D scenes. Our findings are extremely encouraging: using a unified triplet of architecture, source dataset, and contrastive loss for pre-training, we achieve improvement over recent best results in segmentation and detection across 6 different benchmarks for indoor and outdoor, real and synthetic datasets -- demonstrating that the learned representation can generalize across domains. Furthermore, the improvement was similar to supervised pre-training, suggesting that future efforts should favor scaling data collection over more detailed annotation. We hope these findings will encourage more research on unsupervised pretext task design for 3D deep learning.

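Motivation and Goals

Motivate and enable transfer learning for 3D point cloud understanding through unsupervised pre-training.
Evaluate a single unified backbone, source dataset, and pretext task across multiple high-level downstream tasks.
Propose and compare two contrastive pre-training losses for dense point-level learning.
Demonstrate cross-domain generalization across indoor and outdoor scenes and across synthetic and real data.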

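Proposed Method

Use a Sparse Residual U-Net as the unified backbone for both pre-training and fine-tuning.
Pre-train with PointContrast on a large-scale dataset of point cloud pairs built from ScanNet (870K pairs).
Train on two views of a point cloud and learn point-level representations with a contrastive objective.
Evaluate two losses: the Hardest-Contrastive loss and the PointInfoNCE loss.
Fine-tune on diverse downstream tasks, including segmentation and detection, across multiple datasets.
Show that the gains from unsupervised pre-training are comparable to those from supervised pre-training and grow with the amount of pre-training data.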
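
To make the point-level contrastive objective more concrete, here is a minimal NumPy sketch of an InfoNCE-style loss over matched points from two views. It is an illustration under stated assumptions, not the authors' implementation: the function name `point_info_nce`, the L2 normalization, the default temperature of 0.4, and averaging (rather than summing) over pairs are all choices made for this example. Row i of each feature matrix is assumed to describe the same 3D point in the two views; every other row serves as a negative.

```python
import numpy as np


def point_info_nce(feats_a, feats_b, temperature=0.4):
    """InfoNCE-style loss over N matched point features (illustrative sketch).

    feats_a, feats_b: arrays of shape (N, D). Row i of feats_a and row i of
    feats_b are assumed to belong to the same 3D point seen in two views.
    The temperature default is an assumption made for this example.
    """
    # L2-normalise so that dot products are cosine similarities (assumption).
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)

    # logits[i, j] = similarity between point i in view A and point j in view B.
    logits = (a @ b.T) / temperature

    # Row-wise log-softmax, shifted by the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for row i sits on the diagonal; average the negative
    # log-likelihood over all matched pairs.
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))
    print("matched pairs:   ", point_info_nce(feats, feats))
    print("mismatched pairs:", point_info_nce(feats, feats[::-1]))
```

In this sketch, correctly matched features should yield a lower loss than deliberately mismatched ones, which is the behavior a contrastive pre-training objective is meant to reward. In practice the features would come from the backbone network rather than from random data.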

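Experimental Results

Research Questions

RQ1: Can unsupervised pre-training on 3D point clouds transfer to high-level scene understanding tasks?
RQ2: How well does a unified backbone pre-trained on a large source set of 3D scenes generalize across indoor/outdoor and real/synthetic domains?
RQ3: How do different contrastive pre-training losses affect transferability and stability?
RQ4: For 3D representations, is scaling up pre-training data more beneficial than task-specific annotated data?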

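Key Findings

PointContrast improves transfer on 6 downstream segmentation and detection benchmarks.
PointInfoNCE generally outperforms Hardest-Contrastive on many tasks, with gains in both segmentation and detection.
Pre-training with PointContrast on ScanNet achieves state-of-the-art results on several benchmarks and shows cross-domain generalization to outdoor and synthetic data.
The gains from unsupervised pre-training are comparable to those from supervised pre-training, suggesting that scaling up data may matter more than more detailed annotation.
Fine-tuning from PointContrast features improves both localization and segmentation, with larger gains on localization metrics (e.g., mAP@0.5).
With a unified architecture and source dataset, the approach yields improvements on indoor and outdoor scenes and on real and synthetic data.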

This review was generated by AI and reviewed by a human editor.