QUICK REVIEW

[Paper Review] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection

Benjin Zhu, Zhengkai Jiang|arXiv (Cornell University)|Aug 26, 2019

Advanced Neural Network Applications|31 references|320 citations

TL;DR

The paper introduces class-balanced sampling and grouping with a multi-group head to tackle long-tailed class distributions in nuScenes, achieving state-of-the-art lidar-based 3D object detection results. It combines DS Sampling, GT-AUG, and a balanced multi-group head to boost tail-class performance.

ABSTRACT

This report presents our method which wins the nuScenes 3D Detection Challenge [17] held in Workshop on Autonomous Driving (WAD, CVPR 2019). Generally, we utilize sparse 3D convolution to extract rich semantic features, which are then fed into a class-balanced multi-head network to perform 3D object detection. To handle the severe class imbalance problem inherent in the autonomous driving scenarios, we design a class-balanced sampling and augmentation strategy to generate a more balanced data distribution. Furthermore, we propose a balanced grouping head to boost the performance for the categories with similar shapes. Based on the Challenge results, our method outperforms the PointPillars [14] baseline by a large margin across all metrics, achieving state-of-the-art detection performance on the nuScenes dataset. Code will be released at CBGS.

Motivation & Objective

Address severe class imbalance in nuScenes 3D object detection.
Improve tail-class performance while maintaining overall accuracy.
Leverage multi-group head design to share information among similar-shaped categories.
Enhance data augmentation and training procedures to boost joint multi-class detection.

Proposed method

Use sparse 3D convolutions for feature extraction from voxelized point clouds.
Introduce DS Sampling to balance the training distribution by duplicating samples from rare classes.
Apply GT-AUG to augment data by pasting ground-truth boxes sampled from an annotation database.
Design a multi-group head where each group of similar-shape classes shares a dedicated head to reduce inter-class interference.
Group classes into six groups based on shape/size similarity and instance balance to guide the multi-group head learning.
Incorporate loss components including weighted focal loss for classification, smooth-L1 for regression, and orientation classification with offset to reduce angular ambiguity.
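
The DS Sampling step above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `class_balanced_resample`, the `per_class_quota` parameter, and the toy frame list are all hypothetical, and the sketch assumes that each class simply draws an equal number of frames, with replacement, from the frames that contain it.

```python
import random
from collections import defaultdict


def class_balanced_resample(frames, per_class_quota, seed=0):
    """Return a resampled list of frame ids with an equal quota per class.

    frames: list of (frame_id, set_of_class_names) pairs (hypothetical format).
    per_class_quota: number of frames drawn for each class (assumed equal).
    """
    rng = random.Random(seed)

    # Index frames by every class they contain. A frame holding several
    # classes is listed under each of them.
    frames_by_class = defaultdict(list)
    for frame_id, classes in frames:
        for cls in classes:
            frames_by_class[cls].append(frame_id)

    resampled = []
    for cls in sorted(frames_by_class):
        # Drawing with replacement duplicates frames of rare classes,
        # which is what flattens the class distribution.
        resampled.extend(rng.choices(frames_by_class[cls], k=per_class_quota))
    return resampled


# Toy data: "car" is in every frame, "bicycle" and "pedestrian" in one each.
toy_frames = [
    ("f0", {"car"}),
    ("f1", {"car", "pedestrian"}),
    ("f2", {"car"}),
    ("f3", {"car", "bicycle"}),
]
balanced = class_balanced_resample(toy_frames, per_class_quota=4)
print(len(balanced), "frames after resampling")
```

In this toy run, the single frame containing a bicycle is repeated at least four times, so the rare class is seen about as often as the common one. The cost is a longer training epoch, because the resampled list is larger than the original frame list.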

Experimental results

Research questions

RQ1: How does class imbalance affect 3D object detection performance on nuScenes, especially for tail classes?
RQ2: Can a class-balanced sampling strategy improve tail-class accuracy without sacrificing head-class performance?
RQ3: Does grouping similar-shaped categories and using group-specific heads improve multi-class detection in point clouds?
RQ4: What combination of data augmentation, loss design, and network architecture yields state-of-the-art lidar-based 3D detection on nuScenes?

Key findings

DS Sampling expands the training set from 28,130 to 128,100 samples, smoothing the class distribution.
The proposed 6-group arrangement (Car), (Truck, Construction Vehicle), (Bus, Trailer), (Barrier), (Motorcycle, Bicycle), (Pedestrian, Traffic Cone) improves tail-class performance.
The method achieves state-of-the-art results on the nuScenes lidar track, outperforming the PointPillars baseline on both mAP and NDS.
GT-AUG and Res-Encoder contribute notably to mAP, as shown in ablation studies.
The final submission reports an mAP of 53.2% and an NDS of 63.78% on the validation split, surpassing the baselines.
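
The six-group arrangement listed above can be written down as a small lookup table, which is roughly what a multi-group head needs in order to route each class to its own head. This is a sketch under stated assumptions: the snake_case class names, `HEAD_GROUPS`, and `build_class_to_head` are illustrative names that do not come from the paper or its code.

```python
# Six groups as listed in the key findings. The identifiers are assumed
# spellings of the class names, chosen only for this example.
HEAD_GROUPS = [
    ["car"],
    ["truck", "construction_vehicle"],
    ["bus", "trailer"],
    ["barrier"],
    ["motorcycle", "bicycle"],
    ["pedestrian", "traffic_cone"],
]


def build_class_to_head(groups):
    """Map each class name to (head index, class index within that head)."""
    mapping = {}
    for head_idx, names in enumerate(groups):
        for local_idx, name in enumerate(names):
            if name in mapping:
                raise ValueError(f"class {name!r} assigned to two heads")
            mapping[name] = (head_idx, local_idx)
    return mapping


CLASS_TO_HEAD = build_class_to_head(HEAD_GROUPS)

# Number of classification outputs each head would need.
head_output_sizes = [len(names) for names in HEAD_GROUPS]

print(CLASS_TO_HEAD["bicycle"])
print(head_output_sizes)
```

Keeping the grouping as data rather than hard-coding it in the network makes it straightforward to try alternative groupings in an ablation.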

This review was created by AI and reviewed by human editors.