QUICK REVIEW

[Paper Review] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection

Benjin Zhu, Zhengkai Jiang|arXiv (Cornell University)|Aug 26, 2019
Advanced Neural Network Applications | 31 references | 320 citations
TL;DR

The paper introduces class-balanced sampling and grouping with a multi-group head to tackle long-tailed class distributions in nuScenes, achieving state-of-the-art lidar-based 3D object detection results. It combines DS Sampling, GT-AUG, and a balanced multi-group head to boost tail-class performance.

ABSTRACT

This report presents our method which wins the nuScenes 3D Detection Challenge [17] held in Workshop on Autonomous Driving (WAD, CVPR 2019). Generally, we utilize sparse 3D convolution to extract rich semantic features, which are then fed into a class-balanced multi-head network to perform 3D object detection. To handle the severe class imbalance problem inherent in the autonomous driving scenarios, we design a class-balanced sampling and augmentation strategy to generate a more balanced data distribution. Furthermore, we propose a balanced grouping head to boost the performance for the categories with similar shapes. Based on the Challenge results, our method outperforms the PointPillars [14] baseline by a large margin across all metrics, achieving state-of-the-art detection performance on the nuScenes dataset. Code will be released at CBGS.

Motivation & Objective

  • Address severe class imbalance in nuScenes 3D object detection.
  • Improve tail-class performance while maintaining overall accuracy.
  • Leverage multi-group head design to share information among similar-shaped categories.
  • Enhance data augmentation and training procedures to boost joint multi-class detection.

Proposed method

  • Use sparse 3D convolutions for feature extraction from voxelized point clouds.
  • Introduce DS Sampling to balance the training distribution by duplicating samples from rare classes.
  • Apply GT-AUG to augment data by pasting ground-truth boxes sampled from an annotation database.
  • Design a multi-group head where each group of similar-shape classes shares a dedicated head to reduce inter-class interference.
  • Group classes into six groups based on shape/size similarity and instance balance to guide the multi-group head learning.
  • Incorporate loss components including weighted focal loss for classification, smooth-L1 for regression, and orientation classification with offset to reduce angular ambiguity.
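
The DS Sampling idea in the list above (duplicating samples that contain rare classes) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `class_balanced_resample`, the input format (one set of class names per sample), and the choice to give every class the same number of draws are all assumptions made for the example.

```python
import random
from collections import defaultdict


def class_balanced_resample(sample_classes, seed=0):
    """Return sample indices in which every class contributes equally many draws.

    sample_classes[i] is the set of class names present in sample i
    (a hypothetical input format chosen for this sketch).
    """
    rng = random.Random(seed)

    # Index samples by the classes they contain; a sample that holds
    # several classes appears under each of them.
    by_class = defaultdict(list)
    for idx, classes in enumerate(sample_classes):
        for name in classes:
            by_class[name].append(idx)
    if not by_class:
        return []

    # Give every class the same quota (assumed: the mean membership count).
    total = sum(len(ids) for ids in by_class.values())
    per_class = total // len(by_class)

    resampled = []
    for name in sorted(by_class):
        # Drawing with replacement duplicates samples of rare classes.
        resampled.extend(rng.choices(by_class[name], k=per_class))
    return resampled


# Toy data: ten samples contain a car, only two of them also contain a bicycle.
toy = [{"car"}] * 8 + [{"car", "bicycle"}] * 2
indices = class_balanced_resample(toy)
```

In the toy data, the two bicycle samples are drawn as often as the ten car samples, which is the flattening effect that DS Sampling aims for.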

Experimental results

Research questions

  • RQ1: How does class imbalance affect 3D object detection performance on nuScenes, especially for tail classes?
  • RQ2: Can a class-balanced sampling strategy improve tail-class accuracy without sacrificing head-class performance?
  • RQ3: Does grouping similar-shaped categories and using group-specific heads improve multi-class detection in point clouds?
  • RQ4: What combination of data augmentation, loss design, and network architecture yields state-of-the-art lidar-based 3D detection on nuScenes?

Key findings

  • DS Sampling expands the training set from 28,130 to 128,100 samples, smoothing the class distribution.
  • The proposed 6-group arrangement (Car), (Truck, Construction Vehicle), (Bus, Trailer), (Barrier), (Motorcycle, Bicycle), (Pedestrian, Traffic Cone) improves tail-class performance.
  • The method achieves state-of-the-art results on the nuScenes lidar track, outperforming the PointPillars baseline in both mAP and NDS.
  • GT-AUG and Res-Encoder contribute notably to mAP, as shown in ablation studies.
  • Final submission reported mAP of 53.2% and NDS of 63.78% on the validation split, surpassing the baselines.
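
To make the six-group arrangement listed above concrete, the sketch below maps each class to a head index and to a class index within that head, which is the kind of lookup a multi-group head needs when routing ground-truth labels. The snake_case class spellings and the names `GROUPS`, `CLASS_TO_HEAD`, and `route_labels` are assumptions for illustration, not identifiers from the paper or its code.

```python
# The six groups from the key findings; class spellings are assumed.
GROUPS = [
    ("car",),
    ("truck", "construction_vehicle"),
    ("bus", "trailer"),
    ("barrier",),
    ("motorcycle", "bicycle"),
    ("pedestrian", "traffic_cone"),
]

# class name -> (head index, class index within that head)
CLASS_TO_HEAD = {
    name: (head, local)
    for head, members in enumerate(GROUPS)
    for local, name in enumerate(members)
}


def route_labels(class_names):
    """Split ground-truth class names into one label list per head."""
    per_head = [[] for _ in GROUPS]
    for name in class_names:
        head, local = CLASS_TO_HEAD[name]
        per_head[head].append(local)
    return per_head


routed = route_labels(["car", "bicycle", "motorcycle"])
```

With this layout each head only has to separate one or two classes of similar shape, which matches the stated aim of reducing inter-class interference.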

This review was created by AI and reviewed by human editors.