QUICK REVIEW

[Paper Review] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection

Benjin Zhu, Zhengkai Jiang | arXiv (Cornell University) | 2019-08-26

Advanced Neural Network Applications | References: 31 | Citations: 320

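One-Line Summary

This paper addresses the long-tailed class distribution of nuScenes by introducing class-balanced sampling and class grouping with a multi-group head, achieving state-of-the-art results in lidar-based 3D object detection. It combines DS Sampling, GT-AUG, and a balanced multi-group head to improve tail-class performance.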

ABSTRACT

This report presents our method which wins the nuScenes 3D Detection Challenge [17] held in Workshop on Autonomous Driving (WAD, CVPR 2019). Generally, we utilize sparse 3D convolution to extract rich semantic features, which are then fed into a class-balanced multi-head network to perform 3D object detection. To handle the severe class imbalance problem inherent in the autonomous driving scenarios, we design a class-balanced sampling and augmentation strategy to generate a more balanced data distribution. Furthermore, we propose a balanced grouping head to boost the performance for the categories with similar shapes. Based on the Challenge results, our method outperforms the PointPillars [14] baseline by a large margin across all metrics, achieving state-of-the-art detection performance on the nuScenes dataset. Code will be released at CBGS.

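Research Motivation and Objectives

Address the severe class imbalance in nuScenes 3D object detection.
Improve tail-class performance while maintaining overall accuracy.
Use a multi-group head design to share information among categories with similar shapes.
Strengthen data augmentation and the training procedure to improve joint multi-class detection.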

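Proposed Method

Uses sparse 3D convolution to extract features from the voxelized point cloud.
Introduces DS Sampling, which duplicates samples containing rare classes to balance the training distribution.
Applies GT-AUG, a data augmentation that pastes ground-truth boxes sampled from an annotation database into the scene.
Designs a multi-group head in which each group of similarly shaped classes shares a dedicated head, reducing interference between classes.
Splits the classes into six groups by shape/size similarity and instance balance to guide training of the multi-group head.
Builds the loss from a weighted focal loss for classification, a smooth-L1 loss for regression, and an offset direction classifier that reduces orientation ambiguity.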
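
The class-balanced duplication step (DS Sampling) can be illustrated with a small sketch. This is not the authors' implementation: it assumes each training sample is described only by the set of classes it contains, and it gives every class an equal share of the resampled set by drawing with replacement. The function name `class_balanced_resample` and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def class_balanced_resample(samples, seed=0):
    """Resample so every class contributes the same number of samples.

    samples: list of (sample_id, set_of_class_names) pairs (assumed format).
    Returns a list of sample ids, possibly with repeats.
    """
    rng = random.Random(seed)

    # Index every sample under each class it contains.
    by_class = defaultdict(list)
    for sample_id, classes in samples:
        for name in classes:
            by_class[name].append(sample_id)

    # Count (sample, class) pairs and give each class an equal share of them.
    total = sum(len(ids) for ids in by_class.values())
    target = total // len(by_class)

    resampled = []
    for name in sorted(by_class):
        # Drawing with replacement duplicates samples of rare classes
        # and subsamples those of frequent classes.
        resampled.extend(rng.choices(by_class[name], k=target))
    return resampled


# Toy data: "car" appears everywhere, "pedestrian" and "bicycle" once each.
toy = [
    (0, {"car"}),
    (1, {"car"}),
    (2, {"car", "pedestrian"}),
    (3, {"car"}),
    (4, {"car", "bicycle"}),
    (5, {"car"}),
]
balanced = class_balanced_resample(toy)
```

In the toy data the only samples containing a pedestrian or a bicycle (ids 2 and 4) are each drawn at least twice, so the rare classes are no longer outnumbered six to one.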

Experimental Results

Research Questions

RQ1: How does class imbalance affect 3D object detection performance on nuScenes, especially for tail classes?
RQ2: Can a class-balanced sampling strategy improve tail-class accuracy without sacrificing head-class performance?
RQ3: Does grouping similar-shaped categories and using group-specific heads improve multi-class detection in point clouds?
RQ4: What combination of data augmentation, loss design, and network architecture yields state-of-the-art lidar-based 3D detection on nuScenes?

Key Results

Method	Modality	Map	External	mAP	mATE	mASE	mAOE	mAVE	mAAE	NDS
PointPillars [14]	Lidar	×	×	30.5	0.517	0.290	0.500	0.316	0.368	45.3
BRAVE [17]	Lidar	×	×	32.4	0.400	0.249	0.763	0.272	0.090	48.4
Tolist [17]	Lidar	×	×	42.0	0.364	0.255	0.438	0.270	0.319	54.5
MEGVII (Ours)	Lidar	×	×	52.8	0.300	0.247	0.380	0.245	0.140	63.3
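
The NDS column can be cross-checked against the other columns. The sketch below assumes the standard nuScenes detection score definition, NDS = (5 · mAP + Σ (1 − min(1, mTP))) / 10 over the five true-positive error metrics; the formula is not stated in this review, and the helper name `nds` is hypothetical. The inputs are copied from the table above.

```python
def nds(m_ap, tp_errors):
    """nuScenes detection score from mAP (in [0, 1]) and the five mean TP errors."""
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5, each of the five TP scores weight 1.
    return (5.0 * m_ap + sum(tp_scores)) / 10.0


# (mAP, [mATE, mASE, mAOE, mAVE, mAAE], reported NDS) per method.
rows = {
    "PointPillars": (0.305, [0.517, 0.290, 0.500, 0.316, 0.368], 45.3),
    "BRAVE": (0.324, [0.400, 0.249, 0.763, 0.272, 0.090], 48.4),
    "Tolist": (0.420, [0.364, 0.255, 0.438, 0.270, 0.319], 54.5),
    "MEGVII": (0.528, [0.300, 0.247, 0.380, 0.245, 0.140], 63.3),
}

# Recompute NDS (in percent) from the rounded table entries.
computed = {name: 100.0 * nds(m_ap, errs) for name, (m_ap, errs, _) in rows.items()}
for name, (_, _, reported) in rows.items():
    print(f"{name}: computed {computed[name]:.2f}, reported {reported}")
```

Because the table entries are already rounded, the recomputed values agree with the reported NDS only to within about 0.1 point.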

DS Sampling expands the training set from 28,130 to 128,100 samples (roughly 4.5 times), smoothing the class distribution.
The proposed 6-group arrangement, (Car), (Truck, Construction Vehicle), (Bus, Trailer), (Barrier), (Motorcycle, Bicycle), (Pedestrian, Traffic Cone), improves tail-class performance.
The method achieves state-of-the-art results on the nuScenes lidar track: in the table above it reaches 52.8 mAP and 63.3 NDS, against 30.5 mAP and 45.3 NDS for PointPillars.
GT-AUG and Res-Encoder contribute notably to mAP, as shown in ablation studies.
Final submission reported mAP of 53.2% and NDS of 63.78% on the validation split, surpassing the baselines.
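
The six-group arrangement listed above can be written as a lookup that routes each class label to a group head and a class index local to that head. This is only an illustrative sketch: the snake_case class names and the helper `route_labels` are assumptions, not the released code.

```python
# The six groups as listed in this review, one inner list per head.
GROUPS = [
    ["car"],
    ["truck", "construction_vehicle"],
    ["bus", "trailer"],
    ["barrier"],
    ["motorcycle", "bicycle"],
    ["pedestrian", "traffic_cone"],
]

# Map each class name to (head index, class index within that head).
CLASS_TO_HEAD = {
    name: (head, local)
    for head, group in enumerate(GROUPS)
    for local, name in enumerate(group)
}


def route_labels(labels):
    """Split class names into per-head lists of head-local class indices."""
    per_head = [[] for _ in GROUPS]
    for name in labels:
        head, local = CLASS_TO_HEAD[name]
        per_head[head].append(local)
    return per_head


routed = route_labels(["car", "bicycle", "pedestrian", "trailer"])
print(routed)  # [[0], [], [1], [], [1], [0]]
```

Each head then only has to separate the one or two classes in its own group, which is the sense in which grouping limits interference between dissimilar categories.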

This review was generated by AI and checked by a human editor.