QUICK REVIEW

[Paper Review] Spherical Convolutional Neural Network for 3D Point Clouds

Huan Lei, Naveed Akhtar | arXiv (Cornell University) | May 21, 2018

3D Shape Modeling and Analysis | 21 references | 20 citations

TL;DR

This paper proposes a spherical convolutional neural network (SCNN) for 3D point cloud processing that uses metric-based spherical kernels and octree-based spatial partitioning to enable efficient, translation-invariant, and asymmetric convolutions on irregular point clouds. The method achieves state-of-the-art performance on ModelNet10 and ModelNet40 with 93.2% and 89.7% accuracy, respectively, while avoiding costly K-NN or range searches through hierarchical octree structuring.

ABSTRACT

We propose a neural network for 3D point cloud processing that exploits 'spherical' convolution kernels and octree partitioning of space. The proposed metric-based spherical kernels systematically quantize point neighborhoods to identify local geometric structures in data, while maintaining the properties of translation-invariance and asymmetry. The network architecture itself is guided by octree data structuring that takes full advantage of the sparse nature of irregular point clouds. We specify spherical kernels with the help of neurons in each layer that in turn are associated with spatial locations. We exploit this association to avert dynamic kernel generation during network training, that enables efficient learning with high resolution point clouds. We demonstrate the utility of the spherical convolutional neural network for 3D object classification on standard benchmark datasets.

Motivation & Objective

To address the challenge of applying convolutional networks to irregular 3D point clouds, which lack regular grid structure.
To overcome the limitations of existing methods that rely on dynamic kernel generation or expensive neighborhood searches like K-NN or range queries.
To develop a scalable, efficient, and geometrically meaningful convolution operation for 3D point clouds using spherical kernels.
To enable translation-invariant and asymmetric feature learning without requiring surface-normal computation or dynamic kernel generation during training.
To demonstrate the effectiveness of the proposed SCNN architecture on standard 3D object recognition benchmarks.

Proposed method

The method uses spherical convolutional kernels that partition a 3D spherical neighborhood around each point into angular (azimuth/elevation) and radial bins.
Each bin is associated with a learnable weight matrix, enabling localized, geometrically meaningful feature aggregation.
The network architecture is guided by an octree data structure that hierarchically partitions 3D space, enabling efficient spatial indexing and reducing computational overhead.
Neurons in each layer are associated with spatial locations in the octree, allowing fixed kernel weights and eliminating the need for dynamic kernel generation during training.
The spherical kernel applies asymmetric weighting to point pairs, enabling compact and effective feature representation.
The network processes point clouds through successive layers, coarsening the point cloud at each level while learning hierarchical features.
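
As a rough illustration of the kernel partitioning described above, the following Python sketch maps a neighbor point to a bin index from its azimuth, elevation, and radial distance relative to a center point. The function name spherical_bin, the uniform bin edges, the default bin counts, and the choice to skip coincident points are all assumptions made for this example, not details taken from the paper.

```python
import math

def spherical_bin(center, neighbor, radius, n_azimuth=8, n_elevation=2, n_radial=3):
    """Return the bin index of `neighbor` inside a sphere around `center`.

    Returns None if the neighbor lies outside the sphere or coincides with
    the center (an assumption of this sketch).
    """
    # Only the offset matters, so the result is translation-invariant.
    dx = neighbor[0] - center[0]
    dy = neighbor[1] - center[1]
    dz = neighbor[2] - center[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0 or r > radius:
        return None

    azimuth = math.atan2(dy, dx) % (2.0 * math.pi)        # in [0, 2*pi)
    elevation = math.asin(max(-1.0, min(1.0, dz / r)))    # in [-pi/2, pi/2]

    # Uniform quantization along each dimension (assumed bin edges),
    # clamped so that boundary values fall into the last bin.
    a = min(int(azimuth / (2.0 * math.pi) * n_azimuth), n_azimuth - 1)
    e = min(int((elevation + math.pi / 2.0) / math.pi * n_elevation), n_elevation - 1)
    k = min(int(r / radius * n_radial), n_radial - 1)

    # Flatten (radial, elevation, azimuth) into a single index.
    return (k * n_elevation + e) * n_azimuth + a
```

Because the index depends only on the offset between the two points, shifting both by the same vector leaves the bin unchanged, while swapping center and neighbor generally yields a different bin. In a full layer, the bin index would select the weight matrix applied to that neighbor's features.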

Experimental results

Research questions

RQ1: Can spherical convolutional kernels be designed to systematically quantify local geometric structures in 3D point clouds while maintaining translation-invariance and asymmetry?
RQ2: How can octree-based spatial partitioning improve the scalability and efficiency of 3D point cloud processing compared to K-NN or range search?
RQ3: Can a fixed, learnable kernel mechanism avoid the computational cost of dynamic kernel generation in point cloud networks?
RQ4: To what extent does the proposed spherical convolutional network outperform existing methods in 3D object classification without relying on normal vectors?
RQ5: How does the network’s performance scale with increasing input point cloud resolution and size?

Key findings

The proposed spherical convolutional neural network achieves 93.2% accuracy on the ModelNet10 class-level classification benchmark, outperforming prior methods including PointNet++ and ECC.
On the more challenging ModelNet40 instance-level classification task, the method attains 89.7% accuracy, surpassing PointNet++ and ECC without using normal vectors.
The network scales well with input size: octree construction and the forward pass remain efficient even for 50K-point clouds, with a total inference time of 203 ms per sample.
The method avoids expensive K-NN or range searches by using octree partitioning, which enables faster neighborhood computation than Kd-trees or K-NN for large point clouds.
Visualization shows that feature representations become coarser and more distinctive across layers, with learned spherical kernels capturing meaningful spatial patterns.
The ablation study confirms that data augmentation improves performance, with 93.2% and 89.7% accuracy on ModelNet10 and ModelNet40, respectively, when using 50K-point input clouds.
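
To make the octree-based findings above more concrete, here is a minimal Python sketch of how points can be grouped into octree cells by integer keys, so that the points sharing a cell or a parent cell are found by a dictionary lookup rather than a K-NN or range search. The helper names octree_cells, parent, and coarsen, the unit-cube bounds, and the dictionary representation are assumptions for illustration only; the paper's actual data structure may differ.

```python
from collections import defaultdict

def octree_cells(points, depth, lo=0.0, hi=1.0):
    """Group point indices by the octree cell they occupy at `depth`.

    Points are assumed to lie in the cube [lo, hi]^3 (an assumption of
    this sketch). Only occupied cells are stored, which keeps the
    structure sparse.
    """
    n = 1 << depth                 # number of cells per axis
    scale = n / (hi - lo)
    cells = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple(min(max(int((c - lo) * scale), 0), n - 1) for c in p)
        cells[key].append(i)
    return cells

def parent(key):
    """Key of the enclosing cell one level up (halve each coordinate)."""
    return tuple(k >> 1 for k in key)

def coarsen(cells):
    """Merge occupied cells into their parents, one level coarser."""
    parents = defaultdict(list)
    for key, indices in cells.items():
        parents[parent(key)].extend(indices)
    return parents
```

Applying coarsen repeatedly yields a hierarchy in which each occupied cell could stand for one neuron of the next layer, mirroring the layer-by-layer coarsening described in the review.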

This review was created by AI and reviewed by human editors.