QUICK REVIEW

[Paper Review] PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation

Mingyang Jiang, Yiran Wu | arXiv (Cornell University) | Jul 2, 2018
3D Shape Modeling and Analysis | 29 references | 341 citations
TL;DR

PointSIFT adds orientation-encoding units and multi-scale feature representations to PointNet-based architectures, improving 3D semantic segmentation accuracy on the ScanNet and S3DIS benchmarks.

ABSTRACT

Recently, 3D understanding research sheds light on extracting features from point cloud directly, which requires effective shape pattern description of point clouds. Inspired by the outstanding 2D shape descriptor SIFT, we design a module called PointSIFT that encodes information of different orientations and is adaptive to scale of shape. Specifically, an orientation-encoding unit is designed to describe eight crucial orientations, and multi-scale representation is achieved by stacking several orientation-encoding units. PointSIFT module can be integrated into various PointNet-based architecture to improve the representation ability. Extensive experiments show our PointSIFT-based framework outperforms state-of-the-art method on standard benchmark datasets. The code and trained model will be published accompanied by this paper.

Motivation & Objective

  • Enable direct understanding of 3D point clouds through robust local descriptors inspired by SIFT.
  • Propose the PointSIFT module, which encodes orientation information across eight directions.
  • Achieve scale awareness by stacking orientation-encoding units into a multi-scale representation.
  • Integrate PointSIFT into a PointNet++-based encoder-decoder framework to improve segmentation performance.

Proposed method

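  • Introduce an Orientation-Encoding (OE) unit that aggregates features from eight spatial orientations: a stacked 8-neighborhood (S8N) search finds the nearest neighbor in each of the eight octants around a point, and a 3-stage orientation-encoding convolution then merges those features along the x, y, and z axes in turn.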
  • Stack multiple OE units to create a multi-scale PointSIFT module with shortcut connections for scale-aware feature fusion.
  • Embed PointSIFT modules between set abstraction (SA) and feature propagation (FP) layers in a PointNet++-like encoder-decoder architecture.
  • Use FP-shortcuts to connect corresponding SA and FP layers to preserve low-level information and speed up convergence.
  • Train end-to-end on raw point clouds with MLPs for initial feature embedding and perform downsampling (SA) and upsampling (FP) with PointSIFT interleaved.
  • Demonstrate improvements over state-of-the-art baselines on S3DIS and ScanNet semantic segmentation benchmarks.
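
The octant search and three-stage encoding described in the bullets above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`octant_of`, `s8n_search`, `orientation_encode`), the brute-force search, the self-index fallback for empty octants, and the omission of batch normalization and biases are all assumptions made for brevity.

```python
import math

def octant_of(dx, dy, dz):
    """Octant id in 0..7 built from the signs of the offset (x is the high bit)."""
    return (dx >= 0) * 4 + (dy >= 0) * 2 + (dz >= 0)

def s8n_search(points, radius):
    """For each point, return the index of its nearest neighbor in each octant.

    Assumption: an octant with no neighbor within `radius` falls back to the
    point's own index.
    """
    neighbors = []
    for i, p in enumerate(points):
        best = [(math.inf, i)] * 8  # (distance, index) per octant, self as fallback
        for j, q in enumerate(points):
            if j == i:
                continue
            d = math.dist(p, q)
            if d > radius:
                continue
            o = octant_of(q[0] - p[0], q[1] - p[1], q[2] - p[2])
            if d < best[o][0]:
                best[o] = (d, j)
        neighbors.append([j for _, j in best])
    return neighbors

def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

def combine(a, b, w_pair):
    """Merge two feature vectors with one weight matrix each, followed by ReLU."""
    left, right = matvec(w_pair[0], a), matvec(w_pair[1], b)
    return [max(0.0, s + t) for s, t in zip(left, right)]

def orientation_encode(features, neighbor_row, w_x, w_y, w_z):
    """Reduce eight octant features to one vector in three stages (x, then y, then z)."""
    v = [features[j] for j in neighbor_row]               # 8 vectors, id = 4x + 2y + z
    v = [combine(v[k], v[k + 4], w_x) for k in range(4)]  # merge along x, id = 2y + z
    v = [combine(v[k], v[k + 2], w_y) for k in range(2)]  # merge along y, id = z
    return combine(v[0], v[1], w_z)                       # merge along z

# Toy input: the origin plus one point in each of the eight octants.
cube = [(0.0, 0.0, 0.0)] + [
    (sx * 0.1, sy * 0.1, sz * 0.1)
    for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)
]
print(s8n_search(cube, radius=0.5)[0])  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Unlike a plain k-nearest-neighbor query, which can return neighbors that all lie on one side of a point, the per-octant search guarantees one candidate from every direction. The three-stage reduction also keeps the axis order explicit rather than pooling all eight features symmetrically.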

Experimental results

Research questions

  • RQ1: Can a SIFT-inspired, orientation-encoded descriptor improve 3D point cloud segmentation when integrated into PointNet-based architectures?
  • RQ2: Does multi-scale, orientation-aware descriptor learning enhance robustness to varying object and scene scales in 3D data?
  • RQ3: Do FP-shortcuts and PointSIFT modules lead to faster convergence and higher segmentation accuracy compared to standard PointNet++ pipelines?
  • RQ4: How does PointSIFT perform on standard 3D semantic segmentation benchmarks (ScanNet, S3DIS) relative to state-of-the-art methods?

Key findings

  • PointSIFT outperforms state-of-the-art methods on the ScanNet and S3DIS benchmarks, with relative mean IoU improvements of 8.4% on ScanNet and 12% on S3DIS.
  • OE units effectively encode eight orientations and, when stacked, provide multi-scale local descriptors that improve segmentation performance.
  • Inserting PointSIFT modules between SA and FP layers with FP-shortcuts accelerates convergence and preserves low-level information, improving overall accuracy.
  • Compared to baselines such as PointNet++, PointSIFT-enhanced networks achieve higher per-voxel accuracy and mean IoU on ScanNet (86.2% and 41.5%, respectively) and higher overall accuracy and mean IoU on S3DIS (88.72% and 70.23%, respectively).
  • A toy scale-awareness experiment shows that ~89% of activations align with the input shape scale, indicating that the model learns scale-aware representations.
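
For readers unfamiliar with the metrics quoted above, the following sketch shows how overall accuracy, mean IoU, and a relative improvement are typically computed from per-point labels. The helper names and the toy labels are hypothetical and are not taken from the paper; the ScanNet figures are reported per voxel, whereas this example simply counts individual labels.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts items with true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def overall_accuracy(cm):
    """Fraction of items whose predicted class matches the true class."""
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(row[c] for row in cm) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, e.g. 0.084 for an 8.4% gain."""
    return (new - baseline) / baseline

# Toy labels for three classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(round(overall_accuracy(cm), 3))  # 0.667
print(round(mean_iou(cm), 3))          # 0.5
```

A relative improvement is a ratio of scores rather than a difference in percentage points, so the 8.4% and 12% figures should not be read as absolute mean IoU gains.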

This review was created by AI and reviewed by human editors.