QUICK REVIEW

[Paper Digest] NPNet: A Non-Parametric Network with Adaptive Gaussian-Fourier Positional Encoding for 3D Classification and Segmentation

Mohammad Saeid, Amir Salarpour|arXiv (Cornell University)|Jan 31, 2026

3D Shape Modeling and Analysis | Cited by 0

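One-Sentence Summary

NPNet is a fully non-parametric framework for 3D point-cloud classification and segmentation. It uses an adaptive Gaussian-Fourier positional encoding and memory-bank inference to deliver competitive efficiency and strong few-shot performance without any learned weights.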

ABSTRACT

We present NPNet, a fully non-parametric approach for 3D point-cloud classification and part segmentation. NPNet contains no learned weights; instead, it builds point features using deterministic operators such as farthest point sampling, k-nearest neighbors, and pooling. Our key idea is an adaptive Gaussian-Fourier positional encoding whose bandwidth and Gaussian-cosine mixing are chosen from the input geometry, helping the method remain stable across different scales and sampling densities. For segmentation, we additionally incorporate fixed-frequency Fourier features to provide global context alongside the adaptive encoding. Across ModelNet40/ModelNet-R, ScanObjectNN, and ShapeNetPart, NPNet achieves strong performance among non-parametric baselines, and it is particularly effective in few-shot settings on ModelNet40. NPNet also offers favorable memory use and inference time compared to prior non-parametric methods.

Motivation and Objectives

Develop a training-free non-parametric architecture for 3D point-cloud classification and segmentation.
Introduce adaptive Gaussian–Fourier positional encoding that adapts to input geometry.
Augment segmentation with fixed-frequency Fourier features to provide global context.
Demonstrate accuracy and efficiency that are competitive with non-parametric baselines, while remaining competitive with parametric models.
Assess few-shot performance and deployment implications of a training-free pipeline.

Proposed Method

Use deterministic geometric operators (farthest point sampling, k-NN grouping, pooling) to build multi-scale point features without learned weights.
Propose an adaptive Gaussian–Fourier encoding that selects its bandwidth and Gaussian–cosine mixing from input statistics ($\sigma_g$), with a blending parameter $\lambda$.
For segmentation, add fixed-frequency Fourier features to form a hybrid position encoding for global context.
Encode training shapes into a memory bank and perform similarity-based inference for classification; for segmentation, use part prototypes and nearest-prototype matching.
Inference is training-free: the memory banks are built once from the training shapes and then queried by similarity, using nearest-prototype style matching.
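The memory-bank classification step described above can be illustrated with a small sketch. It assumes each shape has already been reduced to a fixed-length descriptor, and it scores classes by a cosine-similarity-weighted vote. The function names, the sharpening factor `gamma`, and the toy descriptors are hypothetical; the digest only states that inference is similarity-based, so this is one plausible form rather than NPNet's exact rule.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def build_memory_bank(features, labels):
    """Store normalized training descriptors with their labels. Nothing is trained."""
    return [(l2_normalize(f), y) for f, y in zip(features, labels)]

def classify(query, bank, num_classes, gamma=10.0):
    """Similarity-weighted vote over the whole bank.

    gamma is a hypothetical sharpening factor (an assumption, not from the
    paper): larger values push the vote toward a 1-nearest-neighbor decision.
    """
    q = l2_normalize(query)
    scores = [0.0] * num_classes
    for feat, label in bank:
        cos = sum(a * b for a, b in zip(q, feat))
        scores[label] += math.exp(-gamma * (1.0 - cos))
    return max(range(num_classes), key=scores.__getitem__)

# Toy 3-D descriptors standing in for real shape features.
bank = build_memory_bank(
    features=[[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.2]],
    labels=[0, 0, 1, 1],
)
pred = classify([0.8, 0.2, 0.0], bank, num_classes=2)
```

For segmentation, the same idea would apply per point, with the bank holding one prototype per part instead of one descriptor per shape.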

Figure 2: Adaptive Gaussian–Fourier positional encoding. The encoding adapts bandwidth $\sigma$ and mixing coefficient $\lambda$ from input geometry; an additional fixed-frequency Fourier branch provides global context for segmentation.
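To make the idea of an input-adaptive encoding concrete, here is a minimal sketch. The digest does not give the formulas, so every specific choice below is an illustrative assumption rather than the paper's definition: the standard deviation of centered coordinates as the geometry statistic, the mapping from that statistic to the mixing coefficient (with a hypothetical constant `tau`), and the evenly spaced reference centers.

```python
import math

def global_sigma(points):
    """Assumed geometry statistic: std of all centered coordinates."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    var = sum((p[i] - centroid[i]) ** 2 for p in points for i in range(3)) / (3 * n)
    return math.sqrt(var)

def adaptive_encoding(x, sigma_g, num_centers=4, tau=0.5):
    """Blend Gaussian and cosine responses for one scalar coordinate x."""
    sigma = max(sigma_g, 1e-6)        # bandwidth taken from the input
    lam = sigma / (sigma + tau)       # hypothetical mapping into (0, 1)
    centers = [-1.0 + 2.0 * j / (num_centers - 1) for j in range(num_centers)]
    feats = []
    for mu in centers:
        d = (x - mu) / sigma
        gauss = math.exp(-0.5 * d * d)   # localized response
        cosine = math.cos(d)             # periodic response
        feats.append(lam * gauss + (1.0 - lam) * cosine)
    return feats

cloud = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sigma_g = global_sigma(cloud)
code = adaptive_encoding(0.3, sigma_g)
```

Because `sigma_g` is recomputed per input, a uniformly rescaled cloud yields a proportionally rescaled bandwidth, which is the kind of behavior the adaptive branch is meant to provide. The fixed-frequency Fourier branch used for segmentation would instead apply sines and cosines at preset frequencies that do not depend on the input.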

Experimental Results

Research Questions

RQ1: Can a fully non-parametric pipeline match or exceed parametric methods on standard 3D point-cloud benchmarks?
RQ2: Does an input-adaptive Gaussian–Fourier positional encoding improve stability and transfer across varying densities and scales?
RQ3: What is the impact of fixed-frequency Fourier features on segmentation performance and global context?
RQ4: What are the memory, time, and computational trade-offs of NPNet compared to prior non-parametric methods and parametric networks, especially in few-shot settings?

Key Findings

On ModelNet40, NPNet achieves 85.45% accuracy with 0.0M parameters and 0.0 GFLOPs.
On ModelNet-R, NPNet achieves 85.65% accuracy with 0.0M parameters and 0.0 GFLOPs.
On ScanObjectNN, NPNet attains 86.1% on OBJ-BG, 86.1% on OBJ-ONLY, and 84.9% on PB-T50-RS, leading the non-parametric baselines on OBJ-BG and OBJ-ONLY.
On ShapeNetPart, NPNet achieves 73.56% instance mIoU with the hybrid encoding.
In few-shot ModelNet40, NPNet achieves 92.0% (5-way 10-shot), 93.2% (5-way 20-shot), 82.5% (10-way 10-shot), and 87.6% (10-way 20-shot).
Reported efficiency: on ModelNet40, 0.0021 GFLOPs, 99.1 MB of memory, and 3.86 ms/sample; on ShapeNetPart, 0.0045 GFLOPs, 256.4 MB, and 5.63 ms/sample.

Figure 3: Stage block used in NPNet. FPS selects centroids, $k$-NN groups local neighborhoods, positional encoding modulates features, and mean/max pooling produces a stage descriptor; concatenating stages forms a multi-scale representation.
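The stage block in the caption can be sketched with the standard library alone. Farthest point sampling and brute-force k-NN below are the textbook algorithms; the `encode` hook, the use of relative coordinates as raw features, and the concatenation of mean and max pooling per centroid are assumptions made for illustration, since the digest does not specify how the positional encoding modulates features.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_sampling(points, m, start=0):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set."""
    chosen = [start]
    min_d = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=min_d.__getitem__)
        chosen.append(nxt)
        min_d = [min(d, sq_dist(p, points[nxt])) for d, p in zip(min_d, points)]
    return chosen

def knn(points, center, k):
    """Indices of the k points nearest to center (brute force)."""
    return sorted(range(len(points)), key=lambda i: sq_dist(points[i], center))[:k]

def stage(points, m, k, encode=lambda rel: rel):
    """One stage: FPS centroids, k-NN groups, encode, then mean and max pool.

    encode is a placeholder for the positional encoding (identity by default).
    Returns one descriptor per centroid: mean-pooled features followed by
    max-pooled features.
    """
    descriptors = []
    for c in farthest_point_sampling(points, m):
        neighbors = knn(points, points[c], k)
        # Relative coordinates of each neighbor with respect to its centroid.
        group = [[p - q for p, q in zip(points[i], points[c])] for i in neighbors]
        feats = [encode(rel) for rel in group]
        dim = len(feats[0])
        mean = [sum(f[d] for f in feats) / k for d in range(dim)]
        mx = [max(f[d] for f in feats) for d in range(dim)]
        descriptors.append(mean + mx)
    return descriptors

pts = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 2.0, 0.0]]
out = stage(pts, m=2, k=2)
```

Running several such stages with decreasing `m` and concatenating their pooled outputs would give the multi-scale representation the caption describes.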


This digest was generated by AI and reviewed by a human editor.