QUICK REVIEW

[Paper Review] Dynamic Graph CNN for Learning on Point Clouds

Yue Wang, Yongbin Sun|arXiv (Cornell University)|Jan 24, 2018

3D Shape Modeling and Analysis|References: 83|Cited by: 288

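One-Sentence Summary

Introduces EdgeConv, a dynamic graph-based convolution for point clouds that updates the neighborhood graph at every layer, achieving state-of-the-art results on classification and segmentation tasks on ModelNet40 and ShapeNetPart.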

ABSTRACT

Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insights from CNNs to the point cloud world. Point clouds inherently lack topological information, so designing a model to recover topology can enrich their representational power. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: it incorporates local neighborhood information; it can be stacked to learn global shape properties; and in multi-layer systems, affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks including ModelNet40, ShapeNetPart, and S3DIS.

Motivation and Objectives

Develop a neural network module that captures local geometric structure in unordered point clouds.
Ensure permutation invariance while leveraging neighborhood information.
Enable dynamic graph updates across layers to learn long-range semantic relationships.
Demonstrate EdgeConv’s applicability by achieving state-of-the-art results on benchmark datasets.
Provide analysis and open-source code to facilitate reproducibility.

Proposed Method

Define EdgeConv as an edge-based feature aggregation where e_{ij} = h_{\theta}(x_i, x_j) and x_i' = Agg_{j:(i,j) \in E} e_{ij}, with Agg being a symmetric function (e.g., max or sum).
Construct a k-NN graph G per layer in feature space and recompute neighbors after each EdgeConv layer (dynamic graph).
Use a learnable MLP h_{\theta} combining x_i and (x_j - x_i) to encode local geometry.
Maintain permutation invariance via symmetric pooling across neighborhood edges and optional self-loops.
Integrate EdgeConv into PointNet-like architectures to form a Dynamic Graph CNN (DGCNN) for classification and segmentation.
Compare EdgeConv variants and analyze effects of centralization, dynamic graphs, and number of neighbors.
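
The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation. It assumes a single linear layer followed by ReLU as h_{\theta}, channel-wise max as the aggregation, and Euclidean k-NN without self-loops. The helper names `knn_indices`, `edge_conv`, and `random_layer` are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def knn_indices(feats, k):
    """For each point, return the indices of its k nearest neighbors.

    Distances are Euclidean in whatever space `feats` lives in, so calling
    this on the output of a previous layer yields a feature-space graph.
    """
    neighbors = []
    for i, xi in enumerate(feats):
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(feats) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def edge_conv(feats, weight, bias, k):
    """One EdgeConv layer: x_i' = max_j ReLU(W [x_i, x_j - x_i] + b)."""
    graph = knn_indices(feats, k)  # rebuilt on every call: the "dynamic" graph
    out = []
    for i, xi in enumerate(feats):
        edge_feats = []
        for j in graph[i]:
            # Edge input: the center point and the offset to its neighbor.
            edge_in = list(xi) + [b - a for a, b in zip(xi, feats[j])]
            edge_feats.append([
                max(0.0, sum(w * v for w, v in zip(row, edge_in)) + b_out)
                for row, b_out in zip(weight, bias)
            ])
        # Channel-wise max over the neighborhood is order-independent.
        out.append([max(channel) for channel in zip(*edge_feats)])
    return out


def random_layer(in_dim, out_dim, rng):
    """Hypothetical helper: random weights for h_theta (input is 2 * in_dim)."""
    weight = [[rng.gauss(0, 1) for _ in range(2 * in_dim)] for _ in range(out_dim)]
    bias = [rng.gauss(0, 1) for _ in range(out_dim)]
    return weight, bias


rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(32)]  # 32 points in 3D
w1, b1 = random_layer(3, 8, rng)
w2, b2 = random_layer(8, 16, rng)

h1 = edge_conv(points, w1, b1, k=4)  # neighbors found in xyz space
h2 = edge_conv(h1, w2, b2, k=4)      # neighbors re-found in 8-D feature space
print(len(h2), len(h2[0]))           # 32 16
```

Because the second call rebuilds the k-NN graph from `h1`, two points that are far apart in xyz space can become neighbors if their learned features are similar. Shuffling the input points only shuffles the output rows in the same way, since each row depends on its neighbor set and not on the order of the input.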

Experimental Results

Research Questions

RQ1: How can the local geometric structure of point clouds be captured without losing permutation invariance?
RQ2: Does updating the neighborhood graph dynamically across layers improve learning of global and semantic structure in point clouds?
RQ3: Can EdgeConv be integrated into existing point cloud pipelines to improve classification and segmentation performance?
RQ4: What are the effects of graph hyperparameters (k, centralization) on performance and robustness to density variations?

Key Findings

Model	Mean class accuracy (%)	Overall accuracy (%)
3DShapeNets [Wu et al., 2015]	77.3	84.7
VoxNet [Maturana & Scherer, 2015]	83.0	85.9
Subvolume [Qi et al., 2016]	86.0	89.2
VRN (single view) [Brock et al., 2016]	-	-
VRN (multi-view) [Brock et al., 2016]	-	-
ECC [Simonovsky & Komodakis, 2017]	83.2	87.4
PointNet [Qi et al., 2017b]	86.0	89.2
PointNet++ [Qi et al., 2017c]	-	90.7
Kd-net [Klokov & Lempitsky, 2017]	-	90.6
PointCNN [Li et al., 2018a]	-	92.2
PCNN [Atzmon et al., 2018]	-	92.3
Ours (baseline)	88.9	91.7
Ours	90.2	92.9
Ours (2048 points)	90.7	93.5
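
The two accuracy columns average differently, which is why they can diverge on an imbalanced dataset. The sketch below assumes the standard definitions: overall accuracy is the fraction of correctly classified shapes, and mean class accuracy is the unweighted mean of per-class accuracies. The function names and the toy labels are illustrative only and are not taken from the paper.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_class_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies (each class counts equally)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example: 8 chairs (all correct) and 2 lamps (one misclassified as a chair).
y_true = ["chair"] * 8 + ["lamp"] * 2
y_pred = ["chair"] * 8 + ["chair", "lamp"]

print(overall_accuracy(y_true, y_pred))     # 0.9
print(mean_class_accuracy(y_true, y_pred))  # 0.75
```

A single error in the small class costs ten points of overall accuracy here but twenty-five points of mean class accuracy, so the second column is the stricter measure when classes are unevenly populated.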

EdgeConv effectively captures local geometry and, when stacked, learns global shape properties.
Dynamic graph updates per layer yield improved performance over fixed graphs, achieving state-of-the-art results on ModelNet40 (classification) and S3DIS (segmentation).
On ModelNet40, the authors achieve 90.2% mean class accuracy and 92.9% overall accuracy with 1024 points, and 90.7% / 93.5% with 2048 points.
Their fixed-graph baseline outperforms PointNet++ by about 1.0% in overall accuracy (91.7% vs. 90.7%), while the dynamic-graph version surpasses both PointNet++ and prior methods by larger margins.
The model demonstrates robustness to point dropouts and benefits from centralization and using more points, with performance gains when k is properly chosen (e.g., k=20–40).
The approach yields strong results on ShapeNet Part for part segmentation and scales to 3D shapes from various categories.


This review was generated by AI and reviewed by human editors.