QUICK REVIEW

[Paper Review] PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition

Haoxuan You, Yifan Feng|arXiv (Cornell University)|Aug 23, 2018

3D Shape Modeling and Analysis | References: 30 | Cited by: 33

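One-Sentence Summary

PVNet is a joint convolutional network that combines point cloud and multi-view representations for 3D shape recognition, using an embedding attention fusion scheme in which high-level global features from the multi-view data enhance the learning of local structure features in the point cloud. On the ModelNet40 dataset, the method achieves state-of-the-art performance in both 3D shape classification and retrieval.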

ABSTRACT

3D object recognition has attracted wide research attention in the field of multimedia and computer vision. With the recent proliferation of deep learning, various deep models with different representations have achieved the state-of-the-art performance. Among them, point cloud and multi-view based 3D shape representations are promising recently, and their corresponding deep models have shown significant performance on 3D shape recognition. However, there is little effort concentrating point cloud data and multi-view data for 3D shape representation, which is, in our consideration, beneficial and compensated to each other. In this paper, we propose the Point-View Network (PVNet), the first framework integrating both the point cloud and the multi-view data towards joint 3D shape recognition. More specifically, an embedding attention fusion scheme is proposed that could employ high-level features from the multi-view data to model the intrinsic correlation and discriminability of different structure features from the point cloud data. In particular, the discriminative descriptions are quantified and leveraged as the soft attention mask to further refine the structure feature of the 3D shape. We have evaluated the proposed method on the ModelNet40 dataset for 3D shape classification and retrieval tasks. Experimental results and comparisons with state-of-the-art methods demonstrate that our framework can achieve superior performance.

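Motivation and Objectives

Address the limitation that existing 3D shape recognition models process point clouds and multi-view data separately, even though the two representations have complementary strengths.
Investigate how high-level global features from a multi-view network can improve local feature learning in a point-cloud-based model.
Design a unified framework that uses both representations jointly to improve 3D shape recognition performance.
Develop a learnable attention mechanism that adaptively weights local structure features according to the global context of the multi-view input.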

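Proposed Method

The framework contains a point cloud branch that uses a spatial transformer network and EdgeConv to extract local geometric features from unordered point clouds.
A multi-view branch uses a weight-sharing convolutional neural network (MVCNN) with a view-pooling operation to produce a global feature from 12 predefined camera viewpoints.
An embedding network projects the global multi-view feature into the subspace of the point cloud features to enable cross-modal fusion.
An attention fusion module fuses the embedded global feature with the local point cloud features to generate a soft attention mask that adaptively emphasizes discriminative local structures.
The attention mask is applied to the point cloud features in a residual manner, strengthening discriminative features and suppressing irrelevant ones.
The final features of the two branches are concatenated and fed into fully connected layers for the classification and retrieval tasks.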
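
The steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`view_pool`, `attention_fusion`), the single-layer embedding and mask projections, the random weights, and all dimensions except the 12 views are assumptions chosen for illustration, and the two backbone networks are replaced by random feature arrays.

```python
import numpy as np

def view_pool(view_feats):
    """Element-wise max over views: (V, D) -> (D,)."""
    return view_feats.max(axis=0)

def attention_fusion(point_feats, global_feat, W_embed, W_mask):
    """Refine per-point features with a soft mask driven by the global feature.

    point_feats: (N, C) local features from the point cloud branch
    global_feat: (D,)   pooled feature from the multi-view branch
    W_embed:     (D, C) hypothetical single-layer embedding weights
    W_mask:      (2C, C) hypothetical single-layer mask weights
    """
    # Project the global feature into the point-feature subspace (with ReLU).
    embedded = np.maximum(global_feat @ W_embed, 0.0)          # (C,)
    # Repeat it for every point and concatenate with the local features.
    tiled = np.broadcast_to(embedded, point_feats.shape)       # (N, C)
    fused = np.concatenate([point_feats, tiled], axis=1)       # (N, 2C)
    # A sigmoid squashes the fused response into a soft mask in (0, 1).
    mask = 1.0 / (1.0 + np.exp(-(fused @ W_mask)))             # (N, C)
    # Residual application: the original feature is kept and re-weighted.
    return point_feats * (1.0 + mask), mask

rng = np.random.default_rng(0)
V, D, N, C = 12, 32, 64, 16              # only V = 12 comes from the text
view_feats = rng.normal(size=(V, D))     # stand-in for per-view CNN features
point_feats = rng.normal(size=(N, C))    # stand-in for EdgeConv features
W_embed = rng.normal(size=(D, C)) * 0.1
W_mask = rng.normal(size=(2 * C, C)) * 0.1

global_feat = view_pool(view_feats)
refined, mask = attention_fusion(point_feats, global_feat, W_embed, W_mask)

# Pool the refined point features and concatenate with the global feature;
# in the described pipeline this vector would feed the fully connected layers.
shape_descriptor = np.concatenate([refined.max(axis=0), global_feat])
```

The residual form `point_feats * (1 + mask)` means a mask value near zero leaves a feature unchanged rather than erasing it, which is one common reading of applying a mask "in a residual manner".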

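Experimental Results

Research Questions

RQ1: Can high-level global features from the multi-view representation improve local feature learning in point-cloud-based 3D shape recognition?
RQ2: How can point cloud and multi-view data be fused effectively to exploit their complementary strengths for 3D shape representation?
RQ3: Can an attention mechanism based on embedded global features strengthen the discriminative power of local point cloud features?
RQ4: Does joint learning on point cloud and multi-view data outperform single-modality methods on 3D shape classification and retrieval?

Key Findings

PVNet achieves state-of-the-art performance on 3D shape classification on the ModelNet40 dataset, outperforming existing models that use only point clouds or only multi-view data.
The proposed embedding attention fusion scheme markedly improves feature discriminability by adaptively weighting local structure features according to the global context.
Ablation experiments confirm that both the attention fusion scheme and the joint learning of point cloud and multi-view data contribute to the performance gain.
The method remains robust and generalizes across different backbone architectures for the point cloud and multi-view branches.
The framework also performs strongly on the retrieval task, indicating that it learns compact and discriminative 3D shape representations.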

This review was generated by AI and reviewed by human editors.