QUICK REVIEW

[Paper Review] Canonical Capsules: Self-Supervised Capsules in Canonical Pose

Weiwei Sun, Andrea Tagliasacchi|arXiv (Cornell University)|Dec 8, 2020

3D Shape Modeling and Analysis | Cited by 36

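One-Sentence Summary

Proposes a self-supervised capsule architecture for 3D point clouds that learns a canonical frame and a semantically consistent part decomposition without labels, improving reconstruction, canonicalization, and unsupervised classification.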

ABSTRACT

We propose a self-supervised capsule architecture for 3D point clouds. We compute capsule decompositions of objects through permutation-equivariant attention, and self-supervise the process by training with pairs of randomly rotated objects. Our key idea is to aggregate the attention masks into semantic keypoints, and use these to supervise a decomposition that satisfies the capsule invariance/equivariance properties. This not only enables the training of a semantically consistent decomposition, but also allows us to learn a canonicalization operation that enables object-centric reasoning. To train our neural network we require neither classification labels nor manually-aligned training datasets. Yet, by learning an object-centric representation in a self-supervised manner, our method outperforms the state-of-the-art on 3D point cloud reconstruction, canonicalization, and unsupervised classification.

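Motivation and Objectives

Enable unsupervised learning on 3D point clouds without pre-aligned datasets.
Develop a permutation-equivariant capsule decomposition via attention.
Learn a canonical frame to enable object-centric reasoning.
Train end-to-end on Siamese pairs of randomly rotated objects.
Demonstrate state-of-the-art performance on 3D auto-encoding, canonicalization, and classification.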

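Proposed Method

Decompose the point cloud into K parts with a permutation-equivariant capsule encoder E.
Aggregate the attention masks to obtain K capsule poses theta_k and descriptors beta_k (Eq. 2).
Use a regression network, also denoted K, to regress the canonical capsule poses bar{theta} from the descriptors; enforce locality.
Canonicalize by solving a rigid alignment between the predicted poses and the learned canonical keypoints bar{theta} (Eq. 5).
Decode a point cloud per capsule in the canonical frame and reconstruct the input (Eq. 4), with Chamfer distance as the reconstruction loss (Eq. 11).
Train on Siamese pairs of randomly rotated/translated shapes to enforce equivariance/invariance (loss terms L_equivariance, L_invariance, L_equilibrium, L_localization, L_canonical).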
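
The aggregation and canonicalization steps above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation. It assumes that Eq. 2 is an attention-weighted average over points and that the rigid alignment of Eq. 5 can be solved in closed form with the Kabsch (orthogonal Procrustes) algorithm. The function names `aggregate_capsules` and `kabsch` are hypothetical.

```python
import numpy as np


def aggregate_capsules(points, features, attention):
    """Attention-weighted averages per capsule (assumed form of Eq. 2).

    points:    (P, 3) input point cloud
    features:  (P, C) per-point features
    attention: (P, K) non-negative attention masks, one column per capsule
    Returns keypoints theta (K, 3) and descriptors beta (K, C).
    """
    # Normalize each capsule's mask so its weights sum to 1 over the points.
    weights = attention / attention.sum(axis=0, keepdims=True)
    theta = weights.T @ points
    beta = weights.T @ features
    return theta, beta


def kabsch(source, target):
    """Least-squares rigid transform (R, t) such that R @ source_i + t ~= target_i."""
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t
```

Because each keypoint is a convex combination of the input points, applying a rigid transform to the cloud moves the keypoints by the same transform as long as the attention masks stay unchanged. That is the equivariance/invariance behavior the Siamese training is meant to encourage. Given keypoints from a transformed copy, `kabsch` recovers the transform, which is how a set of canonical keypoints could be used to bring a shape into the canonical frame.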

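Experimental Results

Research Questions

RQ1: Can the self-supervised capsule decomposition produce semantically consistent parts on unaligned 3D point clouds?
RQ2: Does learning a canonical frame improve object-centric representations for reconstruction and downstream tasks?
RQ3: How does the proposed canonicalization interact with transformation invariance/equivariance to enable unsupervised classification?
RQ4: How do the individual loss terms affect reconstruction quality and canonicalization stability?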

Key Findings

Method	Airplane (Aligned)	Chair (Aligned)	Multi (Aligned)	Airplane (Unaligned)	Chair (Unaligned)	Multi (Unaligned)
3D-PointCapsNet [64]	1.94	3.30	2.49	5.58	7.57	4.66
AtlasNetV2 [12]	1.28	2.36	2.14	2.80	3.98	3.08
Our method	0.96	1.99	1.76	1.11	2.58	2.22
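
The values in this table are Chamfer distances between input and reconstructed point clouds (lower is better). As a rough sketch of the metric, the function below computes a symmetric Chamfer distance using squared Euclidean distances averaged over each set. This convention is an assumption: the summary does not state the exact normalization or scaling behind the reported numbers, so the outputs are not directly comparable to the table.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed convention: squared Euclidean nearest-neighbor distances,
    averaged within each set, then summed over both directions.
    """
    # Pairwise squared distances, shape (N, M), via broadcasting.
    diff = a[:, None, :] - b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    a_to_b = sq_dist.min(axis=1).mean()  # each point in a to its nearest in b
    b_to_a = sq_dist.min(axis=0).mean()  # each point in b to its nearest in a
    return a_to_b + b_to_a
```

The brute-force pairwise matrix uses O(N*M) memory, which is fine for clouds of a few thousand points; larger clouds would typically use a spatial index such as a k-d tree instead.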

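Achieves state-of-the-art auto-encoding, measured by Chamfer distance, on both aligned and unaligned ShapeNet data (our method on Airplane/Chair/Multi: 0.96, 1.99, 1.76 aligned; 1.11, 2.58, 2.22 unaligned).
Learns a canonical frame that enables semantically consistent decomposition and finer reconstruction detail (e.g., wings, engines).
Performs competitively or strongly on canonicalization and pairwise registration, with a stability metric (mStd) better than several baselines.
Features learned by Canonical Capsules perform better on unsupervised classification (top-1 accuracy with an SVM: 94.21% aligned; 87.33% unaligned).
Ablations indicate that the equivariance, invariance, and canonical losses are critical to maintaining reconstruction and canonicalization quality.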


This review was generated by AI and checked by a human editor.