QUICK REVIEW

[Paper Review] 3DILG: Irregular Latent Grids for 3D Generative Modeling

Biao Zhang, Matthias Nießner|arXiv (Cornell University)|May 27, 2022

3D Shape Modeling and Analysis | Cited by 22

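ONE-SENTENCE SUMMARY

Introduces irregular latent grids for neural fields, enabling scalable, transformer-compatible 3D shape reconstruction and probabilistic generation, and reports state-of-the-art results on reconstruction and several conditional generation tasks.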

ABSTRACT

We propose a new representation for encoding 3D shapes as neural fields. The representation is designed to be compatible with the transformer architecture and to benefit both shape reconstruction and shape generation. Existing works on neural fields are grid-based representations with latents defined on a regular grid. In contrast, we define latents on irregular grids, enabling our representation to be sparse and adaptive. In the context of shape reconstruction from point clouds, our shape representation built on irregular grids improves upon grid-based methods in terms of reconstruction accuracy. For shape generation, our representation promotes high-quality shape generation using auto-regressive probabilistic models. We show different applications that improve over the current state of the art. First, we show results for probabilistic shape reconstruction from a single higher resolution image. Second, we train a probabilistic model conditioned on very low resolution images. Third, we apply our model to category-conditioned generation. All probabilistic experiments confirm that we are able to generate detailed and high quality shapes to yield the new state of the art in generative 3D shape modeling.

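Research Motivation and Goals

Design a latent representation for 3D shapes that is compatible with the transformer architecture and supports both reconstruction and generation.
Develop sparse, adaptive irregular latent grids that avoid the limitations of regular grids.
Demonstrate improved shape reconstruction from point clouds, as well as state-of-the-art probabilistic 3D shape generation under several kinds of conditioning.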

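Proposed Method

Represent a shape as a fixed-length sequence of latent tuples (x_i, z_i), where x_i is a 3D position and z_i is the latent vector attached to that position on the irregular grid.
Process each patch of neighboring points with a Mini-PointNet-style embedding to produce the embedding e_i of the i-th patch.
Apply a transformer to the sequence of (e_i, positional embedding p_i) pairs to learn the local latent vectors z_i.
Optionally apply vector quantization with a dictionary D to discretize the intermediate latent vectors for autoregressive modeling.
Interpolate a latent vector z_x for an arbitrary query point x using kernel regression (the Nadaraya-Watson estimator), then decode it with a multilayer perceptron to obtain the occupancy probability O(x).
Provide autoregressive and bidirectional transformer strategies that generate z_i conditioned on coordinates or other conditioning signals, supporting either unidirectional token-by-token generation or blockwise bidirectional sampling.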
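The interpolation and decoding step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the sharpness parameter beta, the function names, and the choice to feed the decoder the concatenation of x and z_x are all assumptions made for the example, and the decoder weights are placeholders that would be learned in practice.

```python
import numpy as np

def nadaraya_watson(query, positions, latents, beta=10.0):
    """Interpolate a latent z_x at `query` from irregularly placed latents.

    query:     (3,)   query point x
    positions: (M, 3) latent positions x_i
    latents:   (M, C) latent vectors z_i
    beta:      kernel sharpness (hypothetical parameter, not from the source)
    """
    # Squared Euclidean distance from the query to every latent position.
    d2 = np.sum((positions - query) ** 2, axis=1)
    # Gaussian kernel weights; subtract the max logit for numerical stability.
    logits = -beta * d2
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Kernel-weighted average of the latent vectors.
    return weights @ latents

def decode_occupancy(query, z_x, params):
    """Toy one-hidden-layer MLP mapping (x, z_x) to an occupancy probability.

    params = (W1, b1, W2, b2) with shapes (3 + C, H), (H,), (H,), scalar.
    The input layout is an assumption made for illustration.
    """
    W1, b1, W2, b2 = params
    hidden = np.maximum(0.0, np.concatenate([query, z_x]) @ W1 + b1)  # ReLU
    logit = hidden @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid gives a value in (0, 1)

# Two latents on an irregular "grid" and a query halfway between them.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
latents = np.array([[1.0, 0.0], [0.0, 1.0]])
z_mid = nadaraya_watson(np.array([0.5, 0.0, 0.0]), positions, latents)

# Random placeholder decoder weights (a trained model would supply these).
rng = np.random.default_rng(0)
params = (rng.normal(size=(5, 8)), np.zeros(8), rng.normal(size=8), 0.0)
occupancy = decode_occupancy(np.array([0.5, 0.0, 0.0]), z_mid, params)
```

Because the weights depend only on distances to the stored positions x_i, the same routine works for any placement of latents, which is what lets the representation be sparse and adaptive rather than tied to a regular grid.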

Experimental Results

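Research Questions

RQ1: Can irregular latent grids match or exceed grid-based representations in accuracy when reconstructing shapes from point clouds?
RQ2: Can irregular latent grids support high-quality probabilistic 3D shape generation conditioned on images, category labels, or point clouds?
RQ3: How does vector quantization affect quality and feasibility in 3DILG's autoregressive generation?
RQ4: Can the model support multiple conditioning modalities (images, categories, point clouds) with fixed-length latent sequences and transformer-based modeling?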

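Key Findings

On ShapeNet-v2, irregular latent grids achieve reconstruction results comparable to state-of-the-art methods across multiple metrics.
The model supports probabilistic, multi-sample 3D shape generation conditioned on high- or low-resolution images, category labels, or point clouds, and produces high-quality surface detail.
Vector quantization may slightly reduce reconstruction and generation performance, but it provides a discrete latent space that benefits autoregressive modeling.
In category-conditioned generation, irregular latents obtain more favorable FID scores than the 8^3 regular-grid baseline, indicating improved detail and diversity in the generated shapes.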
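The vector quantization trade-off noted in the findings can be made concrete with a small sketch. It assumes plain nearest-neighbor lookup in a dictionary D; the codebook size, latent width, and function name are hypothetical, and the random data only illustrates the mechanics, not any result from the paper.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent to its nearest codebook entry.

    latents:  (M, C) continuous latent vectors z_i
    codebook: (K, C) dictionary D of code vectors
    Returns the integer indices (M,) and the quantized latents (M, C).
    """
    # Pairwise squared distances between every latent and every code vector.
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)
    return indices, codebook[indices]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))  # hypothetical dictionary with K = 16
latents = rng.normal(size=(8, 4))    # hypothetical sequence of M = 8 latents

indices, quantized = quantize(latents, codebook)
# Mean squared error introduced by snapping latents to the dictionary.
error = np.mean((latents - quantized) ** 2)
```

The integer indices form the discrete token sequence that an autoregressive transformer can model, while the nonzero quantization error is the information lost relative to the continuous latents.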

This review was generated by AI and checked by a human editor.