QUICK REVIEW

[Paper Review] ATISS: Autoregressive Transformers for Indoor Scene Synthesis

Despoina Paschalidou, Amlan Kar|arXiv (Cornell University)|Oct 7, 2021

3D Surveying and Cultural Heritage | References: 78 | Cited by: 47

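One-Sentence Summary

ATISS is an autoregressive transformer that generates indoor room layouts as unordered sets of objects, enabling interactive scene completion and object suggestion while running faster and using fewer parameters than existing methods.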

ABSTRACT

The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its floor plan. In contrast to prior work, which poses scene synthesis as sequence generation, our model generates rooms as unordered sets of objects. We argue that this formulation is more natural, as it makes ATISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room re-arrangement with any objects specified by the user, as well as object suggestions for any partial room. To enable this, our model leverages the permutation equivariance of the transformer when conditioning on the partial scene, and is trained to be permutation-invariant across object orderings. Our model is trained end-to-end as an autoregressive generative model using only labeled 3D bounding boxes as supervision. Evaluations on four room types in the 3D-FRONT dataset demonstrate that our model consistently generates plausible room layouts that are more realistic than existing methods. In addition, it has fewer parameters, is simpler to implement and train and runs up to 8 times faster than existing methods.

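Research Motivation and Objectives

Develop a model that synthesizes realistic indoor furniture layouts given only the room type and floor plan.
Represent scenes as unordered sets of objects to enable interactive editing and completion.
Train an autoregressive transformer to be permutation-invariant across object orderings, using only labeled 3D bounding boxes as supervision.
Demonstrate that the model produces plausible layouts across multiple room types and outperforms baseline methods in realism and efficiency.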

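Proposed Method

Formulate scene generation as generating an unordered set of objects in a room.
Use an autoregressive transformer encoder conditioned on floor layout features and a context embedding for each object.
Model the object category with a categorical distribution and the size, position, and orientation with mixtures of logistic distributions, predicting the attributes autoregressively (category first, then position, orientation, and size).
Maximize the log-likelihood over all permutations of the object order, approximated by Monte Carlo sampling of orderings during training, to encourage order invariance.
Introduce a learnable query vector to predict the next object and an end symbol to terminate generation.
At inference, start from an empty context and iteratively sample the attributes of each new object until the end symbol is generated.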
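
The training and inference steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `ToyModel`, `END`, `ATTRIBUTE_ORDER`, `make_training_example`, and `generate_scene` are hypothetical names, the model returns fixed distributions instead of transformer outputs, and each attribute is reduced to a single scalar for brevity. The attribute order is an assumption exposed as a constant so it can be changed.

```python
import math
import random

END = "<end>"  # hypothetical end-of-scene symbol
# Assumed attribute order after the category; adjust to match the model.
ATTRIBUTE_ORDER = ("position", "orientation", "size")


def sample_logistic_mixture(weights, means, scales, rng):
    """Draw one scalar from a mixture of logistic distributions."""
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)  # keep log() finite
    # Inverse CDF of a logistic with mean means[k] and scale scales[k].
    return means[k] + scales[k] * math.log(u / (1.0 - u))


def make_training_example(objects, rng):
    """Shuffle the set, keep a random prefix as context, predict the next item."""
    shuffled = list(objects)
    rng.shuffle(shuffled)  # one Monte Carlo sample of an ordering
    k = rng.randint(0, len(shuffled))  # inclusive on both ends
    target = shuffled[k] if k < len(shuffled) else END
    return shuffled[:k], target


class ToyModel:
    """Hypothetical stand-in for the trained network; returns fixed distributions."""

    categories = ("bed", "nightstand", "wardrobe")

    def category_probs(self, floor_plan, scene):
        p_end = min(1.0, len(scene) / 4.0)  # more objects -> more likely to stop
        probs = {c: (1.0 - p_end) / len(self.categories) for c in self.categories}
        probs[END] = p_end
        return probs

    def attribute_mixture(self, floor_plan, scene, partial, name):
        # Two-component mixture; a real model would condition on all arguments.
        return [0.5, 0.5], [-1.0, 1.0], [0.1, 0.1]


def generate_scene(model, floor_plan, rng, max_objects=20):
    scene = []  # start from an empty context
    while len(scene) < max_objects:
        probs = model.category_probs(floor_plan, scene)
        labels = list(probs)
        category = rng.choices(labels, weights=[probs[c] for c in labels], k=1)[0]
        if category == END:
            break
        obj = {"category": category}
        for name in ATTRIBUTE_ORDER:
            # Each attribute is conditioned on the ones sampled before it.
            w, mu, s = model.attribute_mixture(floor_plan, scene, obj, name)
            obj[name] = sample_logistic_mixture(w, mu, s, rng)
        scene.append(obj)
    return scene


rng = random.Random(0)
scene = generate_scene(ToyModel(), floor_plan=None, rng=rng)
context, target = make_training_example(scene, rng)
```

Because `generate_scene` accepts any starting context in principle, the same loop could be seeded with a user-provided partial room instead of an empty list, which is the kind of scene completion the set formulation is meant to support.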

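Experimental Results

Research Questions

RQ1: Can an autoregressive transformer generate diverse and plausible indoor room layouts when objects are treated as an unordered set?
RQ2: Does permutation-invariant training improve performance on interactive tasks, such as scene completion and object suggestion, relative to ordered-sequence approaches?
RQ3: How does ATISS compare with existing methods across multiple room types in terms of realism, diversity, and computational efficiency?
RQ4: Can a single trained model support interactive applications such as partial room re-arrangement and user-constrained object placement?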

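Key Findings

ATISS generates plausible and diverse indoor layouts for bedrooms, living rooms, dining rooms, and libraries.
On 3D-FRONT, the model achieves lower FID scores than FastSynth and SceneFormer and matches the distribution of object categories more closely.
ATISS runs up to 8 times faster than strong baselines with fewer parameters, and its layouts are rated as more realistic in a perceptual study.
The unordered-set formulation supports interactive tasks such as scene completion, anomaly detection, and constraint-guided object suggestion.
Qualitative and quantitative results show highly plausible layouts and a generation process that is invariant to object order.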
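
One common way to quantify how closely generated scenes match the object category distribution of real scenes is the KL divergence between the two category histograms. The sketch below assumes that measure; this summary does not spell out the exact metric, and the function names and toy scenes are hypothetical.

```python
import math
from collections import Counter


def category_distribution(scenes, categories):
    """Relative frequency of each category over a list of scenes."""
    counts = Counter(label for scene in scenes for label in scene)
    total = sum(counts[c] for c in categories)
    return [counts[c] / total for c in categories]


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete distributions over the same categories."""
    # eps avoids division by zero when q assigns no mass to a category.
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


categories = ["bed", "nightstand", "wardrobe"]
# Toy data: each scene is just the list of its object categories.
real = [["bed", "nightstand", "nightstand"], ["bed", "wardrobe"]]
generated = [["bed", "nightstand"], ["bed", "wardrobe", "wardrobe"]]

p_real = category_distribution(real, categories)
p_gen = category_distribution(generated, categories)
score = kl_divergence(p_real, p_gen)  # lower means a closer match
```

A score of zero means the two histograms are identical; comparing methods this way only checks category frequencies, not where objects are placed.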


This review was generated by AI and reviewed by a human editor.