QUICK REVIEW

[Paper Review] ATISS: Autoregressive Transformers for Indoor Scene Synthesis

Despoina Paschalidou, Amlan Kar|arXiv (Cornell University)|Oct 7, 2021

3D Surveying and Cultural Heritage | References: 78 | Citations: 47

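One-line summary

ATISS is an autoregressive transformer that generates indoor room layouts as unordered sets of objects. It supports interactive scene completion and object suggestion while running faster and using fewer parameters than prior methods.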

ABSTRACT

The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its floor plan. In contrast to prior work, which poses scene synthesis as sequence generation, our model generates rooms as unordered sets of objects. We argue that this formulation is more natural, as it makes ATISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room re-arrangement with any objects specified by the user, as well as object suggestions for any partial room. To enable this, our model leverages the permutation equivariance of the transformer when conditioning on the partial scene, and is trained to be permutation-invariant across object orderings. Our model is trained end-to-end as an autoregressive generative model using only labeled 3D bounding boxes as supervision. Evaluations on four room types in the 3D-FRONT dataset demonstrate that our model consistently generates plausible room layouts that are more realistic than existing methods. In addition, it has fewer parameters, is simpler to implement and train and runs up to 8 times faster than existing methods.

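Motivation and objectives

Develop a model that synthesizes realistic indoor furniture layouts conditioned only on the room type and the floor plan.
Represent a scene as an unordered set of objects so that interactive editing and completion become possible.
Train an autoregressive transformer that is permutation-invariant with respect to object order, using only 3D bounding box labels as supervision.
Show that the model produces plausible layouts across several room types and outperforms the baselines in realism and efficiency.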

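Proposed method

Formulate scene synthesis as generating an unordered set of objects in a room.
Use a transformer encoder, applied autoregressively, that is conditioned on a floor-layout feature and on a context embedding for each object already in the scene.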
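Predict each object's attributes (category, position, orientation, size) autoregressively: the category first, then position, orientation, and size. The category is a discrete choice, and the continuous attributes are modeled with mixtures of logistic distributions.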
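Train by maximizing the log-likelihood over all permutations of the object order, approximated with Monte Carlo sampling of orderings, which encourages order invariance.
Include a learnable query vector used to predict the next object, and an end symbol that terminates generation.
At inference time, start from an empty context and repeatedly sample the attributes of a new object until the end symbol is produced.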
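
The training and inference steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_predict_next` is a hypothetical stand-in for the trained network (it uses fixed toy distributions), and the names `END`, `training_pair`, and `generate_scene` are invented for this example. It shows inverse-CDF sampling from a mixture of logistics, one Monte Carlo sample of an object ordering for training, and the generation loop that stops at the end symbol.

```python
import math
import random

END = "<end>"  # hypothetical end-of-scene symbol (name is an assumption)


def sample_logistic_mixture(weights, means, scales, rng):
    """Draw one value from a 1-D mixture of logistic distributions."""
    # Pick a mixture component according to its weight.
    k = rng.choices(range(len(weights)), weights=weights)[0]
    # Inverse-CDF sampling for a logistic: x = mu + s * log(u / (1 - u)).
    u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
    return means[k] + scales[k] * math.log(u / (1.0 - u))


def training_pair(objects, rng):
    """One Monte Carlo sample of the ordering: shuffle, then cut a prefix.

    Returns (context, target): the context is a random subset in random
    order, and the target is the next object, or END if none remain.
    """
    order = list(objects)
    rng.shuffle(order)
    cut = rng.randint(0, len(order))  # randint is inclusive on both ends
    target = order[cut] if cut < len(order) else END
    return order[:cut], target


def generate_scene(predict_next, floor_plan, rng, max_objects=32):
    """Grow a scene from an empty context until END is sampled."""
    scene = []
    while len(scene) < max_objects:
        obj = predict_next(floor_plan, scene, rng)
        if obj == END:
            break
        scene.append(obj)
    return scene


def toy_predict_next(floor_plan, scene, rng):
    """Stand-in for the network: fixed toy distributions, NOT a trained model."""
    if len(scene) >= 3 or rng.random() < 0.2:
        return END
    width, depth = floor_plan
    # Category is sampled first, then the continuous attributes.
    category = rng.choice(["bed", "nightstand", "wardrobe"])
    x = sample_logistic_mixture(
        [0.5, 0.5], [0.25 * width, 0.75 * width], [0.1, 0.1], rng
    )
    z = sample_logistic_mixture([1.0], [0.5 * depth], [0.1], rng)
    angle = sample_logistic_mixture([1.0], [0.0], [0.05], rng)
    return {"category": category, "position": (x, z), "angle": angle}


if __name__ == "__main__":
    rng = random.Random(0)
    print(generate_scene(toy_predict_next, (4.0, 3.0), rng))
    print(training_pair(["bed", "nightstand", "wardrobe"], rng))
```

Because `generate_scene` accepts any starting list in place of the empty `scene`, the same loop could in principle be seeded with user-placed objects, which is the sense in which a set-based formulation lends itself to scene completion.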

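Experimental results

Research questions

RQ1: When objects are treated as an unordered set, can an autoregressive transformer generate diverse and plausible indoor room layouts?
RQ2: Does order-invariant training improve performance on interactive tasks such as scene completion and object suggestion, compared with ordered-sequence approaches?
RQ3: How does ATISS compare with existing methods in realism, diversity, and computational efficiency across several room types?
RQ4: Can a single trained model support interactive applications such as partial room re-arrangement and user-constrained object placement?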

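Key findings

ATISS generates plausible and diverse indoor layouts for bedroom, living room, dining room, and library scenes.
On 3D-FRONT data, the model achieves lower FID scores and object-category distributions closer to the ground truth than FastSynth and SceneFormer.
ATISS runs up to 8 times faster than the strongest baseline, has fewer parameters, and is rated as more realistic in a perceptual evaluation.
The unordered-set formulation enables interactive tasks such as scene completion, anomaly detection, and user-driven object suggestion under constraints.
Qualitative and quantitative results indicate high plausibility and invariance to object order during generation.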


This review was generated by AI and checked by a human editor.