QUICK REVIEW

[Paper Review] ATISS: Autoregressive Transformers for Indoor Scene Synthesis

Despoina Paschalidou, Amlan Kar|arXiv (Cornell University)|2021. 10. 07.

3D Surveying and Cultural Heritage | References: 78 | Citations: 47

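One-Line Summary

ATISS is an autoregressive transformer that generates indoor room layouts as unordered sets of objects, enabling interactive scene completion and object suggestion while running faster and using fewer parameters than prior methods.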

ABSTRACT

The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its floor plan. In contrast to prior work, which poses scene synthesis as sequence generation, our model generates rooms as unordered sets of objects. We argue that this formulation is more natural, as it makes ATISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room re-arrangement with any objects specified by the user, as well as object suggestions for any partial room. To enable this, our model leverages the permutation equivariance of the transformer when conditioning on the partial scene, and is trained to be permutation-invariant across object orderings. Our model is trained end-to-end as an autoregressive generative model using only labeled 3D bounding boxes as supervision. Evaluations on four room types in the 3D-FRONT dataset demonstrate that our model consistently generates plausible room layouts that are more realistic than existing methods. In addition, it has fewer parameters, is simpler to implement and train and runs up to 8 times faster than existing methods.

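Motivation and Objectives

Develop a model that synthesizes realistic indoor furniture layouts conditioned only on the room type and floor plan.
Represent a scene as an unordered set of objects to enable interactive editing and completion.
Train an autoregressive transformer, supervised only by labeled 3D bounding boxes, to be permutation-invariant with respect to object order.
Demonstrate plausible layouts across multiple room types and outperform baselines in realism and efficiency.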

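Proposed Method

Formulate the task as generating an unordered set of objects within a room.
Use an autoregressive transformer encoder conditioned on a floor-layout feature and a context embedding for each object.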
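Predict object attributes autoregressively (category first, then size, position, and orientation), modeling the category as a categorical distribution and the continuous attributes (size, position, orientation) as mixtures of logistic distributions.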
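Train by maximizing the log-likelihood over all permutations of the object order, approximated with Monte Carlo sampling, to encourage order invariance.
Introduce a learnable query vector for predicting the next object and an end symbol that terminates generation.
At inference time, start from an empty context and sequentially sample the attributes of each new object until the end symbol is generated.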
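The training and inference steps listed above can be illustrated with a small toy sketch. It is not the authors' implementation: `toy_predict_next` is a hypothetical stand-in for the trained network, the category labels, the single scalar attribute `x`, and all probabilities are made up for illustration, and the category is assumed to be drawn from a categorical distribution. Only the overall control flow (shuffle the objects to sample one ordering, split into context and target, then sample objects until an end symbol appears) follows the description.

```python
import math
import random

END = "<end>"  # hypothetical label for the end symbol


def sample_logistic_mixture(weights, means, scales, rng):
    """Draw one value from a 1-D mixture of logistic distributions."""
    # Pick a mixture component in proportion to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Inverse-CDF sampling for a logistic: x = mu + s * log(u / (1 - u)).
    u = rng.uniform(1e-6, 1.0 - 1e-6)
    return means[k] + scales[k] * math.log(u / (1.0 - u))


def sample_training_pair(objects, rng):
    """One Monte Carlo sample of an ordering, split into (context, target)."""
    order = list(objects)
    rng.shuffle(order)              # random permutation of the object set
    m = rng.randint(0, len(order))  # number of objects kept as context
    # If every object is already in the context, the target is the end symbol.
    target = order[m] if m < len(order) else END
    return order[:m], target


def toy_predict_next(context):
    """Hypothetical stand-in for the network: returns made-up distributions."""
    # The chance of stopping grows with the number of objects already placed.
    p_end = min(1.0, 0.2 * len(context))
    rest = (1.0 - p_end) / 2
    categories = {"bed": rest, "nightstand": rest, END: p_end}
    mixture = ([0.5, 0.5], [-1.0, 1.0], [0.1, 0.1])  # weights, means, scales
    return categories, mixture


def generate_scene(rng, max_objects=10):
    """Start from an empty context and add objects until END is sampled."""
    context = []
    for _ in range(max_objects):
        categories, mixture = toy_predict_next(context)
        labels = list(categories)
        probs = [categories[c] for c in labels]
        category = rng.choices(labels, weights=probs, k=1)[0]  # category first
        if category == END:
            break
        x = sample_logistic_mixture(*mixture, rng)  # then a continuous attribute
        context.append({"category": category, "x": x})
    return context


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_training_pair(["bed", "nightstand", "wardrobe"], rng))
    print(generate_scene(rng))
```

In a real model, the context would be encoded by the transformer together with the floor-plan feature, and each continuous attribute would be conditioned on the attributes already sampled for that object; the toy predictor here ignores both.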

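Experimental Results

Research Questions

RQ1: Can an autoregressive transformer that treats objects as an unordered set generate diverse and plausible indoor room layouts?
RQ2: Does permutation-invariant training improve performance on interactive tasks such as scene completion and object suggestion compared with ordered-sequence approaches?
RQ3: How does ATISS compare with existing methods in realism, diversity, and computational efficiency across room types?
RQ4: Can a single trained model support interactive applications such as partial room re-arrangement and user-constrained object placement?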

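Key Results

ATISS generates plausible and diverse indoor layouts across bedroom, living room, dining room, and library scenes.
On 3D-FRONT, the model achieves lower FID scores and more faithful object-category distributions than FastSynth and SceneFormer.
ATISS runs up to 8 times faster than the strongest baselines with fewer parameters, and is rated as more realistic in a perceptual study.
The unordered-set formulation enables interactive tasks, including constrained scene completion, anomaly detection, and user-guided object suggestion.
Qualitative and quantitative results show high plausibility and invariance to object order during generation.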

This review was generated by AI and reviewed by a human editor.