QUICK REVIEW

[Paper Review] Text-to-3D using Gaussian Splatting

Zilong Chen, Feng Wang | arXiv (Cornell University) | 2023. 09. 28.

Image Processing and 3D Reconstruction | Citations: 8

One-Line Summary

Gsgen generates high-quality, 3D-consistent assets from text prompts using 3D Gaussian Splatting and a two-stage optimization (geometry first, then appearance) guided by a 3D point cloud diffusion prior.

ABSTRACT

Automatic text-to-3D generation that combines Score Distillation Sampling (SDS) with the optimization of volume rendering has achieved remarkable progress in synthesizing realistic 3D objects. Yet most existing text-to-3D methods by SDS and volume rendering suffer from inaccurate geometry, e.g., the Janus issue, since it is hard to explicitly integrate 3D priors into implicit 3D representations. Besides, it is usually time-consuming for them to generate elaborate 3D models with rich colors. In response, this paper proposes GSGEN, a novel method that adopts Gaussian Splatting, a recent state-of-the-art representation, to text-to-3D generation. GSGEN aims at generating high-quality 3D objects and addressing existing shortcomings by exploiting the explicit nature of Gaussian Splatting that enables the incorporation of 3D prior. Specifically, our method adopts a progressive optimization strategy, which includes a geometry optimization stage and an appearance refinement stage. In geometry optimization, a coarse representation is established under 3D point cloud diffusion prior along with the ordinary 2D SDS optimization, ensuring a sensible and 3D-consistent rough shape. Subsequently, the obtained Gaussians undergo an iterative appearance refinement to enrich texture details. In this stage, we increase the number of Gaussians by compactness-based densification to enhance continuity and improve fidelity. With these designs, our approach can generate 3D assets with delicate details and accurate geometry. Extensive evaluations demonstrate the effectiveness of our method, especially for capturing high-frequency components. Our code is available at https://github.com/gsgen3d/gsgen

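Motivation and Objectives

Improve text-to-3D generation by exploiting an explicit 3D representation into which 3D priors can be integrated.
Achieve accurate geometry and high-fidelity appearance through a two-stage optimization.
Present practical initialization and densification strategies that strengthen 3D consistency and detail.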

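Proposed Method

Represents the 3D scene as a set of 3D Gaussians and optimizes them progressively.
Obtains a coarse, 3D-consistent shape in a geometry optimization stage guided by a 3D point cloud diffusion prior together with the 2D SDS loss.
Refines appearance in a second stage, enriching detail through iterative compactness-based densification.
Initializes the Gaussians from a Point-E 3D shape or from user-provided geometry to avoid degeneration.
Applies a 3D SDS loss to the Gaussian positions during geometry optimization to incorporate the 3D prior.
Relies on 2D SDS guidance in the refinement stage, where the added Gaussians improve continuity and fidelity.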
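
The densification step can be illustrated with a small sketch. This review does not spell out the exact criterion, so the rule below is an assumption made for illustration only: when two neighboring Gaussians leave an uncovered gap (their center distance exceeds the sum of their radii), a new Gaussian is placed in the middle of that gap with a radius of half the gap. The function name `densify_by_compactness`, the parameter `k`, and the use of a single scalar radius per Gaussian are all hypothetical; the authors' actual implementation lives in their repository and may differ.

```python
import math


def densify_by_compactness(positions, radii, k=3):
    """Propose new (center, radius) pairs that fill gaps between Gaussians.

    Assumed rule (illustrative, not taken from the paper): for each Gaussian
    and each of its k nearest neighbours, if the centre distance exceeds the
    sum of the two radii, add a Gaussian centred in the uncovered gap whose
    radius is half the gap width.
    """
    proposals = []
    seen = set()  # unordered index pairs already examined
    n = len(positions)
    for i in range(n):
        # Brute-force nearest neighbours; a KD-tree would replace this at scale.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(positions[i], positions[j]),
        )
        for j in order[:k]:
            pair = (min(i, j), max(i, j))
            if pair in seen:
                continue
            seen.add(pair)
            d = math.dist(positions[i], positions[j])
            gap = d - radii[i] - radii[j]
            if gap <= 0:
                continue  # the two Gaussians already touch or overlap
            # Fraction along the segment i -> j at which the gap is centred.
            t = (radii[i] + gap / 2) / d
            center = tuple(
                a + t * (b - a) for a, b in zip(positions[i], positions[j])
            )
            proposals.append((center, gap / 2))
    return proposals
```

For two Gaussians of radius 1 whose centers are 4 units apart along the x-axis, the sketch proposes one new Gaussian of radius 1 at the midpoint. In a real pipeline the new Gaussians would also need colors, opacities, and covariances, which this sketch leaves out.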

Figure 1: Delicate 3D assets generated using the proposed Gsgen. See our project page gsgen3d.github.io for videos of these images.

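Experimental Results

Research Questions

RQ1: Can 3D Gaussian Splatting serve as an effective representation for text-to-3D generation when guided by explicit 3D priors?
RQ2: Does a two-stage optimization, geometry with 3D priors followed by appearance refinement with densification, yield better geometry and detail than using 2D guidance alone?
RQ3: Which initialization and densification strategies mitigate the Janus problem and over-smoothing in text-to-3D generation?
RQ4: How does introducing Point-E priors affect geometric consistency and final visual fidelity?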

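Key Findings

Gsgen generates 3D assets with more accurate geometry and finer detail than existing methods.
Introducing 3D priors through Point-E guidance helps mitigate geometry collapse and improves multi-view consistency.
Compactness-based densification improves geometric continuity and detail during appearance refinement.
Initializing from Point-E priors and using 3D guidance yields better results than random initialization or 2D guidance alone.
The method captures high-frequency components such as textures and fur better than several baselines.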

Stable DreamFusion (Tang, 2022; Poole et al., 2023)


This review was generated by AI and reviewed by a human editor.