QUICK REVIEW

[Paper Review] Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation

Rui Chen, Yongwei Chen|arXiv (Cornell University)|Mar 24, 2023

Human Motion and Animation | Cited by 14

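One-line summary

Fantasia3D disentangles geometry from appearance, pairing a DMTet-based hybrid geometry representation with a spatially varying BRDF appearance model, to obtain high-quality geometry and photorealistic rendering in text-to-3D generation. It passes rendered normal maps to a pre-trained diffusion model as the shape input and learns BRDF parameters for realistic materials.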

ABSTRACT

Automatic 3D content creation has achieved rapid progress recently due to the availability of pre-trained, large language models and image diffusion models, forming the emerging topic of text-to-3D content creation. Existing text-to-3D methods commonly use implicit scene representations, which couple the geometry and appearance via volume rendering and are suboptimal in terms of recovering finer geometries and achieving photorealistic rendering; consequently, they are less effective for generating high-quality 3D assets. In this work, we propose a new method of Fantasia3D for high-quality text-to-3D content creation. Key to Fantasia3D is the disentangled modeling and learning of geometry and appearance. For geometry learning, we rely on a hybrid scene representation, and propose to encode surface normal extracted from the representation as the input of the image diffusion model. For appearance modeling, we introduce the spatially varying bidirectional reflectance distribution function (BRDF) into the text-to-3D task, and learn the surface material for photorealistic rendering of the generated surface. Our disentangled framework is more compatible with popular graphics engines, supporting relighting, editing, and physical simulation of the generated 3D assets. We conduct thorough experiments that show the advantages of our method over existing ones under different text-to-3D task settings. Project page and source codes: https://fantasia3d.github.io/.

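Motivation and objectives

Create 3D assets automatically from text prompts while improving surface quality and materials.
Disentangle geometry learning from appearance learning so that fine geometry and photorealistic texture are recovered more faithfully.
Use a hybrid surface representation (DMTet) to enable explicit surface deformation and differentiable rendering.
Introduce a spatially varying BRDF model to learn realistic surface materials.
Stay compatible with graphics engines for relighting, editing, and physical simulation.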

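Proposed method

Uses DMTet as the hybrid geometry representation, combining a deformable tetrahedral grid with differentiable mesh extraction.
Renders the surface normal map (together with the object mask in the early stage) and encodes it as the shape input to a pre-trained image diffusion model, supervised through the SDS (Score Distillation Sampling) loss.
Introduces a BRDF-based appearance model for physically based rendering, learned by an MLP that outputs diffuse, roughness/metallic, and normal variation terms.
Trains both the geometry and the appearance model with SDS, using a pre-trained Stable Diffusion model.
Initializes the geometry from a 3D ellipsoid or a user-provided shape, then refines it through coarse-to-fine stages of geometry and texture optimization.
Provides a texture mapping pipeline with UV edge padding to reduce seams and improve rendering realism.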
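
The SDS step in the list above can be pictured with a small sketch. This is not the authors' implementation: `sds_gradient`, `predict_noise`, the toy noise schedule, and the weighting `1 - a_bar` are all assumptions made for illustration, and the frozen text-conditioned Stable Diffusion denoiser is replaced by a placeholder callable. The sketch only shows the general shape of a score-distillation update: noise the rendered normal map, ask the denoiser for its noise estimate, and use the weighted difference as a gradient on the rendering.

```python
import numpy as np


def sds_gradient(rendered, t, alphas_cumprod, predict_noise, rng):
    """Return an SDS-style gradient with respect to the rendered input.

    rendered       : array, e.g. an (H, W, 3) normal map scaled to [-1, 1]
    t              : integer diffusion timestep
    alphas_cumprod : 1-D array of cumulative noise-schedule products
    predict_noise  : hypothetical stand-in for the frozen, text-conditioned
                     denoiser; called as predict_noise(noisy, t)
    rng            : numpy.random.Generator used to draw the noise
    """
    a_bar = alphas_cumprod[t]
    eps = rng.standard_normal(rendered.shape)
    # Forward diffusion: mix the rendering with Gaussian noise.
    noisy = np.sqrt(a_bar) * rendered + np.sqrt(1.0 - a_bar) * eps
    eps_hat = predict_noise(noisy, t)
    # Assumed weighting w(t) = 1 - a_bar; other choices are possible.
    weight = 1.0 - a_bar
    # Score distillation skips the denoiser Jacobian and uses the residual.
    return weight * (eps_hat - eps)


# Toy usage: a flat normal map facing +z and a dummy denoiser (assumptions).
normal_map = np.zeros((4, 4, 3))
normal_map[..., 2] = 1.0
toy_schedule = np.linspace(0.999, 0.01, 1000)  # illustrative values only
grad = sds_gradient(
    normal_map, 500, toy_schedule,
    lambda noisy, t: np.zeros_like(noisy),
    np.random.default_rng(0),
)
print(grad.shape)  # (4, 4, 3)
```

In a full pipeline this gradient would be backpropagated through the differentiable renderer to the geometry parameters; that part depends on the renderer and is omitted here.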

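Experimental results

Research questions

RQ1: Does disentangled geometry-appearance learning improve text-to-3D asset quality compared with coupled or NeRF-based approaches?
RQ2: Does incorporating a spatially varying BRDF yield more photorealistic rendering and better material fidelity on the generated surfaces?
RQ3: Does feeding the diffusion model a normal-map-based shape encoding improve geometry recovery over a color-based encoding?
RQ4: Are the generated assets suitable for editing, relighting, and physical simulation in standard graphics engines?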

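Key findings

Fantasia3D outperforms existing methods in both geometry quality and appearance realism, in zero-shot as well as user-guided settings.
Disentangled learning of geometry and appearance with DMTet recovers fine surface detail, and the BRDF materials enable photorealistic rendering.
Using rendered normal maps as the shape input for diffusion guidance gives better geometry quality than color-based inputs.
BRDF-based appearance modeling gives more realistic lighting and reflections than diffuse-only alternatives.
The method supports relighting, editing, and physical simulation in standard graphics engines such as Blender.
Geometry can be initialized from a user-provided shape or an ellipsoid, allowing flexible user-guided generation.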
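
To make the BRDF material terms in the findings more concrete, the sketch below splits a raw per-point MLP output into diffuse, roughness, metallic, and normal variation terms. The 3 + 2 + 3 channel layout, the sigmoid and tanh squashing, and the function name `split_material` are assumptions for illustration, not details confirmed by this review. The specular color uses the common metallic-workflow convention `ks = (1 - m) * 0.04 + m * kd`, which is likewise assumed here.

```python
import numpy as np


def split_material(raw):
    """Map raw (..., 8) MLP outputs to bounded material terms.

    Assumed channel layout (hypothetical): 3 diffuse, 1 roughness,
    1 metallic, 3 normal variation.
    """
    raw = np.asarray(raw, dtype=np.float64)
    # Sigmoid keeps diffuse color, roughness, and metallic in (0, 1).
    squashed = 1.0 / (1.0 + np.exp(-raw[..., :5]))
    kd = squashed[..., 0:3]
    roughness = squashed[..., 3:4]
    metallic = squashed[..., 4:5]
    # tanh keeps the tangent-space normal perturbation in (-1, 1).
    kn = np.tanh(raw[..., 5:8])
    # Metallic workflow: dielectrics reflect about 4%, metals tint by kd.
    ks = (1.0 - metallic) * 0.04 + metallic * kd
    return {"kd": kd, "roughness": roughness, "metallic": metallic,
            "kn": kn, "ks": ks}


# Toy usage: an all-zero output gives mid-gray, half-rough, half-metallic.
material = split_material(np.zeros(8))
print(material["kd"].shape, material["kn"].shape)  # (3,) (3,)
```

Because the terms are stored per surface point, they can be baked into texture maps that a graphics engine can consume for relighting.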

This review was generated by AI and checked by a human editor.