QUICK REVIEW

[Paper Review] SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections

Mark Boss, Andreas Engelhardt|arXiv (Cornell University)|May 31, 2022

Advanced Vision and Imaging | Citations: 38

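One-Sentence Summary

SAMURAI jointly optimizes 3D shape, BRDF, per-image camera pose, and illumination from unconstrained in-the-wild image collections, without requiring perfect poses or masks, and produces relightable 3D assets and meshes.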

ABSTRACT

Inverse rendering of an object under entirely unknown capture conditions is a fundamental challenge in computer vision and graphics. Neural approaches such as NeRF have achieved photorealistic results on novel view synthesis, but they require known camera poses. Solving this problem with unknown camera poses is highly challenging as it requires joint optimization over shape, radiance, and pose. This problem is exacerbated when the input images are captured in the wild with varying backgrounds and illuminations. Standard pose estimation techniques fail in such image collections in the wild due to very few estimated correspondences across images. Furthermore, NeRF cannot relight a scene under any illumination, as it operates on radiance (the product of reflectance and illumination). We propose a joint optimization framework to estimate the shape, BRDF, and per-image camera pose and illumination. Our method works on in-the-wild online image collections of an object and produces relightable 3D assets for several use-cases such as AR/VR. To our knowledge, our method is the first to tackle this severely unconstrained task with minimal user interaction. Project page: https://markboss.me/publication/2022-samurai/ Video: https://youtu.be/LlYuGDjXp-8
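
The abstract's remark that radiance is "the product of reflectance and illumination" can be made concrete with the standard rendering equation. This is the textbook form, not a formula quoted from the paper; the symbols follow the usual conventions: outgoing radiance L_o at point x in direction ω_o, BRDF f_r, incoming illumination L_i, and surface normal n.

```latex
L_o(\mathbf{x}, \omega_o) = \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\, L_i(\mathbf{x}, \omega_i)\, (\omega_i \cdot \mathbf{n})\, \mathrm{d}\omega_i
```

A radiance-only model stores L_o with the capture-time L_i already folded in, so it cannot be re-evaluated under a new L_i. Estimating f_r and L_i as separate quantities is what makes relighting possible.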

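Motivation and Objectives

Motivate the reconstruction of 3D shape and materials from unconstrained real-world image collections that lack fixed camera intrinsics and clean segmentation masks.
Develop a joint optimization framework that estimates shape, BRDF, per-image illumination, and per-image camera pose and intrinsics.
Reduce the dependence on perfect pose and mask inputs by introducing robust initialization, a camera multiplex, and posterior scaling of images.
Enable extraction of an explicit mesh with BRDF textures for AR/VR and material-editing applications.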

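Proposed Method

Builds on a Neural-PIL/NeRF-style neural volume that represents 3D shape and BRDF at each 3D location, with a per-image illumination embedding.
Jointly optimizes, per image, a flexible object-centric camera parameterization based on a look-at formulation and a per-image focal length, to handle varying camera distances.
Introduces a camera multiplex: several candidate poses per image are optimized with a dynamically weighted loss to avoid local minima.
Uses posterior scaling of the input images to down-weight noisy masks and images during optimization.
Applies coarse-to-fine loss scheduling, Fourier frequency annealing, and regularization to stabilize BRDF and illumination estimation.
Extracts an explicit mesh with BRDF textures from the trained neural volume for downstream graphics use.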
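
The look-at camera with a per-image focal length mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the OpenGL/NeRF-style axis convention (camera looks down its local -z axis), and the pixel-center offset are all assumptions made for illustration, and the paper's actual parameterization may differ.

```python
import numpy as np


def look_at_pose(eye, target, up):
    """Return a 4x4 camera-to-world matrix for a camera at `eye` facing `target`.

    Assumed convention: +x is right, +y is up, the camera looks along -z.
    `up` must not be parallel to the viewing direction.
    """
    eye, target, up = (np.asarray(v, dtype=np.float64) for v in (eye, target, up))
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)  # re-orthogonalized up vector

    pose = np.eye(4)
    pose[:3, 0] = right
    pose[:3, 1] = true_up
    pose[:3, 2] = -forward
    pose[:3, 3] = eye
    return pose


def pixel_ray_directions(height, width, focal):
    """Camera-space ray directions through pixel centers for a pinhole camera.

    `focal` is a single focal length in pixels; the principal point is
    assumed to sit at the image center.
    """
    rows, cols = np.meshgrid(
        np.arange(height) + 0.5, np.arange(width) + 0.5, indexing="ij"
    )
    x = (cols - width / 2) / focal
    y = -(rows - height / 2) / focal
    z = -np.ones_like(x)
    return np.stack([x, y, z], axis=-1)


# One camera per image: its own eye position and its own focal length.
pose = look_at_pose(eye=[0.0, 0.0, 3.0], target=[0.0, 0.0, 0.0], up=[0.0, 1.0, 0.0])
dirs_cam = pixel_ray_directions(height=4, width=4, focal=2.0)
dirs_world = dirs_cam @ pose[:3, :3].T  # rotate rays into world space
```

In a joint optimization, `eye`, `up`, and `focal` would be free variables per image while `target` stays near the object center, which is what makes the parameterization object-centric.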

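Experimental Results

Research Questions

RQ1: Can 3D shape, BRDF, per-image illumination, and camera parameters be jointly estimated from unconstrained real-world image collections?
RQ2: How does the joint optimization perform when poses are coarse or unknown and masks are noisy?
RQ3: In neural-volume-based reconstruction, does the camera multiplex strategy improve convergence and accuracy over optimizing a single camera?
RQ4: Can the resulting model support relighting and mesh extraction suitable for AR/VR applications?

Key Findings

SAMURAI substantially improves novel view synthesis and relighting on in-the-wild datasets compared with BARF-A and the baselines, even without accurate pose initialization.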
It jointly estimates per-image illumination, BRDF parameters, and camera poses, enabling relightable 3D assets without perfect masks or poses.
Camera multiplexing with dynamic loss reweighting helps escape local minima and stabilizes optimization in challenging datasets.
Posterior image scaling and a robust optimization schedule improve reconstruction quality and robustness to noisy masks and images.
Explicit mesh extraction with BRDFs from the learned neural volume provides usable assets for AR/VR and material editing.
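
The idea of camera multiplexing with dynamic loss reweighting can be sketched as follows. The review does not state the exact weighting rule, so this example assumes a softmax over negative per-candidate losses with a hypothetical `temperature` parameter; the function names are illustrative and this is not the authors' implementation.

```python
import numpy as np


def multiplex_weights(losses, temperature=1.0):
    """Weight each candidate camera by how well it currently explains its image.

    losses: array of shape (num_images, num_candidates), one loss per candidate pose.
    Returns weights of the same shape; each row sums to 1 and lower loss
    gets a higher weight (assumed softmax rule).
    """
    losses = np.asarray(losses, dtype=np.float64)
    logits = -losses / temperature
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=-1, keepdims=True)


def multiplexed_loss(losses, temperature=1.0):
    """Average over images of the weight-averaged candidate losses."""
    losses = np.asarray(losses, dtype=np.float64)
    weights = multiplex_weights(losses, temperature)
    per_image = (weights * losses).sum(axis=-1)
    return per_image.mean()


# Two images, three candidate poses each (made-up loss values).
example_losses = [[0.9, 0.1, 0.5],
                  [0.2, 0.2, 0.2]]
weights = multiplex_weights(example_losses, temperature=0.1)
total = multiplexed_loss(example_losses, temperature=0.1)
```

In a gradient-based training loop the weights would typically be treated as constants (no gradient flows through them), so poorly fitting candidates contribute little to the shared shape and material model instead of pulling it toward a bad local minimum.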

This review was generated by AI and checked by a human editor.