QUICK REVIEW

[Paper Review] SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis

Guangcong Wang, Zhaoxi Chen|arXiv (Cornell University)|Mar 28, 2023

Advanced Vision and Imaging|Citations: 11

One-sentence summary

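SparseNeRF improves few-shot NeRF by distilling local depth ranking and spatial continuity from coarse depth maps. It achieves state-of-the-art results on LLFF and DTU, is further validated on the new NVS-RGBD dataset, and does so without increasing inference time.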

ABSTRACT

Neural Radiance Field (NeRF) significantly degrades when only a limited number of views are available. To complement the lack of 3D information, depth-based models, such as DSNeRF and MonoSDF, explicitly assume the availability of accurate depth maps of multiple views. They linearly scale the accurate depth maps as supervision to guide the predicted depth of few-shot NeRFs. However, accurate depth maps are difficult and expensive to capture due to wide-range depth distances in the wild. In this work, we present a new Sparse-view NeRF (SparseNeRF) framework that exploits depth priors from real-world inaccurate observations. The inaccurate depth observations are either from pre-trained depth models or coarse depth maps of consumer-level depth sensors. Since coarse depth maps are not strictly scaled to the ground-truth depth maps, we propose a simple yet effective constraint, a local depth ranking method, on NeRFs such that the expected depth ranking of the NeRF is consistent with that of the coarse depth maps in local patches. To preserve the spatial continuity of the estimated depth of NeRF, we further propose a spatial continuity constraint to encourage the consistency of the expected depth continuity of NeRF with coarse depth maps. Surprisingly, with simple depth ranking constraints, SparseNeRF outperforms all state-of-the-art few-shot NeRF methods (including depth-based models) on standard LLFF and DTU datasets. Moreover, we collect a new dataset NVS-RGBD that contains real-world depth maps from Azure Kinect, ZED 2, and iPhone 13 Pro. Extensive experiments on NVS-RGBD dataset also validate the superiority and generalizability of SparseNeRF. Code and dataset are available at https://sparsenerf.github.io/.
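
The "expected depth" that the abstract constrains is the depth NeRF obtains by volume rendering along each ray. A minimal Python sketch of that standard computation is given here for orientation. The function name `expected_depth`, the plain-list inputs, and the very large final interval are illustrative assumptions and are not taken from the SparseNeRF code.

```python
import math


def expected_depth(sigmas, t_vals):
    """Expected depth of one ray under standard NeRF volume rendering.

    sigmas: non-negative densities at each sample along the ray.
    t_vals: sample distances from the camera, strictly increasing.
    (Both argument names are hypothetical, chosen for this sketch.)
    """
    depth = 0.0
    transmittance = 1.0  # probability that the ray reaches the current sample
    for i, (sigma, t) in enumerate(zip(sigmas, t_vals)):
        # Distance to the next sample; the last interval is treated as
        # effectively infinite (an assumption borrowed from common NeRF code).
        delta = t_vals[i + 1] - t if i + 1 < len(t_vals) else 1e10
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this interval
        weight = transmittance * alpha          # contribution of this sample
        depth += weight * t
        transmittance *= 1.0 - alpha
    return depth


# A ray that passes through empty space and hits an opaque surface at t = 3.
ray_depth = expected_depth([0.0, 0.0, 1e3, 0.0], [1.0, 2.0, 3.0, 4.0])
```

Because the result is a weighted sum of sample distances, it is differentiable with respect to the densities, which is what allows a depth-based constraint to act on NeRF training.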

Motivation and objectives

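Motivate robust few-shot novel view synthesis for cases where dense multi-view data is unavailable.
Exploit coarse depth priors from pre-trained depth models or consumer-grade sensors rather than accurate depth.
Introduce depth ranking and spatial continuity distillation to regularize NeRF training.
Show that these priors improve geometry and rendering quality on standard benchmarks and a new dataset.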

Proposed method

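Base NeRF backbone (Mip-NeRF) trained with a color reconstruction loss.
Depth priors distilled from a pre-trained depth model (e.g., DPT) or from coarse sensor depth.
Depth ranking distillation: within local patches, the depth ordering predicted by NeRF is constrained to agree with the ordering in the coarse depth map, using a ranking loss (Eq. 3 in the paper).
Spatial continuity distillation: the local depth continuity of the coarse depth map is distilled into the depth continuity of NeRF (Eq. 4 in the paper).
Full objective: L = L_nerf + lambda * R_rank + gamma * R_conti, with a preset margin and preset weights lambda and gamma.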
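
The two regularizers and the full objective listed above can be sketched in a few lines of Python. This review does not reproduce Eq. 3 and Eq. 4, so the hinge-style forms below are an assumption about their shape, not the authors' implementation. The function names, the explicit `pairs` argument, the `coarse_tol` threshold, and all default values are hypothetical placeholders.

```python
def depth_ranking_loss(nerf_depth, coarse_depth, pairs, margin=1e-4):
    """Hinge penalty when NeRF orders a pixel pair differently from the coarse depth.

    nerf_depth, coarse_depth: per-pixel depths within one local patch.
    pairs: (i, j) pixel index pairs sampled from that patch (assumed given).
    """
    total = 0.0
    for a, b in pairs:
        # Orient the pair so that `near` is the closer pixel per the coarse map.
        near, far = (a, b) if coarse_depth[a] <= coarse_depth[b] else (b, a)
        # Penalize only if NeRF places `near` behind `far` (up to the margin).
        total += max(nerf_depth[near] - nerf_depth[far] + margin, 0.0)
    return total / max(len(pairs), 1)


def continuity_loss(nerf_depth, coarse_depth, pairs, coarse_tol=0.05, margin=1e-4):
    """Hinge penalty on NeRF depth jumps where the coarse depth is continuous."""
    total, count = 0.0, 0
    for a, b in pairs:
        # Only pairs that the coarse map treats as continuous are constrained.
        if abs(coarse_depth[a] - coarse_depth[b]) < coarse_tol:
            total += max(abs(nerf_depth[a] - nerf_depth[b]) - margin, 0.0)
            count += 1
    return total / max(count, 1)


def total_loss(l_nerf, r_rank, r_conti, lam=0.1, gamma=0.1):
    """L = L_nerf + lambda * R_rank + gamma * R_conti (weights are placeholders)."""
    return l_nerf + lam * r_rank + gamma * r_conti


# Toy patch: NeRF inverts the order of pixels 0 and 1 but keeps 1 and 2 close.
nerf = [2.0, 1.0, 1.05]
coarse = [0.1, 0.5, 0.52]
pairs = [(0, 1), (1, 2)]
r_rank = depth_ranking_loss(nerf, coarse, pairs, margin=0.0)
r_conti = continuity_loss(nerf, coarse, pairs, coarse_tol=0.05, margin=0.0)
loss = total_loss(1.0, r_rank, r_conti)
```

Both terms use only the ordering and local smoothness of the coarse depth, never its absolute values, which is why the coarse maps do not need to be scaled to ground-truth depth.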

Experimental results

Research questions

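RQ1: Can robust depth priors from coarse depth maps improve few-shot NeRF without relying on accurate depth supervision?
RQ2: Does local depth ranking alone outperform depth scaling during NeRF training?
RQ3: Does adding spatial continuity distillation improve geometric consistency across views?
RQ4: How do these priors perform on LLFF, DTU, and the new NVS-RGBD dataset with different pre-trained depth models?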

Key findings

Method	Setting	PSNR ↑	SSIM ↑	LPIPS ↓
SRF		12.34	0.250	0.591
PixelNeRF	P	7.93	0.272	0.682
MVSNeRF		17.25	0.557	0.356
SRF ft		17.07	0.436	0.529
PixelNeRF ft	P&FT	16.17	0.438	0.512
MVSNeRF ft		17.88	0.584	0.327
Mip-NeRF		14.62	0.351	0.495
DietNeRF	geo. & sem.	14.94	0.370	0.496
RegNeRF		19.08	0.587	0.336
MonoSDF*	depth KD	18.45	0.565	0.388
DSNeRF		18.94	0.582	0.362
SparseNeRF (Ours)		19.86	0.624	0.328

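SparseNeRF achieves state-of-the-art performance among few-shot NeRF methods on LLFF and DTU, leading on PSNR and SSIM; its LPIPS is competitive but not the best in every comparison.
On LLFF with 3 views, SparseNeRF reaches PSNR 19.86, SSIM 0.624, and LPIPS 0.328 (RegNeRF: 19.08/0.587/0.336).
On DTU with 3 views, SparseNeRF reaches PSNR 19.55, SSIM 0.769, and LPIPS 0.201 (RegNeRF: 18.89/0.745/0.190, so RegNeRF has the lower LPIPS here).
On the new NVS-RGBD dataset, SparseNeRF outperforms RegNeRF, DSNeRF, and MonoSDF on the Kinect and ZED 2 sensors, with higher PSNR, SSIM above 0.80, lower LPIPS, and lower depth error.
Depth ranking distillation and spatial continuity distillation both contribute to better geometry and 3D consistency than the baseline; in the ablation study, removing either ranking or continuity lowers performance.
Using different pre-trained depth models (MiDaS, DPT Hybrid/Large) consistently improves results over the baseline, with the DPT variants performing best.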
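
To make the 3-view comparison with RegNeRF concrete, the margins can be worked out directly from the figures quoted above. The snippet is only arithmetic on those reported numbers; the dictionary layout and the helper name `margins` are illustrative choices.

```python
# (PSNR, SSIM, LPIPS) for the 3-view setting, copied from the findings above.
results = {
    "LLFF": {"SparseNeRF": (19.86, 0.624, 0.328), "RegNeRF": (19.08, 0.587, 0.336)},
    "DTU": {"SparseNeRF": (19.55, 0.769, 0.201), "RegNeRF": (18.89, 0.745, 0.190)},
}


def margins(dataset):
    """Return SparseNeRF minus RegNeRF for (PSNR, SSIM, LPIPS), rounded to 3 places."""
    ours = results[dataset]["SparseNeRF"]
    base = results[dataset]["RegNeRF"]
    return tuple(round(o - b, 3) for o, b in zip(ours, base))


llff_margin = margins("LLFF")  # positive PSNR/SSIM and negative LPIPS are improvements
dtu_margin = margins("DTU")
```

On these figures SparseNeRF gains 0.78 dB PSNR and 0.037 SSIM on LLFF, and 0.66 dB PSNR and 0.024 SSIM on DTU. Its LPIPS is 0.008 lower (better) on LLFF but 0.011 higher (worse) on DTU.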

This review was generated by AI and checked by a human editor.