QUICK REVIEW

[Paper Review] Depth-supervised NeRF: Fewer Views and Faster Training for Free

Kangle Deng, Andrew Liu|arXiv (Cornell University)|Jul 6, 2021

Advanced Vision and Imaging|References: 38|Citations: 45

One-line summary

Depth-supervised NeRF (DS-NeRF) supervises NeRF with sparse depth obtained from structure-from-motion, producing better view synthesis from fewer input views and training 2-3x faster.

ABSTRACT

A commonly observed failure mode of Neural Radiance Field (NeRF) is fitting incorrect geometries when given an insufficient number of input views. One potential reason is that standard volumetric rendering does not enforce the constraint that most of a scene's geometry consist of empty space and opaque surfaces. We formalize the above assumption through DS-NeRF (Depth-supervised Neural Radiance Fields), a loss for learning radiance fields that takes advantage of readily-available depth supervision. We leverage the fact that current NeRF pipelines require images with known camera poses that are typically estimated by running structure-from-motion (SFM). Crucially, SFM also produces sparse 3D points that can be used as "free" depth supervision during training: we add a loss to encourage the distribution of a ray's terminating depth matches a given 3D keypoint, incorporating depth uncertainty. DS-NeRF can render better images given fewer training views while training 2-3x faster. Further, we show that our loss is compatible with other recently proposed NeRF methods, demonstrating that depth is a cheap and easily digestible supervisory signal. And finally, we find that DS-NeRF can support other types of depth supervision such as scanned depth sensors and RGB-D reconstruction outputs.

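Motivation and objectives

Motivates the problem of training NeRF from sparse views, noting that without strong priors NeRF tends to settle on incorrect geometry.
Introduces a depth-based supervisory signal derived from the sparse 3D keypoints produced by SFM (COLMAP).
Shows that depth supervision improves geometry and rendering quality while shortening training time.
Shows that the depth-supervision loss is compatible with other NeRF methods and with other depth sources (RGB-D and depth sensors).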

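Proposed method

Defines a depth-supervision loss that anchors NeRF's ray termination depth to COLMAP keypoints while accounting for depth uncertainty.
Models the ray termination distribution h(t) as T(t)σ(t) and shows that it approximates a probability distribution over the depth at which a ray terminates.
Derives L_Depth, which minimizes the KL divergence between the depth-derived distribution and NeRF's ray termination distribution, approximated by a sum over depth samples.
Combines the color loss L_Color and L_Depth into the joint objective L = L_Color + λ_D L_Depth.
Shows that depth supervision is complementary and can be incorporated into existing NeRF-based methods and RGB-D extensions.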
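As a rough illustration of the loss described in this section, the following NumPy sketch discretizes h(t) = T(t)σ(t) along a single ray and evaluates a KL-style depth term plus the combined objective. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`ray_termination`, `depth_loss`, `total_loss`), the Gaussian form of the depth-derived target, the `eps` stabilizer, and the toy values for the ray, the keypoint depth, and λ_D are all hypothetical choices made for this example.

```python
import numpy as np

def ray_termination(sigma, t_edges):
    """Discretize h(t) = T(t) * sigma(t) into per-bin termination weights."""
    delta = np.diff(t_edges)                  # bin widths along the ray
    alpha = 1.0 - np.exp(-sigma * delta)      # chance of terminating inside each bin
    # T_k: chance of reaching bin k without having terminated earlier
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    return trans * alpha, delta

def depth_loss(h, t_mid, delta, depth, depth_std, eps=1e-5):
    """Cross-entropy part of a KL term against a Gaussian around the keypoint depth.

    Assumption: the target is an unnormalized Gaussian with mean `depth` and
    standard deviation `depth_std` (standing in for the depth uncertainty).
    """
    target = np.exp(-((t_mid - depth) ** 2) / (2.0 * depth_std ** 2))
    return float(-np.sum(np.log(h + eps) * target * delta))

def total_loss(color_loss, d_loss, lambda_d):
    """L = L_Color + lambda_D * L_Depth."""
    return color_loss + lambda_d * d_loss

# Toy ray: 60 bins between t = 0 and t = 6, with one dense surface near t = 2.
t_edges = np.linspace(0.0, 6.0, 61)
t_mid = 0.5 * (t_edges[:-1] + t_edges[1:])
sigma = 50.0 * np.exp(-((t_mid - 2.0) ** 2) / (2.0 * 0.1 ** 2))

h, delta = ray_termination(sigma, t_edges)
l_depth = depth_loss(h, t_mid, delta, depth=2.0, depth_std=0.2)
print(total_loss(color_loss=0.02, d_loss=l_depth, lambda_d=0.1))
```

In this toy setup, moving the density peak away from the keypoint depth raises `l_depth`, which is the behavior the depth term is meant to penalize. The weight `lambda_d=0.1` is a placeholder, not a value taken from the paper.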

Figure 2 : Few view NeRF. NeRF is susceptible to overfitting when given few training views. As seen by the PSNR gap between train and test renders (left), NeRF has overfit and fails at synthesizing novel views. Further, the depth map (right) and depth error (middle) for NeRF suggest that its density

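Experimental results

Research questions

RQ1: Does depth supervision from sparse COLMAP keypoints improve NeRF's performance when only a few views are available?
RQ2: Can DS-NeRF accelerate training while maintaining or improving rendering quality?
RQ3: Is depth supervision compatible with other NeRF variants and other depth sources (RGB-D, depth sensors)?
RQ4: How does depth supervision affect depth estimation accuracy on test views?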

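Key findings

DS-NeRF renders better images from fewer training views and trains 2-3x faster than vanilla NeRF.
Depth supervision reduces depth error on test views, indicating improved geometry recovery.
The KL-divergence-based depth loss often produces fewer artifacts than MSE-based depth supervision.
DS-NeRF remains effective with RGB-D or RGB-D-derived depth supervision, which enables denser depth priors.
DS-NeRF and its variants outperform the baselines in few-view settings on the NeRF Real, DTU, and Redwood datasets.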
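For contrast with the KL-based loss mentioned in the findings, the sketch below shows one common form of MSE depth supervision: the squared error between a ray's expected termination depth and the keypoint depth. This form is an assumption made for illustration, since the review does not spell out the exact MSE baseline, and the function names and toy rays are hypothetical.

```python
import numpy as np

def expected_depth(sigma, t_edges):
    """Expected termination depth: sum over bins of weight_k * t_k."""
    delta = np.diff(t_edges)                  # bin widths along the ray
    alpha = 1.0 - np.exp(-sigma * delta)      # chance of terminating inside each bin
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    weights = trans * alpha                   # discrete termination weights
    t_mid = 0.5 * (t_edges[:-1] + t_edges[1:])
    return float(np.sum(weights * t_mid))

def mse_depth_loss(sigma, t_edges, depth):
    """Squared error between the expected depth and the supervised depth."""
    return (expected_depth(sigma, t_edges) - depth) ** 2

# Toy rays: one surface near t = 2 and one near t = 4, keypoint depth at 2.
t_edges = np.linspace(0.0, 6.0, 61)
t_mid = 0.5 * (t_edges[:-1] + t_edges[1:])
sigma_near = 50.0 * np.exp(-((t_mid - 2.0) ** 2) / (2.0 * 0.1 ** 2))
sigma_far = 50.0 * np.exp(-((t_mid - 4.0) ** 2) / (2.0 * 0.1 ** 2))

print(mse_depth_loss(sigma_near, t_edges, depth=2.0))
print(mse_depth_loss(sigma_far, t_edges, depth=2.0))
```

An MSE term of this kind constrains only the mean of the termination weights, so differently shaped distributions with the same mean receive the same loss. A distribution-matching loss also penalizes the shape, which may be one reason for the difference in artifacts noted above.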

Figure 3 : Ray Termination Distribution. (a) We plot various NeRF components over the distance traveled by the ray. Even if a ray traverses through multiple objects (as indicated by the multiple peaks of density $\sigma(t)$ ), we find that the ray termination distribution $h(t)$ is still unimodal. W

This review was generated by AI and checked by a human editor.