QUICK REVIEW

[Paper Review] Animatable Neural Radiance Fields from Monocular RGB Videos

Jianchuan Chen, Y. Zhang|arXiv (Cornell University)|Jun 25, 2021

Human Pose and Action Recognition | Citations: 46

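One-Line Summary

This paper integrates SMPL-based pose-guided deformation with NeRF to reconstruct and render an animatable 3D human avatar from monocular RGB video. It jointly optimizes the SMPL parameters and the NeRF to improve detail and to enable animation in novel poses.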

ABSTRACT

We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos. Our approach extends neural radiance fields (NeRF) to the dynamic scenes with human movements via introducing explicit pose-guided deformation while learning the scene representation network. In particular, we estimate the human pose for each frame and learn a constant canonical space for the detailed human template, which enables natural shape deformation from the observation space to the canonical space under the explicit control of the pose parameters. To compensate for inaccurate pose estimation, we introduce the pose refinement strategy that updates the initial pose during the learning process, which not only helps to learn more accurate human reconstruction but also accelerates the convergence. In experiments we show that the proposed approach achieves 1) implicit human geometry and appearance reconstruction with high-quality details, 2) photo-realistic rendering of the human from novel views, and 3) animation of the human with novel poses.

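Motivation and Objectives

Achieve high-quality 3D human reconstruction from monocular RGB video without expensive equipment.
Propose an explicit SMPL-guided deformation for learning a canonical NeRF space, yielding a detailed human template.
Jointly optimize the NeRF and SMPL parameters to improve convergence and reconstruction accuracy.
Enable novel-view rendering and novel-pose animation of the reconstructed person.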

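Proposed Method

Introduce an animatable NeRF that maps a 3D position, together with the SMPL shape and pose, to color and density.
Use a pose-guided deformation that transforms points from the observation space to the canonical space via a weighted blend (blend skinning) over nearby SMPL vertices.
Generate images from the neural radiance field by volume rendering with a 3D mask derived from a geometric prior.
Jointly optimize the NeRF and SMPL parameters by analysis-by-synthesis, refining the pose to correct the SMPL estimates during training.
Incorporate background regularization and pose regularization to stabilize the optimization.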
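
The deformation and rendering steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-distance weighting, the neighbor count `k`, the per-vertex `(R, t)` layout, and the function names `blend_to_canonical` and `composite_ray` are all assumptions made for illustration. The first function blends the observation-to-canonical transforms of the nearest SMPL vertices. The second applies standard NeRF alpha compositing along a ray, with a boolean mask that zeroes the density of samples outside the body region.

```python
import math

def blend_to_canonical(x, verts, transforms, k=2, eps=1e-8):
    """Map an observation-space point x into canonical space.

    verts:      posed SMPL vertex positions, each a 3-tuple.
    transforms: per-vertex (R, t) pairs mapping observation -> canonical,
                with R a 3x3 nested list and t a 3-list (hypothetical layout).
    """
    # Distance from x to every vertex, then the indices of the k nearest.
    dists = [math.dist(x, v) for v in verts]
    nearest = sorted(range(len(verts)), key=dists.__getitem__)[:k]
    # Inverse-distance weights, normalized to sum to 1 (an assumption).
    raw = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(raw)
    out = [0.0, 0.0, 0.0]
    for w, i in zip(raw, nearest):
        R, t = transforms[i]
        for r in range(3):
            # Linear blend skinning: sum_i w_i * (R_i x + t_i).
            out[r] += (w / total) * (sum(R[r][c] * x[c] for c in range(3)) + t[r])
    return out

def composite_ray(sigmas, colors, deltas, inside_mask):
    """Alpha-composite samples along one ray, ignoring samples outside the mask."""
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta, inside in zip(sigmas, colors, deltas, inside_mask):
        if not inside:
            sigma = 0.0  # masked samples contribute no density
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for ch in range(3):
            pixel[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha
    return pixel
```

In a full pipeline, the canonical-space point returned by `blend_to_canonical` would be passed to the NeRF network to obtain the density and color that `composite_ray` consumes. That network is omitted here.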

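Experimental Results

Research Questions

RQ1: Can SMPL-guided deformation yield a controllable, animatable representation for a NeRF learned from monocular video?
RQ2: Does jointly optimizing the NeRF and SMPL parameters improve 3D geometry and appearance quality compared with baseline methods?
RQ3: Is pose refinement necessary for robust reconstruction and animation from monocular input?
RQ4: How do the choice of canonical pose and background regularization affect reconstruction and novel-pose synthesis?
RQ5: How well does the proposed method support novel-view synthesis and novel-pose synthesis of the reconstructed person?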

Key Findings

Subject ID	NeRF PSNR	SMPLpix PSNR	NB PSNR	NeRF+U PSNR	OURS PSNR	NeRF SSIM	SMPLpix SSIM	NB SSIM	NeRF+U SSIM	OURS SSIM	NeRF LPIPS	SMPLpix LPIPS	NB LPIPS	NeRF+U LPIPS	OURS LPIPS
male-3-casual	20.64	23.74	24.94	23.88	29.37	.8993	.9229	.9428	.9329	.9703	.1008	.0222	.0326	.0438	.0168
male-4-casual	20.29	22.43	24.71	23.13	28.37	.8803	.9095	.9469	.9276	.9605	.1445	.0305	.0423	.0554	.0268
female-3-casual	17.43	22.33	23.87	22.45	28.91	.8605	.9288	.9504	.9413	.9743	.1696	.0270	.0346	.0498	.0215
female-4-casual	17.63	23.35	24.37	23.13	28.90	.8578	.9258	.9451	.9276	.9678	.1827	.0239	.0382	.0556	.0174
iper-009-4-1	19.54	20.25	25.46	21.56	30.23	.7870	.9018	.9378	.8667	.9466	.2641	.0293	.0558	.1197	.0335
iper-023-1-1	17.41	19.48	25.44	20.25	27.26	.7623	.8945	.9330	.8656	.9457	.2769	.0442	.0493	.1109	.0285
iper-002-1-1	16.01	19.64	23.06	18.75	26.99	.7500	.8886	.9394	.8708	.9502	.3363	.0392	.0476	.1205	.0285
iper-026-1-1	17.09	19.03	23.77	18.48	26.85	.7580	.8574	.9351	.8623	.9542	.2928	.0494	.0550	.1282	.0315
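
For readers unfamiliar with the PSNR columns, the sketch below shows the conventional definition, PSNR = 10 * log10(MAX^2 / MSE), where higher is better. It assumes flattened pixel values in [0, 1]. The function name `psnr` and its `max_value` parameter are illustrative, and this is not necessarily the evaluation script behind the numbers in the table.

```python
import math

def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    if not rendered or len(rendered) != len(reference):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

SSIM and LPIPS, the other two metrics in the table, depend on local image structure and on a pretrained network respectively, so they are not reproduced here.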

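Achieves high-quality implicit geometry and appearance from monocular video, reproducing observable details such as clothing wrinkles and hair.
Leverages the canonical NeRF space to render photorealistic novel views of the animated person.
Outperforms NeRF variants without explicit pose guidance on 3D reconstruction metrics (P2S and Chamfer distance).
On the iPER and People-Snapshot datasets, achieves better novel-pose synthesis than SMPLpix and NB.
Pose refinement during training substantially improves rendering quality when the SMPL estimates are imperfect.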

This review was generated by AI and checked by a human editor.