QUICK REVIEW

[Paper Review] OMEGA-Avatar: One-shot Modeling of 360° Gaussian Avatars

Zehao Xia, Yiqun Wang|arXiv (Cornell University)|Feb 12, 2026

3D Shape Modeling and Analysis|Citations: 0

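One-line summary

Summary: OMEGA-Avatar is a feed-forward method that generates a generalizable, 360°-complete, animatable 3D Gaussian head avatar from a single image, using multi-view diffusion priors and a UV-based feature splatting pipeline.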

ABSTRACT

Creating high-fidelity, animatable 3D avatars from a single image remains a formidable challenge. We identified three desirable attributes of avatar generation: 1) the method should be feed-forward, 2) model a 360° full-head, and 3) should be animation-ready. However, current work addresses only two of the three points simultaneously. To address these limitations, we propose OMEGA-Avatar, the first feed-forward framework that simultaneously generates a generalizable, 360°-complete, and animatable 3D Gaussian head from a single image. Starting from a feed-forward and animatable framework, we address the 360° full-head avatar generation problem with two novel components. First, to overcome poor hair modeling in full-head avatar generation, we introduce a semantic-aware mesh deformation module that integrates multi-view normals to optimize a FLAME head with hair while preserving its topology structure. Second, to enable effective feed-forward decoding of full-head features, we propose a multi-view feature splatting module that constructs a shared canonical UV representation from features across multiple views through differentiable bilinear splatting, hierarchical UV mapping, and visibility-aware fusion. This approach preserves both global structural coherence and local high-frequency details across all viewpoints, ensuring 360° consistency without per-instance optimization. Extensive experiments demonstrate that OMEGA-Avatar achieves state-of-the-art performance, significantly outperforming existing baselines in 360° full-head completeness while robustly preserving identity across different viewpoints.

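Motivation and objectives

Generate high-fidelity, animatable 3D avatars from a single image within a feed-forward framework.
Achieve 360° full-head modeling by combining multi-view priors with semantic-aware deformation while preserving mesh topology.
Ensure animation readiness by enabling explicit expression and pose control through a parametric FLAME-based representation.
Develop a generalizable approach that maintains high geometric and appearance fidelity without per-subject optimization.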

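Proposed method

Synthesize multi-view RGB images and normal maps from a single portrait image using a diffusion model.
Apply semantic-aware mesh deformation, guided by the multi-view normals, to optimize a FLAME head with hair while preserving its topology.
Build a canonical full-head representation with two Gaussian branches: vertex Gaussians for expression control and UV Gaussians for appearance.
Fuse multi-view features into a shared canonical UV map via differentiable bilinear splatting, with hierarchical UV mapping and visibility-aware fusion.
Decode the UV and vertex Gaussians, bind them to the deformed FLAME mesh, and drive the animation with the target expression and pose; a neural renderer then refines the result.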
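
The splatting and fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `splat_bilinear`, the scalar features, and the per-sample visibility weights are assumptions made for illustration, and the hierarchical UV mapping is left out entirely.

```python
def splat_bilinear(samples, height, width, eps=1e-8):
    """Scatter feature samples onto a height x width UV grid (illustrative only).

    samples: iterable of (u, v, feature, visibility), with u and v in [0, 1],
    feature a float, and visibility a non-negative weight (hypothetical inputs).
    Returns (fused, weight): fused holds the visibility-weighted average per
    texel, weight holds the accumulated splat weight per texel.
    """
    accum = [[0.0] * width for _ in range(height)]
    weight = [[0.0] * width for _ in range(height)]

    for u, v, feature, visibility in samples:
        # Continuous texel coordinates of the sample.
        x = u * (width - 1)
        y = v * (height - 1)
        x0, y0 = int(x), int(y)
        x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
        fx, fy = x - x0, y - y0

        # Bilinear weights of the four neighbouring texels.
        corners = (
            (y0, x0, (1 - fx) * (1 - fy)),
            (y0, x1, fx * (1 - fy)),
            (y1, x0, (1 - fx) * fy),
            (y1, x1, fx * fy),
        )
        for row, col, w in corners:
            w *= visibility  # less visible samples contribute less
            accum[row][col] += w * feature
            weight[row][col] += w

    # Normalise; texels that received no weight stay at 0.0.
    fused = [
        [accum[r][c] / weight[r][c] if weight[r][c] > eps else 0.0
         for c in range(width)]
        for r in range(height)
    ]
    return fused, weight


# Two views observe the same texel with different visibility.
fused, weight = splat_bilinear(
    [(0.0, 0.0, 2.0, 1.0), (0.0, 0.0, 4.0, 3.0)], height=2, width=2
)
print(fused[0][0])
```

Two views that land on the same texel are averaged in proportion to their visibility, so a poorly visible view contributes little. The bilinear weights vary continuously with (u, v), which is what would let gradients reach both the features and the UV coordinates if the same operation were written in an autodiff framework.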

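Experimental results

Research questions

RQ1: Can a single-image pipeline generate a 360°-complete, animatable 3D head without per-subject optimization?
RQ2: How can multi-view priors be brought into a UV feature space that supports feed-forward decoding while maintaining cross-view consistency?
RQ3: Do semantic-aware deformation and differentiable feature fusion improve 3D completeness and identity preservation across viewpoints?

Key findings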

Method	PSNR_Ava	SSIM_Ava	LPIPS_Ava	CSIM_Ava	DS_Ava	PSNR_NeR	SSIM_NeR	LPIPS_NeR	CSIM_NeR	DS_NeR
PanoHead	17.995	0.5932	0.2592	0.4556	0.1663	17.151	0.6098	0.2706	0.5742	0.1082
SphereHead	18.932	0.6238	0.2436	0.4714	0.1509	17.446	0.6183	0.2628	0.5781	0.1053
GAGAvatar	22.802	0.7714	0.1719	0.4682	0.1762	22.556	0.8027	0.1505	0.6867	0.1033
SOAP	21.335	0.7287	0.1948	0.5783	0.1890	19.539	0.7471	0.1975	0.6855	0.1128
LAM	21.802	0.7314	0.1997	0.4842	0.1818	20.263	0.7465	0.1853	0.6321	0.1118
Ours	23.244	0.7734	0.1592	0.5403	0.1651	23.221	0.8051	0.1435	0.6714	0.0950
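
As a quick cross-check, the per-metric winners can be read off the table programmatically. The sketch below copies the PSNR, SSIM, LPIPS, and CSIM values from the table and assumes the usual conventions (higher is better for PSNR, SSIM, and CSIM; lower is better for LPIPS). DS is omitted because this review does not define it or its direction; the helper name `best_methods` is hypothetical.

```python
# Scores copied from the table above, in the order (PSNR, SSIM, LPIPS, CSIM).
# "Ava" and "NeR" are the column suffixes used in the table.
RESULTS = {
    "Ava": {
        "PanoHead": (17.995, 0.5932, 0.2592, 0.4556),
        "SphereHead": (18.932, 0.6238, 0.2436, 0.4714),
        "GAGAvatar": (22.802, 0.7714, 0.1719, 0.4682),
        "SOAP": (21.335, 0.7287, 0.1948, 0.5783),
        "LAM": (21.802, 0.7314, 0.1997, 0.4842),
        "Ours": (23.244, 0.7734, 0.1592, 0.5403),
    },
    "NeR": {
        "PanoHead": (17.151, 0.6098, 0.2706, 0.5742),
        "SphereHead": (17.446, 0.6183, 0.2628, 0.5781),
        "GAGAvatar": (22.556, 0.8027, 0.1505, 0.6867),
        "SOAP": (19.539, 0.7471, 0.1975, 0.6855),
        "LAM": (20.263, 0.7465, 0.1853, 0.6321),
        "Ours": (23.221, 0.8051, 0.1435, 0.6714),
    },
}

# Assumed metric directions: True means higher is better.
METRICS = (("PSNR", True), ("SSIM", True), ("LPIPS", False), ("CSIM", True))


def best_methods(results):
    """Return {(column_group, metric): best-scoring method name}."""
    winners = {}
    for group, rows in results.items():
        for idx, (metric, higher_is_better) in enumerate(METRICS):
            pick = max if higher_is_better else min
            winners[(group, metric)] = pick(rows, key=lambda m: rows[m][idx])
    return winners


for key, method in sorted(best_methods(RESULTS).items()):
    print(key, method)
```

Under these assumptions the proposed method ("Ours") leads on PSNR, SSIM, and LPIPS in both column groups, while SOAP has the highest CSIM on Ava and GAGAvatar the highest CSIM on NeR.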

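OMEGA-Avatar achieves state-of-the-art 360° full-head generation, improving on the baselines in both geometry and texture fidelity.
On the Avatar-256 and NeRSemble datasets (columns suffixed _Ava and _NeR), OMEGA-Avatar has the best PSNR, SSIM, and LPIPS in the table above. Its CSIM is second to SOAP on Avatar-256 and third, behind GAGAvatar and SOAP, on NeRSemble. 360° consistency is also reported to improve.
The dual Gaussian head representation, with vertex Gaussians and UV Gaussians, enables animation that retains expression control while keeping the geometry stable.
Multi-view feature splatting with hierarchical UV maps and visibility-aware fusion provides robust multi-view integration for coherent full-head decoding.
Diffusion-based multi-view synthesis priors can plausibly reconstruct the back of the head and hair detail without per-instance optimization.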

This review was generated by AI and checked by a human editor.