QUICK REVIEW

[Paper Review] DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling

Xiaoguang Han, Chang Gao|The HKU Scholars Hub (University of Hong Kong)|Jun 7, 2017

Face recognition and analysis|References: 46|Citations: 78

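One-Sentence Summary

This paper proposes a CNN-based deep regression system that converts 2D face sketches into 3D face or caricature models using a bilinear morphable representation. The system supports interactive sketching and gesture-based refinement, and is trained on an expanded database of 15,000 models.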

ABSTRACT

Face modeling has been paid much attention in the field of visual computing. There exist many scenarios, including cartoon characters, avatars for social media, 3D face caricatures as well as face-related art and design, where low-cost interactive face modeling is a popular approach especially among amateur users. In this paper, we propose a deep learning based sketching system for 3D face and caricature modeling. This system has a labor-efficient sketching interface, that allows the user to draw freehand imprecise yet expressive 2D lines representing the contours of facial features. A novel CNN based deep regression network is designed for inferring 3D face models from 2D sketches. Our network fuses both CNN and shape based features of the input sketch, and has two independent branches of fully connected layers generating independent subsets of coefficients for a bilinear face representation. Our system also supports gesture based interactions for users to further manipulate initial face models. Both user studies and numerical results indicate that our sketching system can help users create face models quickly and effectively. A significantly expanded face database with diverse identities, expressions and levels of exaggeration is constructed to promote further research and evaluation of face modeling techniques.

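Motivation and Objectives

Enable low-effort yet expressive 3D face modeling for amateur users and designers.
Develop a deep regression model that maps 2D sketches to 3D face mesh coefficients.
Use a bilinear morphable representation to separate identity from expression.
Provide an interactive interface with three modes: initial sketching, follow-up sketching, and gesture-based refinement.
Build and release a large, diverse face database for training and evaluation.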

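Proposed Method

Represent the 2D sketch as a 256x256 binary image and extract CNN-based features with an AlexNet-inspired network.
Jointly learn a bilinear morphable model with identity and expression modes, so that 3D vertices can be reconstructed from the coefficients predicted from the sketch.
Predict the identity (u) and expression (v) vectors with two independent branches of the network to avoid interference between them.
Incorporate shape-level input by sampling points on the silhouette and contours into a fixed-length vector that is fed to each branch.
Add a vertex loss layer that minimizes the L2 distance between ground-truth and predicted vertex positions, which makes end-to-end training possible.
Train in three stages: first identity/expression classification, then u-v regression, and finally fine-tuning with the vertex-level loss. The training data is augmented with both synthetic and hand-drawn sketches.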
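
The forward path described above (contour vector and CNN features, two branches, bilinear reconstruction, vertex loss) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy dimensions, the single hidden layer per branch, the random stand-ins for the CNN features and the bilinear core tensor, and the use of a mean squared distance for the vertex loss are all hypothetical choices made for the example.

```python
import numpy as np

def resample_contour(points, n_samples):
    """Resample a 2D polyline to n_samples points evenly spaced by arc length.

    Assumes the polyline has nonzero total length. Returns a flat vector of
    length 2 * n_samples, i.e. a fixed-length shape feature.
    """
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)   # segment lengths
    arc = np.concatenate([[0.0], np.cumsum(seg)])           # cumulative arc length
    targets = np.linspace(0.0, arc[-1], n_samples)
    x = np.interp(targets, arc, points[:, 0])
    y = np.interp(targets, arc, points[:, 1])
    return np.stack([x, y], axis=1).ravel()

def branch(features, w1, w2):
    """One fully connected branch: linear -> ReLU -> linear (assumed layout)."""
    hidden = np.maximum(features @ w1, 0.0)
    return hidden @ w2

def reconstruct_vertices(core, u, v):
    """Bilinear model: contract a core tensor of shape (3V, n_id, n_exp) with u and v."""
    flat = np.einsum("kij,i,j->k", core, u, v)
    return flat.reshape(-1, 3)

def vertex_loss(pred, target):
    """Mean squared L2 distance between corresponding vertices."""
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

# Toy sizes chosen only for illustration (not the paper's dimensions).
rng = np.random.default_rng(0)
n_vertices, n_id, n_exp, n_cnn, n_samples, n_hidden = 5, 4, 3, 8, 6, 16

contour = [[0.0, 0.0], [2.0, 0.0], [2.0, 2.0]]       # hypothetical stroke
shape_vec = resample_contour(contour, n_samples)      # fixed-length shape feature
cnn_feat = rng.normal(size=n_cnn)                     # stand-in for CNN features
fused = np.concatenate([cnn_feat, shape_vec])         # input shared by both branches

# Each branch has its own weights, so u and v share no fully connected parameters.
u = branch(fused, rng.normal(size=(fused.size, n_hidden)), rng.normal(size=(n_hidden, n_id)))
v = branch(fused, rng.normal(size=(fused.size, n_hidden)), rng.normal(size=(n_hidden, n_exp)))

core = rng.normal(size=(3 * n_vertices, n_id, n_exp))  # stand-in bilinear core tensor
pred = reconstruct_vertices(core, u, v)
target = rng.normal(size=(n_vertices, 3))              # stand-in ground-truth mesh
loss = vertex_loss(pred, target)
```

Because the reconstruction is linear in u for fixed v (and vice versa), a loss defined on vertices can be backpropagated through the bilinear layer to both branches, which is what allows the final fine-tuning stage to operate on vertex positions rather than on coefficients.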

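Experimental Results

Research Questions

RQ1: Can a CNN-based regression model accurately infer 3D face shape from freehand 2D sketches?
RQ2: How can a bilinear morphable representation be used to separate identity from expression in sketch-to-3D inference?
RQ3: Can a low-effort sketching interface with follow-up sketching and gesture-based refinement produce usable 3D faces quickly?
RQ4: How does an expanded, diverse 3D face database affect the training and evaluation of sketch-based inference?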

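Key Findings

The deep regression network can infer 3D face meshes from 2D sketches through the bilinear identity-expression model.
The two-branch architecture reduces interference between the identity and expression coefficients and improves reconstruction accuracy.
The system supports three interaction modes (initial sketching, follow-up sketching, and gesture-based refinement) and updates the model in real time.
The expanded database of 15,000 meshes (150 identities, 25 expressions, 4 exaggeration levels) improves the training and evaluation of sketch-based face modeling.
In the user study, amateur users were able to create 3D face or caricature models within minutes (the examples took under 10 minutes, with an average of about 8 minutes).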
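
As a quick consistency check on the database size quoted above, the three factors multiply out to the stated total. The snippet below assumes the database is a full grid, with every identity available in every expression at every exaggeration level; that layout is an assumption made for the example, and the constant and function names are hypothetical.

```python
from itertools import product

N_IDENTITIES = 150
N_EXPRESSIONS = 25
N_EXAGGERATION_LEVELS = 4

def database_index():
    """Enumerate (identity, expression, exaggeration level) triples of a full grid."""
    return list(product(range(N_IDENTITIES),
                        range(N_EXPRESSIONS),
                        range(N_EXAGGERATION_LEVELS)))

entries = database_index()
print(len(entries))  # 15000
```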


This review was generated by AI and checked by a human editor.