QUICK REVIEW

[Paper Review] Face shape classification using Inception v3

Adonis Emmanuel DC. Tio | arXiv (Cornell University) | Nov 15, 2019

Face and Expression Recognition | References: 5 | Citations: 32

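One-line summary

This paper proposes a deep learning approach based on transfer learning with Inception v3 and reports an overall accuracy of up to 84.8% on a dataset of 500 images of female celebrities. Because it needs no manual feature engineering, it outperforms conventional methods such as SVM, LDA, and KNN. As far as the authors know, it is the first use of a CNN for this task, and the code is publicly available.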

ABSTRACT

In this paper, we present experimental results obtained from retraining the last layer of the Inception v3 model in classifying images of human faces into one of five basic face shapes. The accuracy of the retrained Inception v3 model was compared with that of the following classification methods that uses facial landmark distance ratios and angles as features: linear discriminant analysis (LDA), support vector machines with linear kernel (SVM-LIN), support vector machines with radial basis function kernel (SVM-RBF), artificial neural networks or multilayer perceptron (MLP), and k-nearest neighbors (KNN). All classifiers were trained and tested using a total of 500 images of female celebrities with known face shapes collected from the Internet. Results show that training accuracy and overall accuracy ranges from 98.0% to 100% and from 84.4% to 84.8% for Inception v3 and from 50.6% to 73.0% and from 36.4% to 64.6% for the other classifiers depending on the training set size used. This result shows that the retrained Inception v3 model was able to fit the training data well and outperform the other classifiers without the need to handpick specific features to include in model training. Future work should consider expanding the labeled dataset, preferably one that can also be freely distributed to the research community, so that proper model cross-validation can be performed. As far as we know, this is the first in the literature to use convolutional neural networks in face-shape classification. The scripts are available at https://github.com/adonistio/inception-face-shape-classifier.

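Motivation and objectives

To examine whether face shape classification can be automated with deep convolutional neural networks.
To evaluate the performance of Inception v3 against hand-crafted features derived from facial landmarks.
To show that transfer learning from a pretrained model can reach high accuracy without manual feature engineering.
To provide a publicly available implementation that supports future research.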

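Proposed method

Retrained the last layer of the pretrained Inception v3 model to classify face images into five face shapes.
Used a dataset of 500 images of female celebrities with known face shapes, collected from the Internet.
Extracted distance ratios and angles between facial landmarks as input features for the conventional classifiers (LDA, SVM, MLP, KNN).
Trained and evaluated all models at several training set sizes to assess generalization.
Reused ImageNet-pretrained weights to reduce training time and data requirements.
Released the code to support reproducibility and future research.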
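
Retraining only the last layer amounts to fitting a softmax classifier on feature vectors produced by the frozen network. The following is a minimal sketch of that idea, not the author's script. It assumes each image has already been turned into a fixed feature vector by the frozen layers. The 2-D toy features, the function names (`toy_features`, `train_last_layer`, `predict`, `accuracy`), and the hyperparameters are all illustrative assumptions.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, features):
    """Linear output layer: one score per class for a single feature vector."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def train_last_layer(features, labels, num_classes, epochs=200, lr=0.1, seed=0):
    """Fit only a softmax output layer with SGD; the feature extractor is untouched."""
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            probs = softmax(class_scores(weights, biases, features[i]))
            for c in range(num_classes):
                # Cross-entropy gradient w.r.t. score c: p_c - 1[c == label]
                grad = probs[c] - (1.0 if c == labels[i] else 0.0)
                for d in range(dim):
                    weights[c][d] -= lr * grad * features[i][d]
                biases[c] -= lr * grad
    return weights, biases


def predict(weights, biases, feature_vector):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, feature_vector)
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(weights, biases, features, labels):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict(weights, biases, f) == y for f, y in zip(features, labels))
    return hits / len(labels)


def toy_features(num_classes=5, per_class=20, seed=1):
    """Stand-in for frozen-network features: one noisy 2-D cluster per class."""
    rng = random.Random(seed)
    features, labels = [], []
    for c in range(num_classes):
        angle = 2 * math.pi * c / num_classes
        for _ in range(per_class):
            features.append([math.cos(angle) + rng.gauss(0, 0.1),
                             math.sin(angle) + rng.gauss(0, 0.1)])
            labels.append(c)
    return features, labels


if __name__ == "__main__":
    feats, labs = toy_features()
    w, b = train_last_layer(feats, labs, num_classes=5)
    print("training accuracy:", accuracy(w, b, feats, labs))
```

In a real pipeline the feature vectors would come from the pretrained network rather than from `toy_features`. Only `weights` and `biases` are updated, which is why this kind of retraining needs far less data and compute than training the full network.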

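Experimental results

Research questions

RQ1: Can a pretrained CNN such as Inception v3 be retrained and applied effectively to face shape classification without large amounts of data or feature engineering?
RQ2: How does Inception v3 perform compared with conventional machine learning models that use facial-landmark-based features?
RQ3: On a small, real-world dataset, can transfer learning with Inception v3 achieve higher accuracy than conventional methods?
RQ4: Does the model generalize well across different training set sizes while avoiding overfitting?
RQ5: Is this study the first to apply a CNN to face shape classification?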

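Key findings

The retrained Inception v3 model reached training accuracy of 98.0% to 100%, indicating that it fit the training data well.
Overall accuracy of Inception v3 ranged from 84.4% to 84.8%, well above every other classifier tested.
The landmark-based classifiers reached overall accuracy between 36.4% and 64.6%; SVM-RBF, the strongest of them, peaked at 64.6%, and KNN and LDA also stayed below 65%.
Inception v3 outperformed every baseline model at every training set size, which indicates robustness.
Unlike the other methods, which relied on facial landmark distance ratios and angles, the model required no manual feature selection.
The authors state that, to their knowledge, this is the first work in the literature to use a CNN for face shape classification.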
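
The hand-picked features used by the baseline classifiers, distance ratios and angles between facial landmarks, can be computed from 2-D landmark coordinates. The sketch below is illustrative only: the review does not list which landmarks or ratios the paper used, so the landmark names (`hairline`, `chin`, `cheek_left`, and so on), the three chosen features, and the `landmark_features` function are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def angle_at(vertex, a, b):
    """Angle in degrees at `vertex` between the rays vertex->a and vertex->b."""
    ang_a = math.atan2(a[1] - vertex[1], a[0] - vertex[0])
    ang_b = math.atan2(b[1] - vertex[1], b[0] - vertex[0])
    diff = abs(ang_a - ang_b)
    return math.degrees(min(diff, 2 * math.pi - diff))


def landmark_features(lm):
    """Build a small feature dict from hypothetical named landmarks."""
    face_height = distance(lm["hairline"], lm["chin"])
    cheek_width = distance(lm["cheek_left"], lm["cheek_right"])
    jaw_width = distance(lm["jaw_left"], lm["jaw_right"])
    return {
        # Ratios, not raw distances, so the features ignore image scale.
        "width_to_height": cheek_width / face_height,
        "jaw_to_cheek": jaw_width / cheek_width,
        "chin_angle": angle_at(lm["chin"], lm["jaw_left"], lm["jaw_right"]),
    }


# Made-up landmark coordinates in pixels (image y axis points down).
example_landmarks = {
    "hairline": (0.0, 0.0),
    "chin": (0.0, 10.0),
    "cheek_left": (-4.0, 4.0),
    "cheek_right": (4.0, 4.0),
    "jaw_left": (-3.0, 7.0),
    "jaw_right": (3.0, 7.0),
}

if __name__ == "__main__":
    print(landmark_features(example_landmarks))
```

Every feature here has to be chosen and coded by hand, which is the step the retrained Inception v3 model avoids. The resulting values would then be fed to a classifier such as LDA, SVM, MLP, or KNN.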

This review was generated by AI and checked by a human editor.