QUICK REVIEW

[Paper Review] B-CNN: Branch Convolutional Neural Network for Hierarchical Classification

Xinqi Zhu, Michael Bain|arXiv (Cornell University)|Sep 28, 2017

Advanced Neural Network Applications | References: 29 | Citations: 108

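One-line summary

B-CNN adds branch outputs to a CNN so that it predicts along a coarse-to-fine hierarchy, and it is trained with the Branch Training strategy (BT-strategy); it improves over baseline CNNs on MNIST, CIFAR-10, and CIFAR-100.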

ABSTRACT

Convolutional Neural Network (CNN) image classifiers are traditionally designed to have sequential convolutional layers with a single output layer. This is based on the assumption that all target classes should be treated equally and exclusively. However, some classes can be more difficult to distinguish than others, and classes may be organized in a hierarchy of categories. At the same time, a CNN is designed to learn internal representations that abstract from the input data based on its hierarchical layered structure. So it is natural to ask if an inverse of this idea can be applied to learn a model that can predict over a classification hierarchy using multiple output layers in decreasing order of class abstraction. In this paper, we introduce a variant of the traditional CNN model named the Branch Convolutional Neural Network (B-CNN). A B-CNN model outputs multiple predictions ordered from coarse to fine along the concatenated convolutional layers corresponding to the hierarchical structure of the target classes, which can be regarded as a form of prior knowledge on the output. To learn with B-CNNs a novel training strategy, named the Branch Training strategy (BT-strategy), is introduced which balances the strictness of the prior with the freedom to adjust parameters on the output layers to minimize the loss. In this way we show that CNN based models can be forced to learn successively coarse to fine concepts in the internal layers at the output stage, and that hierarchical prior knowledge can be adopted to boost CNN models' classification performance. Our models are evaluated to show that the B-CNN extensions improve over the corresponding baseline CNN on the benchmark datasets MNIST, CIFAR-10 and CIFAR-100.

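Motivation and objectives

Motivate hierarchical classification within a CNN and formalize it using the class hierarchy.
Introduce the B-CNN architecture, which outputs multiple predictions from the coarse level to the fine level.
Propose the BT-strategy, which balances the prior hierarchy information against end-to-end learning.
Show empirical improvements over conventional CNN baselines on MNIST, CIFAR-10, and CIFAR-100.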

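Proposed method

Attach multiple branch networks at different depths to produce predictions corresponding to the levels of the hierarchical label tree.
Define the loss as a weighted sum of cross-entropy losses over all hierarchy levels (Equation 1).
Use loss weights A_k (summing to 1) to control each level's contribution to the loss (Section 3.3).
Introduce the Branch Training strategy (BT-strategy), which shifts the loss weights from coarse levels to fine levels during training and mitigates vanishing gradients (Section 3.4).
Branches can be implemented as fully connected networks on top of the CNN features (simplified in the experiments).
Evaluation compares B-CNN variants with baselines on MNIST, CIFAR-10, and CIFAR-100, using SGD and standard CNN components (Tables 1-3).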
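The weighted loss and the weight shift described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`bcnn_loss`, `weights_for_epoch`), the three-level hierarchy, and every number in `SCHEDULE` are hypothetical and chosen only to show the mechanism. The only properties taken from the review are that the loss is a weighted sum of per-level cross-entropy terms, that the weights A_k sum to 1, and that the weight moves from coarse to fine levels during training.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one sample given raw logits and the true class index."""
    return -math.log(softmax(logits)[target])


def bcnn_loss(branch_logits, targets, weights):
    """Weighted sum of per-level cross-entropy losses for one sample.

    branch_logits: one logit list per hierarchy level, ordered coarse to fine.
    targets:       true class index at each level.
    weights:       loss weights A_k, one per level, expected to sum to 1.
    """
    assert len(branch_logits) == len(targets) == len(weights)
    assert abs(sum(weights) - 1.0) < 1e-9, "loss weights should sum to 1"
    return sum(
        a * cross_entropy(z, t)
        for a, z, t in zip(weights, branch_logits, targets)
    )


# Hypothetical schedule (illustrative values, not taken from the paper):
# each entry is (first epoch at which it applies, weights ordered coarse to fine).
SCHEDULE = [
    (0, (0.90, 0.05, 0.05)),
    (10, (0.10, 0.80, 0.10)),
    (20, (0.00, 0.00, 1.00)),
]


def weights_for_epoch(epoch, schedule=SCHEDULE):
    """Return the loss weights of the latest schedule entry that has started."""
    current = schedule[0][1]
    for start, w in schedule:
        if epoch >= start:
            current = w
    return current
```

In a training loop, `weights_for_epoch(epoch)` would be looked up once per epoch and passed to `bcnn_loss` for every sample, so early epochs mostly fit the coarse branch and later epochs mostly fit the fine branch.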

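Experimental results

Research questions

RQ1: Can a hierarchical class structure be embedded in a CNN to obtain interpretable coarse-to-fine predictions?
RQ2: Does a branch-based loss with the BT-strategy improve performance over a flat CNN on hierarchical tasks?
RQ3: How do B-CNNs perform compared with conventional CNN baselines on MNIST, CIFAR-10, and CIFAR-100?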

Key findings

Model	MNIST	CIFAR-10	CIFAR-100
Base A	99.27%	-	-
B-CNN A	99.40%	-	-
Base B	-	82.35%	51.00%
B-CNN B	-	84.41%	57.59%
Base C	-	87.96%	62.92%
B-CNN C	-	88.22%	64.42%

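B-CNN models consistently outperform the baseline CNNs on MNIST, CIFAR-10, and CIFAR-100 (Table 3).
On MNIST, B-CNN A reaches 99.40% versus 99.27% for Base A.
On CIFAR-10, B-CNN B reaches 84.41% versus 82.35% for Base B.
On CIFAR-100, B-CNN B reaches 57.59% versus 51.00% for Base B; B-CNN C reaches 64.42% versus 62.92% for Base C.
The BT-strategy accelerates learning after the loss focus shifts to finer levels, and it can prevent the vanishing-gradient effect.
Initializing from pretrained parameters reduces the gain from the BT-strategy compared with random initialization.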
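To make the size of each improvement explicit, the short sketch below computes the difference between every B-CNN score and its baseline, using only the numbers in the table above. The dictionary layout and the names `RESULTS` and `gains` are illustrative choices, and the scores are assumed here to be accuracy percentages, so the differences are in percentage points.

```python
# (model family, dataset) -> (baseline score, B-CNN score), copied from the table.
RESULTS = {
    ("A", "MNIST"): (99.27, 99.40),
    ("B", "CIFAR-10"): (82.35, 84.41),
    ("B", "CIFAR-100"): (51.00, 57.59),
    ("C", "CIFAR-10"): (87.96, 88.22),
    ("C", "CIFAR-100"): (62.92, 64.42),
}


def gains(results):
    """Return the B-CNN minus baseline difference for each entry, rounded to 2 decimals."""
    return {key: round(bcnn - base, 2) for key, (base, bcnn) in results.items()}


for (model, dataset), delta in gains(RESULTS).items():
    print(f"Model {model} on {dataset}: +{delta:.2f} points")
```

The largest difference is for model B on CIFAR-100 (6.59 points) and the smallest is for model A on MNIST (0.13 points), where the baseline is already above 99%.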


This review was generated by AI and checked by a human editor.