QUICK REVIEW

[Paper Review] B-CNN: Branch Convolutional Neural Network for Hierarchical Classification

Xinqi Zhu, Michael Bain|arXiv (Cornell University)|Sep 28, 2017
Advanced Neural Network Applications | References: 29 | Cited by: 108
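One-Sentence Summary

B-CNN adds branch outputs to a CNN so that it predicts along a coarse-to-fine class hierarchy, and it is trained with the Branch Training strategy (BT-strategy); it outperforms the corresponding baseline CNNs on MNIST, CIFAR-10, and CIFAR-100.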

ABSTRACT

Convolutional Neural Network (CNN) image classifiers are traditionally designed to have sequential convolutional layers with a single output layer. This is based on the assumption that all target classes should be treated equally and exclusively. However, some classes can be more difficult to distinguish than others, and classes may be organized in a hierarchy of categories. At the same time, a CNN is designed to learn internal representations that abstract from the input data based on its hierarchical layered structure. So it is natural to ask if an inverse of this idea can be applied to learn a model that can predict over a classification hierarchy using multiple output layers in decreasing order of class abstraction. In this paper, we introduce a variant of the traditional CNN model named the Branch Convolutional Neural Network (B-CNN). A B-CNN model outputs multiple predictions ordered from coarse to fine along the concatenated convolutional layers corresponding to the hierarchical structure of the target classes, which can be regarded as a form of prior knowledge on the output. To learn with B-CNNs a novel training strategy, named the Branch Training strategy (BT-strategy), is introduced which balances the strictness of the prior with the freedom to adjust parameters on the output layers to minimize the loss. In this way we show that CNN based models can be forced to learn successively coarse to fine concepts in the internal layers at the output stage, and that hierarchical prior knowledge can be adopted to boost CNN models' classification performance. Our models are evaluated to show that the B-CNN extensions improve over the corresponding baseline CNN on the benchmark datasets MNIST, CIFAR-10 and CIFAR-100.

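Motivation and Objectives

  • Motivate and formalize hierarchical classification in CNNs by exploiting the class hierarchy.
  • Introduce the B-CNN architecture, which outputs multiple predictions ordered from coarse to fine.
  • Propose the BT-strategy to balance the hierarchical prior against end-to-end learning.
  • Demonstrate empirical improvements over conventional CNN baselines on MNIST, CIFAR-10, and CIFAR-100.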

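Proposed Method

  • Attach multiple branch networks at different depths of the CNN to produce hierarchical predictions that correspond to the levels of the label tree.
  • Define the loss as a weighted sum of the cross-entropy losses over all hierarchy levels (Equation 1).
  • Use loss weights A_k (summing to 1) to control how much each level contributes to the total loss (Section 3.3).
  • Introduce the Branch Training strategy (BT-strategy), which shifts the loss weights from the coarse levels to the fine levels over the course of training, to mitigate vanishing gradients (Section 3.4).
  • Branches can be implemented as fully connected networks on top of the CNN features (simplified in the experiments).
  • Evaluate B-CNN variants against baselines on MNIST, CIFAR-10, and CIFAR-100, using SGD and standard CNN components (Tables 1-3).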
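
The weighted loss and the weight shift described in the list above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`softmax_cross_entropy`, `bcnn_loss`, `weights_for_epoch`), the three-level hierarchy, the epoch boundaries, and the weight values in `SCHEDULE` are all assumptions chosen for the example. Only three ideas come from the summary itself: the total loss is a weighted sum of per-level cross-entropy terms, the weights A_k sum to 1, and the weight moves from coarse to fine levels during training.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, given raw logits and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def bcnn_loss(branch_logits, branch_labels, weights):
    """Weighted sum of per-level cross-entropy losses (levels ordered coarse to fine)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("loss weights A_k must sum to 1")
    return sum(
        a * softmax_cross_entropy(z, y)
        for a, z, y in zip(weights, branch_logits, branch_labels)
    )


# Hypothetical schedule (assumption): (first epoch the weights apply, A_k coarse -> fine).
SCHEDULE = [
    (0, (0.98, 0.01, 0.01)),   # start by training mostly the coarse branch
    (10, (0.10, 0.80, 0.10)),  # then emphasize the middle level
    (20, (0.0, 0.0, 1.0)),     # finally train only the fine level
]


def weights_for_epoch(epoch, schedule=SCHEDULE):
    """Return the loss weights in effect at the given epoch."""
    current = schedule[0][1]
    for start, w in schedule:
        if epoch >= start:
            current = w
    return current


# One sample with 2 coarse, 4 intermediate, and 10 fine classes (made-up logits).
logits = [[2.0, 0.5], [0.1, 1.5, 0.3, 0.2], [0.0] * 10]
labels = [0, 1, 2]
for epoch in (0, 10, 20):
    w = weights_for_epoch(epoch)
    print(epoch, w, round(bcnn_loss(logits, labels, w), 4))
```

In a real training loop the same weights would multiply the per-branch losses of whichever deep-learning framework is in use; the schedule is the only part that changes between epochs.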

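Experimental Results

Research Questions

  • RQ1: Can a hierarchical class structure be embedded in a CNN to produce interpretable, coarse-to-fine predictions?
  • RQ2: Does a branch-based loss combined with the BT-strategy improve CNN performance on hierarchical tasks relative to a flat CNN?
  • RQ3: How does B-CNN perform relative to conventional CNN baselines on MNIST, CIFAR-10, and CIFAR-100?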

Key Findings

Model | MNIST | CIFAR-10 | CIFAR-100
Base A | 99.27% | - | -
B-CNN A | 99.40% | - | -
Base B | - | 82.35% | 51.00%
B-CNN B | - | 84.41% | 57.59%
Base C | - | 87.96% | 62.92%
B-CNN C | - | 88.22% | 64.42%

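  • B-CNN consistently outperforms its baseline CNN counterpart on MNIST, CIFAR-10, and CIFAR-100 (Table 3).
  • On MNIST, B-CNN A reaches 99.40% versus 99.27% for Base A.
  • On CIFAR-10, B-CNN B reaches 84.41% versus 82.35% for Base B.
  • On CIFAR-100, B-CNN B reaches 57.59% versus 51.00% for Base B, and B-CNN C reaches 64.42% versus 62.92% for Base C.
  • The BT-strategy speeds up learning once the loss focus shifts to the finer levels, and it can prevent vanishing-gradient effects.
  • When the network is initialized from pretrained parameters rather than randomly, the observed gain from the BT-strategy is smaller.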
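
The per-model gains implied by the figures reported above can be computed directly. The short sketch below uses only those reported percentages; the dictionary layout and variable names are illustrative assumptions.

```python
# Reported figures (%) copied from the results above: (baseline, B-CNN).
results = {
    ("MNIST", "A"): (99.27, 99.40),
    ("CIFAR-10", "B"): (82.35, 84.41),
    ("CIFAR-100", "B"): (51.00, 57.59),
    ("CIFAR-10", "C"): (87.96, 88.22),
    ("CIFAR-100", "C"): (62.92, 64.42),
}

# Absolute gain in percentage points, rounded to two decimals.
gains = {key: round(bcnn - base, 2) for key, (base, bcnn) in results.items()}

# Dataset/model pair with the largest absolute gain.
largest = max(gains, key=gains.get)

for (dataset, model), gain in gains.items():
    print(f"{dataset:<10} model {model}: +{gain:.2f} points")
print("largest gain:", largest)
```

With these numbers the largest absolute gain is on CIFAR-100 with model B (6.59 points), and the smallest is on MNIST with model A (0.13 points).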


This review was generated by AI and reviewed by a human editor.