QUICK REVIEW

[Paper Review] Training Binary Neural Networks with Real-to-Binary Convolutions

Brais Martínez, Jing Yang|arXiv (Cornell University)|Mar 25, 2020

Advanced Neural Network Applications | References: 28 | Citations: 69

One-Sentence Summary

This paper trains binary neural networks to near full-precision accuracy by first building a strong baseline and then narrowing the remaining gap by aligning each binary convolution with its real-valued counterpart, using attention matching and data-driven channel re-scaling.

ABSTRACT

This paper shows how to train binary networks to within a few percent points ($\sim 3-5 \%$) of the full precision counterpart. We first show how to build a strong baseline, which already achieves state-of-the-art accuracy, by combining recently proposed advances and carefully adjusting the optimization procedure. Secondly, we show that by attempting to minimize the discrepancy between the output of the binary and the corresponding real-valued convolution, additional significant accuracy gains can be obtained. We materialize this idea in two complementary ways: (1) with a loss function, during training, by matching the spatial attention maps computed at the output of the binary and real-valued convolutions, and (2) in a data-driven manner, by using the real-valued activations, available during inference prior to the binarization process, for re-scaling the activations right after the binary convolution. Finally, we show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet and reduces the gap to its real-valued counterpart to less than 3% and 5% top-1 accuracy on CIFAR-100 and ImageNet respectively when using a ResNet-18 architecture. Code available at https://github.com/brais-martinez/real2binary.

Motivation and Objectives

Aim to close the performance gap between binary and real-valued networks on standard benchmarks.
Construct a strong baseline by integrating recent binary-network techniques and optimized training strategies.
Introduce real-to-binary attention matching to guide binary optimization.
Introduce data-driven, activation-informed channel re-scaling to enhance binary convolution capacity.

Proposed Method

Build a strong ResNet-18-based binary baseline with optimized block structure and training regime.
Introduce real-to-binary attention matching by aligning normalized attention maps at selected blocks with a real-valued teacher.
Adopt a progressive teacher-student training scheme to bridge architectural gaps between real and binary networks.
Develop a data-driven gating function G that predicts channel-wise scale factors from pre-binarization real activations to re-scale binary conv outputs.
Maintain near-constant binary FLOPs with only ~1% additional FLOPs due to the scaling/gating components.
Provide a cost analysis comparing BNN, XNOR-Net, Bi-Real, and the proposed method.
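The two alignment mechanisms listed above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that attention maps are formed as the channel-wise sum of squared activations followed by L2 normalization (the usual activation-based attention-transfer form), and that the gating function G resembles a squeeze-and-excitation block (global average pooling, two linear layers, sigmoid). The function names, the weight matrices `w1` and `w2`, and all tensor shapes are hypothetical.

```python
import numpy as np


def attention_map(feat, eps=1e-12):
    """Spatial attention map of one feature tensor of shape (C, H, W).

    Assumed form: sum of squared activations over channels,
    flattened and L2-normalized.
    """
    q = np.sum(feat ** 2, axis=0).ravel()
    return q / (np.linalg.norm(q) + eps)


def attention_matching_loss(student_feats, teacher_feats):
    """Sum, over the selected blocks, of the L2 distance between the
    normalized attention maps of the binary student and the teacher."""
    return sum(
        float(np.linalg.norm(attention_map(s) - attention_map(t)))
        for s, t in zip(student_feats, teacher_feats)
    )


def channel_gate(x_real, w1, w2):
    """Hypothetical gating function G.

    x_real: real-valued activations (C_in, H, W), taken before binarization.
    w1: (C_hidden, C_in) and w2: (C_out, C_hidden) are assumed learned weights.
    Returns one scale factor in (0, 1) per output channel.
    """
    pooled = x_real.mean(axis=(1, 2))            # global average pooling
    hidden = np.maximum(w1 @ pooled, 0.0)        # linear + ReLU
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # linear + sigmoid


def rescale_binary_output(y_bin, x_real, w1, w2):
    """Scale each output channel of a binary convolution (C_out, H, W)
    by the gate predicted from the real-valued input activations."""
    gate = channel_gate(x_real, w1, w2)
    return y_bin * gate[:, None, None]
```

Because each attention map is normalized, the loss depends only on where the activations concentrate spatially, not on their overall magnitude, which is what allows a binary student to be compared with a real-valued teacher. The gate adds only a pooling step and two small matrix products per layer, which fits the small FLOP overhead reported in the review.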

Experimental Results

Research Questions

RQ1: Can a strong baseline push binary networks to state-of-the-art accuracy on ImageNet without increasing binary operations?
RQ2: Does aligning binary convolutions with real-valued counterparts via attention transfer improve training signals for binary networks?
RQ3: Can data-driven, activation-informed channel re-scaling substantially bridge the gap to real-valued networks?
RQ4: How does a progressive teacher-student strategy impact binary network optimization?

Key Findings

Method	Bitwidth (W/A)	Top-1 (%)	Top-5 (%)
BWN (Rastegari et al., 2016)	1/32	60.8	83.0
TTQ (Zhu et al., 2017)	2/32	66.6	87.2
HWGQ (Cai et al., 2017)	1/2	59.6	82.2
LQ-Net (Zhang et al., 2018)	1/2	62.6	84.3
SYQ (Faraone et al., 2018)	1/2	55.4	78.6
DOREFA-Net (Zhou et al., 2016)	2/2	62.6	84.4
ABC-Net (Lin et al., 2017)	(1/1) × 5	65.0	85.9
Circulant CNN (Liu et al., 2019)	(1/1) × 4	61.4	82.8
Struct Appr (Zhuang et al., 2019)	(1/1) × 4	64.2	85.6
Struct Appr** (Zhuang et al., 2019)	(1/1) × 4	66.3	86.6
Ensemble (Zhu et al., 2019)	(1/1) × 6	61.0	–
BNN (Courbariaux et al., 2016)	1/1	42.2	69.2
XNOR-Net (Rastegari et al., 2016)	1/1	51.2	73.2
Trained Bin (Xu & Cheung, 2019)	1/1	54.2	77.9
Bi-Real Net (Liu et al., 2018)	1/1	56.4	79.5
CI-Net (Wang et al., 2019)	1/1	56.7	80.1
XNOR-Net++ (Bulat & Tzimiropoulos, 2019)	1/1	57.1	79.9
CI-Net (Wang et al., 2019)	1/1	59.9	84.2
Strong Baseline (ours)	1/1	60.9	83.0
Real-to-Bin (ours)	1/1	65.4	86.2
Real valued	32/32	69.3	89.2
Real valued T-S	32/32	70.7	90.0

The strong baseline surpasses all previously published binary-network results on ImageNet by about 1% top-1 accuracy.
With real-to-binary attention matching, the progressive teacher-student strategy, and data-driven re-scaling combined, Real-to-Bin beats the previous state of the art among 1/1 networks by more than 5% top-1 on ImageNet (65.4% vs. 59.9%).
The Real-to-Bin method reduces the gap to the real-valued counterpart to less than 3% top-1 on CIFAR-100 and less than 5% top-1 on ImageNet for ResNet-18.
Data-driven channel re-scaling bridges more than a third of the remaining gap between binary and real-valued networks.
On ImageNet with ResNet-18, Real-to-Bin achieves 65.4% top-1 and 86.2% top-5, compared to 60.9%/83.0% for the strong baseline and 69.3%/89.2% for full precision.
The method maintains similar computational cost to prior binary nets, with only ~1% increase in FLOPs.
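The accuracy gaps quoted in these findings can be checked directly against the results table. The short script below does only that arithmetic. The accuracy values are copied from the table; the dictionary keys and the helper name `top1_gap` are made up for illustration.

```python
# ImageNet accuracy (%) as (top-1, top-5), copied from the results table.
results = {
    "best_prior_binary": (59.9, 84.2),  # CI-Net, best previous 1/1 entry
    "strong_baseline": (60.9, 83.0),
    "real_to_bin": (65.4, 86.2),
    "real_valued": (69.3, 89.2),
}


def top1_gap(a, b):
    """Top-1 accuracy of entry a minus that of entry b, in percentage points."""
    return round(results[a][0] - results[b][0], 1)


print(top1_gap("strong_baseline", "best_prior_binary"))  # 1.0
print(top1_gap("real_to_bin", "best_prior_binary"))      # 5.5
print(top1_gap("real_to_bin", "strong_baseline"))        # 4.5
print(top1_gap("real_valued", "real_to_bin"))            # 3.9
```

The figures agree with the claims: about 1 point for the strong baseline over prior binary networks, more than 5 points for Real-to-Bin over the previous state of the art, and under 5 points remaining to the real-valued ResNet-18.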

This review was generated by AI and checked by a human editor.