QUICK REVIEW

[Paper Review] Rotated Binary Neural Network

Mingbao Lin, Rongrong Ji|arXiv (Cornell University)|Sep 28, 2020

Advanced Neural Network Applications | References: 46 | Citations: 68

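One-line summary

RBNN reduces the angular bias between full-precision weights and their binarized versions through a bi-rotation scheme that rotates the weights at the start of each epoch, improving binary neural network accuracy on CIFAR-10 and ImageNet.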

ABSTRACT

Binary Neural Network (BNN) shows its predominance in reducing the complexity of deep neural networks. However, it suffers severe performance degradation. One of the major impediments is the large quantization error between the full-precision weight vector and its binary vector. Previous works focus on compensating for the norm gap while leaving the angular bias hardly touched. In this paper, for the first time, we explore the influence of angular bias on the quantization error and then introduce a Rotated Binary Neural Network (RBNN), which considers the angle alignment between the full-precision weight vector and its binarized version. At the beginning of each training epoch, we propose to rotate the full-precision weight vector to its binary vector to reduce the angular bias. To avoid the high complexity of learning a large rotation matrix, we further introduce a bi-rotation formulation that learns two smaller rotation matrices. In the training stage, we devise an adjustable rotated weight vector for binarization to escape the potential local optimum. Our rotation leads to around 50% weight flips which maximize the information gain. Finally, we propose a training-aware approximation of the sign function for the gradient backward. Experiments on CIFAR-10 and ImageNet demonstrate the superiorities of RBNN over many state-of-the-arts. Our source code, experimental settings, training logs and binary models are available at https://github.com/lmbxmu/RBNN.

Motivation and objectives

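Motivates reducing quantization error in Binary Neural Networks (BNNs) by addressing the angular bias between full-precision weights and their binarized counterparts.
Proposes a rotation-based framework that aligns weights with the binary vertices to minimize the angle between them.
Introduces a computationally efficient bi-rotation scheme that realizes a large rotation with two smaller rotation matrices.
Develops a training-aware gradient approximation that enables effective backpropagation through binarization.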
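
To make the notion of angular bias concrete, the following minimal sketch computes the angle between a weight vector and its sign vector. The function name `angular_bias_degrees` and the toy vectors are illustrative assumptions, and zeros are mapped to +1 here for simplicity.

```python
import math


def angular_bias_degrees(w):
    """Angle in degrees between w and its binarized version sign(w)."""
    b = [1.0 if x >= 0 else -1.0 for x in w]       # zeros map to +1 (assumption)
    dot = sum(x * y for x, y in zip(w, b))         # equals the sum of |w_j|
    norm_w = math.sqrt(sum(x * x for x in w))
    norm_b = math.sqrt(len(w))                     # every entry of b is +/-1
    cos = max(-1.0, min(1.0, dot / (norm_w * norm_b)))
    return math.degrees(math.acos(cos))


aligned = [0.5, -0.5, 0.5, -0.5]   # already points at a binary vertex
skewed = [1.0, 0.1, 0.1, 0.1]      # dominated by a single coordinate
rescaled = [2.0 * x for x in skewed]

angle_aligned = angular_bias_degrees(aligned)
angle_skewed = angular_bias_degrees(skewed)
angle_rescaled = angular_bias_degrees(rescaled)
print(angle_aligned, angle_skewed, angle_rescaled)
```

Rescaling `skewed` leaves its angle unchanged, which is why compensating only for the norm gap cannot remove the angular part of the quantization error.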

Proposed method

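At the start of each training epoch, a rotation matrix R^i is applied to the full-precision weight vector w^i so as to minimize the angle between (R^i)^T w^i and sign((R^i)^T w^i).
To reduce complexity, a bi-rotation formulation R^i = R_1^i ⊗ R_2^i is used, with R_1^i ∈ ℝ^{n_1^i × n_1^i}, R_2^i ∈ ℝ^{n_2^i × n_2^i}, and n^i = n_1^i · n_2^i.
Under orthogonality constraints, B_{W′}^i, R_1^i, and R_2^i are optimized alternately to maximize tr(B_{W′}^i (R_2^i)^T (W^i)^T R_1^i), where W^i is w^i reshaped to n_1^i × n_2^i. This yields B_{W′}^i = sign((R_1^i)^T W^i R_2^i), while R_1^i and R_2^i are updated by an SVD-based polar decomposition.
An adjustable rotated weight vector is introduced: w̃^i = w^i + ((R^i)^T w^i − w^i) · α^i with α^i ∈ [0, 1]. It steers binarization dynamically and helps avoid local optima.
A training-aware gradient approximation F(x) is provided for backpropagation through binarization. Its derivative F′(x) is adjusted with training progress e/E (current epoch over total epochs).
Gradients with respect to w^i, α^i, and auxiliary quantities are computed to enable end-to-end training of RBNN.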
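
The alternating optimization and the adjustable rotated weight described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sign_pm1`, `polar_argmax`, `bi_rotation`, `cosine_to_sign`, `adjustable_weight`), the iteration count, and the 8 × 16 weight shape are all hypothetical. The SVD updates return orthogonal matrices, which may include reflections.

```python
import numpy as np


def sign_pm1(x):
    """Binarize to {-1, +1}; zeros map to +1 (an assumption of this sketch)."""
    return np.where(x >= 0, 1.0, -1.0)


def polar_argmax(G):
    """Orthogonal R maximizing tr(G @ R): with G = U S V^T, R = V U^T."""
    U, _, Vt = np.linalg.svd(G)
    return Vt.T @ U.T


def bi_rotation(W, num_iters=3):
    """Alternately update B, R1, R2 to increase tr(B R2^T W^T R1).

    W is one layer's weight vector reshaped to (n1, n2).
    """
    n1, n2 = W.shape
    R1, R2 = np.eye(n1), np.eye(n2)
    for _ in range(num_iters):
        B = sign_pm1(R1.T @ W @ R2)            # binary step
        R1 = polar_argmax(B @ R2.T @ W.T)      # maximizes tr(G1 R1)
        R2 = polar_argmax(W.T @ R1 @ B).T      # maximizes tr(G2 R2^T)
    return R1, R2


def cosine_to_sign(W):
    """Cosine of the angle between vec(W) and sign(vec(W))."""
    w = W.reshape(-1)
    return float(np.abs(w).sum() / (np.linalg.norm(w) * np.sqrt(w.size)))


def adjustable_weight(W, R1, R2, alpha):
    """w_tilde = w + (R^T w - w) * alpha, written in matrix form."""
    return W + (R1.T @ W @ R2 - W) * alpha


rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))               # toy layer with n1 = 8, n2 = 16
R1, R2 = bi_rotation(W)
W_rot = R1.T @ W @ R2
print(cosine_to_sign(W), cosine_to_sign(W_rot))
```

With row-major flattening, `np.kron(R1, R2).T @ W.reshape(-1)` gives the same vector as `W_rot.reshape(-1)`, so the two small matrices stand in for one n × n rotation without ever forming it. Setting `alpha` to 0 returns the original weights and setting it to 1 returns the fully rotated ones.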

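Experimental results

Research questions

RQ1: Can rotating the weights at each training epoch to align them with the binary vertices reduce the angular bias between full-precision weights and their binarized versions?
RQ2: Can the bi-rotation approach (two smaller rotation matrices) efficiently and effectively approximate a large rotation for weight alignment in BNNs?
RQ3: Do training-aware binarization and adjustable rotated weights improve accuracy over existing BNN methods on CIFAR-10 and ImageNet?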

Key findings

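On CIFAR-10, RBNN consistently outperforms several state-of-the-art BNNs with ResNet-18, ResNet-20, and VGG-small under comparable bit settings.
On ImageNet, RBNN improves on IR-Net in both Top-1 and Top-5 accuracy with ResNet-18 and ResNet-34.
The bi-rotation scheme approximates a large rotation efficiently, with reduced storage and computation.
The training-aware gradient approximation improves backpropagation through binarization compared with STE, PPF, and EDE.
Weight rotation raises the weight-flip rate in each layer to around 50%, which maximizes information gain during training.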
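
This review does not state the form of F(x), so the sketch below is only one possible reading of a training-aware approximation whose derivative depends on e/E. It assumes a piecewise-quadratic F(x) that saturates at ±k, with a sharpness t that grows during training. The schedule bounds `T_MIN` and `T_MAX` and all function names are assumptions made for illustration.

```python
import math

T_MIN, T_MAX = -2.0, 1.0   # assumed schedule bounds, not given in this review


def schedule(epoch, total_epochs):
    """Return (t, k) for the current epoch; both depend only on e / E."""
    t = 10.0 ** (T_MIN + (epoch / total_epochs) * (T_MAX - T_MIN))
    k = max(1.0 / t, 1.0)
    return t, k


def approx_sign(x, epoch, total_epochs):
    """Assumed F(x): quadratic near zero, saturating at +/- k."""
    t, k = schedule(epoch, total_epochs)
    s = 1.0 if x >= 0 else -1.0
    if abs(x) < math.sqrt(2.0) / t:
        return k * (-s * (t * x) ** 2 / 2.0 + math.sqrt(2.0) * t * x)
    return k * s


def approx_sign_grad(x, epoch, total_epochs):
    """Derivative of the assumed F: k * max(0, sqrt(2) * t - t^2 * |x|)."""
    t, k = schedule(epoch, total_epochs)
    return k * max(0.0, math.sqrt(2.0) * t - t * t * abs(x))


early_slope = approx_sign_grad(0.0, 0, 100)     # start of training
late_slope = approx_sign_grad(0.0, 100, 100)    # end of training
print(early_slope, late_slope)
```

Under these assumptions the gradient starts as a wide, shallow window, so most weights receive updates, and narrows toward a sharp sign-like step by the final epoch. A fixed straight-through estimator keeps the same window throughout training.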

This review was generated by AI and checked by a human editor.