QUICK REVIEW

[Paper Review] Bayesian Compression for Deep Learning

Christos Louizos, Karen Ullrich | arXiv (Cornell University) | 2017-05-24

Gaussian Processes and Bayesian Inference | References: 61 | Citations: 349

ONE-LINE SUMMARY

The paper proposes a Bayesian framework for compressing neural networks by pruning neurons with group sparsity priors and by using posterior uncertainty to determine per-layer weight precision, achieving state-of-the-art compression while maintaining accuracy.

ABSTRACT

Compression and computational efficiency in deep learning have become a problem of great significance. In this work, we argue that the most principled and effective way to attack this problem is by adopting a Bayesian point of view, where through sparsity inducing priors we prune large parts of the network. We introduce two novelties in this paper: 1) we use hierarchical priors to prune nodes instead of individual weights, and 2) we use the posterior uncertainties to determine the optimal fixed point precision to encode the weights. Both factors significantly contribute to achieving the state of the art in terms of compression rates, while still staying competitive with methods designed to optimize for speed or energy efficiency.

RESEARCH MOTIVATION AND OBJECTIVES

Motivate compression and efficiency in deep learning from a Bayesian perspective.
Develop a variational inference framework with sparsity-inducing priors to prune neuron groups.
Derive a method to estimate optimal fixed-point bit precision per layer from posterior uncertainties.
Demonstrate that group sparsity and adaptive precision lead to competitive compression and speedups.
Show that Bayesian methods can achieve high compression without sacrificing prediction accuracy.

PROPOSED METHOD

Adopt a variational Bayes framework with sparsity-inducing priors for groups of weights feeding a neuron to prune entire neurons.
Use scale-mixtures of normals (including log-uniform and half-Cauchy/Horseshoe priors) to induce sparsity and enable group pruning.
Employ a non-centered reparametrization to derive an efficient ELBO with tractable KL terms and enable group sparsity via dropout-like mechanisms.
Leverage the bits-back argument and posterior uncertainty to determine per-layer fixed-point weight precision at test time.
Apply local reparametrizations to reduce gradient variance and enable efficient training of neural networks.
Compute test-time weight estimates via masked posterior means and variances to quantify bit-precision needs.
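
The pruning and precision steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold value, and the step-size rule (quantization step set to the mean posterior standard deviation of the kept weights) are assumptions chosen for illustration. The sketch assumes each output neuron has a multiplicative scale with a Gaussian posterior (mean `mu_z`, standard deviation `sigma_z`) and that each weight has a posterior mean and standard deviation.

```python
import numpy as np


def neuron_keep_mask(mu_z, sigma_z, log_alpha_threshold=0.0):
    """Return a boolean mask of neurons to keep.

    log_alpha = log(sigma_z^2 / mu_z^2) acts like a log dropout rate:
    a large value means the neuron's scale is dominated by noise, so the
    whole group of weights feeding that neuron is pruned.
    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    log_alpha = 2.0 * np.log(sigma_z) - 2.0 * np.log(np.abs(mu_z) + 1e-8)
    return log_alpha < log_alpha_threshold


def layer_bits(weight_mean, weight_std, keep):
    """Heuristic fixed-point bit width for one layer (illustrative only).

    Assumption: rounding error is acceptable as long as the quantization
    step is no larger than the mean posterior standard deviation of the
    kept weights. Rows of weight_mean correspond to output neurons.
    """
    if not keep.any():
        return 0  # every neuron was pruned, nothing to store
    kept_mean = weight_mean[keep]
    kept_std = weight_std[keep]
    value_range = np.max(np.abs(kept_mean))   # weights lie in [-R, R]
    step = np.mean(kept_std)                  # assumed quantization step
    levels = 2.0 * value_range / step + 1.0   # grid points covering [-R, R]
    return max(1, int(np.ceil(np.log2(levels))))


# Toy layer with 3 output neurons and 2 inputs (hypothetical values).
mu_z = np.array([1.0, 0.01, 0.8])
sigma_z = np.array([0.1, 0.5, 0.2])
weight_mean = np.array([[0.5, -1.0], [0.001, 0.002], [0.25, 0.75]])
weight_std = np.full((3, 2), 0.0625)

mask = neuron_keep_mask(mu_z, sigma_z)
bits = layer_bits(weight_mean, weight_std, mask)
print(mask, bits)
```

In this toy example the second neuron has a near-zero scale mean and a large scale variance, so it is pruned; the remaining weights span [-1, 1] with a step of 0.0625, which needs 33 grid points and therefore 6 bits. Wider posteriors allow a coarser grid and fewer bits, which is the intuition behind using posterior uncertainty to set precision.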

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can group-sparsity priors effectively prune entire neurons in modern architectures?
RQ2: How can Bayesian posterior uncertainty inform the per-layer bit precision used to encode weights?
RQ3: Do sparsity-inducing priors enable competitive compression rates without sacrificing accuracy?
RQ4: What practical training and inference strategies are needed to implement Bayesian compression on common networks?
RQ5: How do different priors (log-uniform vs. horseshoe) compare in promoting sparsity and compression?

KEY RESULTS

The proposed Bayesian compression methods induce substantial group sparsity, reducing network size beyond several baselines.
Per-layer bit-precision determined from posterior uncertainties achieves significant memory savings with little to no accuracy loss.
Group horseshoe and group normal-Jeffreys (log-uniform) priors yield strong compression and competitive or superior performance compared to existing pruning and quantization methods.
For networks like LeNet variants and VGG, substantial parameter pruning is achieved with meaningful reductions in bit-precision per layer.
The approach provides speedups and energy efficiency on CPU/GPU, with notable gains for larger networks (e.g., VGG).

This review was generated by AI and checked by a human editor.