QUICK REVIEW

[Paper Review] ProxQuant: Quantized Neural Networks via Proximal Operators

Yu Bai, Yu-Xiang Wang | arXiv (Cornell University) | 2018. 09. 27.

Advanced Neural Network Applications | Citations: 49

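One-Line Summary

ProxQuant proposes a principled alternative to the straight-through gradient method for training quantized neural networks: it formulates training as a regularized optimization problem and solves it with prox-gradient descent. It outperforms state-of-the-art methods on binary quantization, is on par with them on multi-bit quantization, and is more stable than BinaryConnect.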

ABSTRACT

To make deep neural networks feasible in resource-constrained environments (such as mobile devices), it is beneficial to quantize models by using low-precision weights. One common technique for quantizing neural networks is the straight-through gradient method, which enables back-propagation through the quantization mapping. Despite its empirical success, little is understood about why the straight-through gradient method works. Building upon a novel observation that the straight-through gradient method is in fact identical to the well-known Nesterov's dual-averaging algorithm on a quantization constrained optimization problem, we propose a more principled alternative approach, called ProxQuant, that formulates quantized network training as a regularized learning problem instead and optimizes it via the prox-gradient method. ProxQuant does back-propagation on the underlying full-precision vector and applies an efficient prox-operator in between stochastic gradient steps to encourage quantizedness. For quantizing ResNets and LSTMs, ProxQuant outperforms state-of-the-art results on binary quantization and is on par with state-of-the-art on multi-bit quantization. For binary quantization, our analysis shows both theoretically and experimentally that ProxQuant is more stable than the straight-through gradient method (i.e. BinaryConnect), challenging the indispensability of the straight-through gradient method and providing a powerful alternative.

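Motivation and Objectives

To address the lack of theoretical understanding of why the straight-through gradient method succeeds empirically in training quantized neural networks.
To develop a more principled alternative to the straight-through gradient method that yields stable and effective quantization.
To improve binary and multi-bit quantization of deep neural networks, particularly ResNets and LSTMs.
To establish a formal connection between the straight-through gradient method and Nesterov's dual-averaging algorithm applied to a quantization-constrained optimization problem.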

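Proposed Method

ProxQuant formulates quantized network training as a regularized learning problem, in place of a hard quantization-constrained one.
Back-propagation is performed on the underlying full-precision weight vector to compute gradients.
Between stochastic gradient steps, an efficient prox operator is applied to encourage the weights to become quantized.
The prox operator pulls the weights toward the set of valid quantized values (exact projection onto that set is the limiting case of an infinitely strong penalty), which promotes convergence to low-precision solutions.
The method is built on the prox-gradient optimization framework, which gives it a principled basis for stability and convergence analysis.
The method applies to both binary and multi-bit quantization and is evaluated on ResNets and LSTMs.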
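
The steps above can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation. It assumes the binary case with quantized values {-1, +1}, a regularizer of the form lam * sum_i |theta_i - sign(theta_i)| whose prox is a soft-threshold toward the nearest binary value, and a regularization strength that grows linearly over iterations. The names prox_binary, prox_grad_step, run_toy and the parameter lam_rate are hypothetical, and the quadratic toy loss stands in for a real network loss.

```python
def sign(x):
    """Sign with sign(0) = +1 so every weight has a well-defined binary target."""
    return 1.0 if x >= 0 else -1.0


def prox_binary(theta, lam):
    """Soft-threshold each weight toward its nearest value in {-1, +1}.

    Assumed regularizer: lam * sum_i |theta_i - sign(theta_i)|.
    Weights within lam of their target snap onto it; others move lam closer.
    """
    out = []
    for w in theta:
        target = sign(w)
        gap = w - target
        shrunk = max(abs(gap) - lam, 0.0)
        out.append(target + (1.0 if gap >= 0 else -1.0) * shrunk)
    return out


def prox_grad_step(theta, grad, lr, lam):
    """One prox-gradient step: plain gradient step on the full-precision
    weights, followed by the prox operator."""
    stepped = [w - lr * g for w, g in zip(theta, grad)]
    return prox_binary(stepped, lam)


def run_toy(steps=50, lr=0.1, lam_rate=0.01):
    """Toy loss 0.5 * ||theta - center||^2 with an increasing lam schedule
    (lam = lam_rate * t is an assumption made for this sketch)."""
    center = [0.7, -0.4, 0.1]
    theta = [0.0, 0.0, 0.0]
    for t in range(steps):
        grad = [w - c for w, c in zip(theta, center)]
        theta = prox_grad_step(theta, grad, lr, lam=lam_rate * t)
    return theta


if __name__ == "__main__":
    print(prox_binary([0.2, -1.5, 0.9], 0.3))
    print(run_toy())
```

With lam = 0 the step reduces to ordinary gradient descent, and with a very large lam the prox becomes a hard sign projection, so the schedule interpolates between full-precision training and fully quantized weights.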

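Experimental Results

Research Questions

RQ1: Why does the straight-through gradient method succeed despite lacking a theoretical justification?
RQ2: Can a more principled optimization framework be developed for training quantized neural networks?
RQ3: How does ProxQuant compare with the straight-through gradient method on binary and multi-bit quantization?
RQ4: Is ProxQuant more stable than BinaryConnect in practice?
RQ5: What is the theoretical relationship between the straight-through gradient method and existing optimization algorithms?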

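Key Findings

ProxQuant outperforms state-of-the-art methods on binary quantization of ResNets and LSTMs.
On multi-bit quantization, it is on par with state-of-the-art methods.
Both theoretically and experimentally, ProxQuant is more stable than the straight-through gradient method (i.e., BinaryConnect) for binary quantization.
The paper shows that the straight-through gradient method is identical to Nesterov's dual-averaging algorithm on a quantization-constrained optimization problem.
ProxQuant's prox operator is efficient to apply and effectively steers the weights toward quantized values during training.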
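
The equivalence between the straight-through method and dual averaging can be illustrated with a small sketch. It reflects one reading of the abstract's claim and is not code from the paper. It assumes binary weights in {-1, +1}, a constant step size, and a toy gradient function. The function names binary_connect, lazy_projection and toy_grad are hypothetical. The straight-through update evaluates the gradient at the quantized weights but accumulates it in latent full-precision weights. The dual-averaging ("lazy projection") form quantizes the initial point minus the step size times the running gradient sum.

```python
def sign(x):
    """Sign with sign(0) = +1, used here as the binary quantizer."""
    return 1.0 if x >= 0 else -1.0


def binary_connect(grad_fn, theta0, lr, steps):
    """Straight-through update: the gradient is evaluated at the quantized
    weights and applied to the latent full-precision weights."""
    latent = list(theta0)
    for _ in range(steps):
        quantized = [sign(w) for w in latent]
        g = grad_fn(quantized)
        latent = [w - lr * gi for w, gi in zip(latent, g)]
    return [sign(w) for w in latent]


def lazy_projection(grad_fn, theta0, lr, steps):
    """Dual-averaging form: quantize theta0 minus lr times the sum of all
    gradients seen so far, without keeping latent weights explicitly."""
    grad_sum = [0.0] * len(theta0)
    quantized = [sign(w) for w in theta0]
    for _ in range(steps):
        g = grad_fn(quantized)
        grad_sum = [s + gi for s, gi in zip(grad_sum, g)]
        quantized = [sign(w0 - lr * s) for w0, s in zip(theta0, grad_sum)]
    return quantized


def toy_grad(q):
    """Gradient of the toy loss 0.5 * ||q - center||^2 (illustrative only)."""
    center = [0.5, -1.5, 0.25]
    return [qi - ci for qi, ci in zip(q, center)]


if __name__ == "__main__":
    theta0 = [0.5, -0.25, 0.125]
    for steps in (1, 5, 20):
        a = binary_connect(toy_grad, theta0, lr=0.25, steps=steps)
        b = lazy_projection(toy_grad, theta0, lr=0.25, steps=steps)
        print(steps, a, b, a == b)
```

The toy values are chosen to be exactly representable in binary floating point, so the two forms agree step for step here. With arbitrary values, rounding differences near zero could in principle flip a sign.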

This review was generated by AI and checked by a human editor.