QUICK REVIEW

[Paper Review] A Comprehensive guide to Bayesian Convolutional Neural Network with Variational Inference

Kumar Shridhar, Felix Laumann|arXiv (Cornell University)|2019. 01. 08.

Generative Adversarial Networks and Image Synthesis|References: 54|Citations: 161

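One-line summary

This paper introduces a Bayesian CNN that uses Variational Inference based on Bayes by Backprop, quantifies uncertainty with two convolutional operations (one for the mean, one for the variance), and applies the model to image classification, super-resolution, and GANs. It also discusses pruning and efficiency improvements.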

ABSTRACT

Artificial Neural Networks are connectionist systems that perform a given task by learning on examples without having prior knowledge about the task. This is done by finding an optimal point estimate for the weights in every node. Generally, networks using point estimates as weights perform well with large datasets, but they fail to express uncertainty in regions with little or no data, leading to overconfident decisions. In this paper, a Bayesian Convolutional Neural Network (BayesCNN) using Variational Inference is proposed, which introduces a probability distribution over the weights. Furthermore, the proposed BayesCNN architecture is applied to tasks like Image Classification, Image Super-Resolution and Generative Adversarial Networks. The results are compared to point-estimate-based architectures on the MNIST, CIFAR-10 and CIFAR-100 datasets for the Image Classification task, on the BSD300 dataset for the Image Super-Resolution task and on the CIFAR-10 dataset again for the Generative Adversarial Network task. BayesCNN is based on Bayes by Backprop, which derives a variational approximation to the true posterior. We, therefore, introduce the idea of applying two convolutional operations, one for the mean and one for the variance. Our proposed method not only achieves performance equivalent to frequentist inference in identical architectures but also incorporates a measurement for uncertainties and regularisation. It further eliminates the use of dropout in the model. Moreover, we predict how certain the model prediction is based on the epistemic and aleatoric uncertainties and empirically show how the uncertainty can decrease, allowing the decisions made by the network to become more deterministic as the training accuracy increases. Finally, we propose ways to prune the Bayesian architecture and to make it more computationally and time efficient.

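Research motivation and goals

Introduce Bayesian learning into CNNs to express predictive uncertainty and regularize training.
Propose an efficient variational inference approach for CNN weights, based on Bayes by Backprop.
Show how to perform two convolutional operations (mean and variance) and apply the local reparameterization trick to CNNs.
Demonstrate uncertainty estimation (epistemic and aleatoric) and show how uncertainty decreases as training progresses.
Explore pruning strategies (L1 regularization) that reduce the number of parameters while maintaining accuracy.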

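Proposed method

Adopts Bayes by Backprop to approximate the true posterior over the CNN weights with a variational distribution q(w).
Represents weight uncertainty with a Gaussian variational posterior and learns its mean and variance through two sequential convolutions (one for the mean, one for the variance).
Applies the local reparameterization trick to CNNs, sampling activations instead of weights to improve computational efficiency.
Derives and optimizes the variational free energy (the KL divergence between q(w) and the prior, minus the expected log-likelihood).
Uses L1 regularization to prune unnecessary weights, then fine-tunes to recover the performance of the pruned model.
Extends the Bayesian CNN beyond classification to tasks including image super-resolution and Generative Adversarial Networks.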
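
The two-convolution idea and the local reparameterization trick listed above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes a single-channel input, a single kernel, a softplus parameterization of the standard deviation (sigma = softplus(rho)), and a zero-mean Gaussian prior so that the KL term has a closed form. The names `conv2d_valid`, `bayes_conv2d`, and `kl_gaussian` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def conv2d_valid(x, w):
    """'Valid' 2D cross-correlation of one image x (H, W) with one kernel w (kH, kW)."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def softplus(rho):
    """Maps an unconstrained parameter to a positive standard deviation (assumed parameterization)."""
    return np.log1p(np.exp(rho))

def bayes_conv2d(x, mu, rho, rng):
    """One stochastic forward pass that samples activations rather than weights."""
    sigma = softplus(rho)
    act_mean = conv2d_valid(x, mu)              # first convolution: activation mean
    act_var = conv2d_valid(x ** 2, sigma ** 2)  # second convolution: activation variance
    eps = rng.standard_normal(act_mean.shape)   # one noise draw per activation
    return act_mean + np.sqrt(act_var) * eps, act_mean, act_var

def kl_gaussian(mu, sigma, prior_sigma=1.0):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ), summed over all weights (assumed prior)."""
    return np.sum(np.log(prior_sigma / sigma)
                  + (sigma ** 2 + mu ** 2) / (2 * prior_sigma ** 2) - 0.5)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))           # toy 6x6 input
mu = 0.1 * rng.standard_normal((3, 3))    # variational means of a 3x3 kernel
rho = np.full((3, 3), -3.0)               # small initial standard deviations
sample, act_mean, act_var = bayes_conv2d(x, mu, rho, rng)
complexity_cost = kl_gaussian(mu, softplus(rho))
```

In a training loop, `complexity_cost` would be added to the negative log-likelihood of the data computed from `sample`, giving a single-sample estimate of the variational free energy. Because the noise is drawn per activation rather than per weight, the cost of sampling scales with the size of the output feature map instead of the number of weights times the batch size.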

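Experimental results

Research questions

RQ1: Can Bayes by Backprop be applied efficiently to CNNs to produce calibrated uncertainty estimates?
RQ2: How does the two-convolution approach (mean and variance) differ from point-estimate CNNs in performance and regularization?
RQ3: How do Bayesian CNNs affect epistemic and aleatoric uncertainty in image-related tasks?
RQ4: Can uncertainty-aware CNNs be pruned effectively without sacrificing accuracy, and how far does this extend to super-resolution (SR) and GAN tasks?
RQ5: Do Bayesian CNNs achieve results competitive with frequentist architectures on standard datasets such as MNIST and CIFAR?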

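Key findings

Bayesian CNNs with variational inference can match the performance of point-estimate networks with comparable architectures.
Uncertainty decomposes into epistemic and aleatoric components; as training accuracy improves, uncertainty decreases and the network's decisions become more deterministic.
The two-convolution scheme learns both the mean and the variance of the weights without doubling the total number of parameters.
The local reparameterization trick samples activations instead of weights, which speeds up training of convolutional layers.
Pruning with L1 regularization reduces the parameter count with minimal or no loss in predictive performance, improving model efficiency.
The Bayesian framework is demonstrated against non-Bayesian baselines across image classification, image super-resolution, and GAN tasks.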
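
The split into epistemic and aleatoric uncertainty mentioned in the findings can be illustrated with a short sketch. It uses one common decomposition of the predictive covariance of a softmax classifier, computed from T stochastic forward passes; whether this matches the paper's exact formulation is an assumption, and `decompose_uncertainty` is a hypothetical helper name.

```python
import numpy as np

def decompose_uncertainty(probs):
    """Split predictive covariance into aleatoric and epistemic parts.

    probs: array of shape (T, K) holding softmax outputs from T stochastic
    forward passes (each with freshly sampled activations) for one input.
    """
    probs = np.asarray(probs, dtype=float)
    p_bar = probs.mean(axis=0)  # averaged predictive distribution
    # Aleatoric part: average over passes of diag(p_t) - p_t p_t^T
    aleatoric = np.mean([np.diag(p) - np.outer(p, p) for p in probs], axis=0)
    # Epistemic part: average over passes of (p_t - p_bar)(p_t - p_bar)^T
    centered = probs - p_bar
    epistemic = centered.T @ centered / probs.shape[0]
    return p_bar, aleatoric, epistemic

# Toy example: three forward passes over a 3-class problem (made-up numbers)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.8, 0.1, 0.1]])
p_bar, aleatoric, epistemic = decompose_uncertainty(probs)
```

The diagonal of each matrix gives a per-class variance. When every forward pass returns the same probabilities, the epistemic term is exactly zero, which is consistent with the finding that decisions become more deterministic as the weight posterior tightens during training.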


This review was generated by AI and reviewed by a human editor.