QUICK REVIEW

[Paper Review] Predictive Uncertainty Estimation via Prior Networks

Andrey Malinin, Mark Gales|arXiv (Cornell University)|2018. 02. 28.

Adversarial Robustness in Machine Learning | References: 28 | Citations: 357

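One-Line Summary

This paper introduces Prior Networks (PNs), which model distributional uncertainty explicitly and separately from data and model uncertainty, and applies Dirichlet Prior Networks (DPNs) to MNIST and CIFAR-10, where they improve out-of-distribution (OOD) detection and misclassification detection.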

ABSTRACT

Estimating how uncertain an AI system is in its predictions is important to improve the safety of such systems. Uncertainty in predictions can result from uncertainty in model parameters, irreducible data uncertainty and uncertainty due to distributional mismatch between the test and training data distributions. Different actions might be taken depending on the source of the uncertainty so it is important to be able to distinguish between them. Recently, baseline tasks and metrics have been defined and several practical methods to estimate uncertainty developed. These methods, however, attempt to model uncertainty due to distributional mismatch either implicitly through model uncertainty or as data uncertainty. This work proposes a new framework for modeling predictive uncertainty called Prior Networks (PNs) which explicitly models distributional uncertainty. PNs do this by parameterizing a prior distribution over predictive distributions. This work focuses on uncertainty for classification and evaluates PNs on the tasks of identifying out-of-distribution (OOD) samples and detecting misclassification on the MNIST dataset, where they are found to outperform previous methods. Experiments on synthetic and MNIST and CIFAR-10 data show that unlike previous non-Bayesian methods PNs are able to distinguish between data and distributional uncertainty.

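Motivation and Objectives

Argues that predictive uncertainty should be separated into its three sources: model (epistemic), data (aleatoric), and distributional (dataset shift).
Proposes Prior Networks, which parameterize a distribution over predictive distributions in order to isolate distributional uncertainty.
Develops and evaluates Dirichlet Prior Networks (DPNs), focusing on OOD detection and misclassification detection.
Proposes uncertainty measures derived from the PN framework and compares them against Bayesian and multi-model baselines.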
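The three sources listed above can be read off a single factorization of the predictive posterior. The block below is a sketch in notation assumed for this review (ω_c for class c, x* for the test input, μ for the categorical parameters, θ for the model parameters, D for the training data); the paper should be consulted for its exact statement.

```latex
P(\omega_c \mid x^{*}, \mathcal{D})
  = \int\!\!\int
    \underbrace{P(\omega_c \mid \mu)}_{\text{data}}\;
    \underbrace{p(\mu \mid x^{*}, \theta)}_{\text{distributional}}\;
    \underbrace{p(\theta \mid \mathcal{D})}_{\text{model}}\;
    d\mu \, d\theta
```

Here θ is integrated over, so it appears as a conditioning variable. With a single point estimate of θ the model term drops out, and the middle factor is the quantity a Prior Network parameterizes.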

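Proposed Method

Introduces Prior Networks (PNs), which explicitly model a distribution over predictive distributions, p(mu|x; theta).
Parameterizes p(mu|x; theta) with a Dirichlet distribution, setting alpha = f(x; theta), so that the network can place a sharp distribution at a corner of the simplex for confident in-domain predictions and a flat prior for OOD inputs.
Trains DPNs with a multi-task objective that minimizes the KL divergence to a sharp Dirichlet target on in-domain data and to a flat Dirichlet target on out-of-domain data (eq. 12).
Regularizes and smooths the in-distribution targets to avoid delta-function targets (eq. 15), using teacher-student smoothing where needed.
Discusses the different marginalizations of the PN hierarchy (data, distributional, and model uncertainty) and derives uncertainty measures (entropy, mutual information) from them.
Evaluates PNs/DPNs on synthetic data, MNIST, and CIFAR-10, comparing them against standard DNNs and MC-Dropout ensembles.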
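The training objective described above can be sketched per example as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice alpha = exp(logits), the KL direction (target first), and the hyperparameter values `precision=100.0` and `eps=0.01` are all assumptions made for this sketch. It uses only the Python standard library; in practice `scipy.special.digamma` and `scipy.special.gammaln` would be the usual choice.

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += math.log(x) - 0.5 * inv - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def dirichlet_kl(alpha, beta):
    """Closed-form KL( Dir(alpha) || Dir(beta) )."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - sum(math.lgamma(a) for a in alpha)
    kl -= math.lgamma(b0) - sum(math.lgamma(b) for b in beta)
    kl += sum((a - b) * (digamma(a) - digamma(a0)) for a, b in zip(alpha, beta))
    return kl

def sharp_target(label, num_classes, precision=100.0, eps=0.01):
    """Smoothed in-domain target: most mass on the labelled class (hypothetical values)."""
    mean = [eps] * num_classes
    mean[label] = 1.0 - eps * (num_classes - 1)
    return [precision * m for m in mean]

def flat_target(num_classes):
    """Out-of-domain target: uniform Dirichlet over the simplex."""
    return [1.0] * num_classes

def dpn_loss(logits, label, in_domain):
    """KL from the target Dirichlet to the Dirichlet predicted by the network."""
    num_classes = len(logits)
    alpha = [math.exp(z) for z in logits]  # assumption: alpha = exp(logits)
    if in_domain:
        target = sharp_target(label, num_classes)
    else:
        target = flat_target(num_classes)  # label is ignored for OOD examples
    return dirichlet_kl(target, alpha)
```

Averaging `dpn_loss` over an in-domain batch and an out-of-domain batch and summing the two gives the multi-task form described above. Smoothing the in-domain target keeps every concentration parameter positive, which is what avoids the delta-function target.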

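Experimental Results

Research Questions

RQ1: Can Prior Networks model data uncertainty, distributional uncertainty, and model uncertainty independently of one another in classification tasks?
RQ2: Do Dirichlet Prior Networks improve OOD detection and misclassification detection over baselines such as DNNs and MC-Dropout ensembles?
RQ3: Which uncertainty measures in the PN framework (entropy, mutual information, differential entropy) best reflect the different sources of uncertainty?
RQ4: How do PN-based methods perform on MNIST and CIFAR-10 under noise and augmentation scenarios and across a range of real OOD datasets?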

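Key Findings

On MNIST and CIFAR-10, Dirichlet Prior Networks estimate distributional uncertainty more accurately for OOD detection than MC-Dropout and standard DNNs.
PNs outperform the baselines at misclassification detection across MNIST and CIFAR-10.
The differential entropy of the Dirichlet prior is particularly effective for OOD detection when classes are poorly separated or noisy.
Uncertainty measures derived from the PN framework can be computed analytically at test time, which makes them cheaper to compute than ensembles.
On synthetic data, as class overlap increases, PNs separate in-distribution from out-of-distribution regions better than the standard entropy measure does.
Entropy and maximum posterior probability remain strong simple measures, while differential entropy offers an advantage in certain OOD scenarios, particularly when classes are less clearly separated.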
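The measures named above all have closed forms once the network has produced the Dirichlet concentration parameters. The sketch below computes them from a list `alpha`, using the standard identities for the Dirichlet distribution; the function and key names are hypothetical and chosen for this illustration, and only the Python standard library is used.

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += math.log(x) - 0.5 * inv - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def dirichlet_uncertainties(alpha):
    """Closed-form uncertainty measures for Dir(alpha)."""
    a0 = sum(alpha)
    probs = [a / a0 for a in alpha]  # expected categorical distribution

    # Entropy of the expected categorical (total uncertainty).
    entropy = -sum(p * math.log(p) for p in probs)

    # Expected entropy of categoricals drawn from the Dirichlet (data uncertainty).
    expected_entropy = -sum(
        p * (digamma(a + 1.0) - digamma(a0 + 1.0)) for p, a in zip(probs, alpha)
    )

    # Differential entropy of the Dirichlet itself.
    diff_entropy = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    diff_entropy -= sum((a - 1.0) * (digamma(a) - digamma(a0)) for a in alpha)

    return {
        "max_prob": max(probs),
        "entropy": entropy,
        "mutual_information": entropy - expected_entropy,
        "differential_entropy": diff_entropy,
    }

if __name__ == "__main__":
    cases = {
        "confident": [98.0, 1.0, 1.0],
        "class overlap": [100.0 / 3] * 3,
        "flat (OOD-like)": [1.0, 1.0, 1.0],
    }
    for name, alpha in cases.items():
        print(name, dirichlet_uncertainties(alpha))
```

The class-overlap and flat cases share the same expected categorical, so max probability and entropy cannot tell them apart, whereas mutual information and differential entropy are both higher for the flat case. This is the sense in which the Dirichlet-level measures separate data uncertainty from distributional uncertainty.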


This review was generated by AI and reviewed by a human editor.