QUICK REVIEW

[Paper Review] Improving Reconstruction Autoencoder Out-of-distribution Detection with Mahalanobis Distance

Taylor Denouden, Rick Salay | arXiv (Cornell University) | December 6, 2018

Anomaly Detection Techniques and Applications | References: 1 | Citations: 76

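One-line summary

This paper shows that reconstruction autoencoders can miss certain OOD samples, and it proposes a hybrid novelty score that combines reconstruction error with the Mahalanobis distance in latent space to improve OOD detection.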

ABSTRACT

There is an increasingly apparent need for validating the classifications made by deep learning systems in safety-critical applications like autonomous vehicle systems. A number of recent papers have proposed methods for detecting anomalous image data that appear different from known inlier data samples, including reconstruction-based autoencoders. Autoencoders optimize the compression of input data to a latent space of a dimensionality smaller than the original input and attempt to accurately reconstruct the input using that compressed representation. Since the latent vector is optimized to capture the salient features from the inlier class only, it is commonly assumed that images of objects from outside of the training class cannot effectively be compressed and reconstructed. Some thus consider reconstruction error as a kind of novelty measure. Here we suggest that reconstruction-based approaches fail to capture particular anomalies that lie far from known inlier samples in latent space but near the latent dimension manifold defined by the parameters of the model. We propose incorporating the Mahalanobis distance in latent space to better capture these out-of-distribution samples and our results show that this method often improves performance over the baseline approach.

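Research motivation and goals

Establishes the need for reliable OOD detection in safety-critical systems such as autonomous vehicles.
Shows the limitations of using reconstruction error as the sole novelty measure for OOD detection.
Proposes a hybrid scoring method that combines the latent-space Mahalanobis distance with reconstruction error.
Evaluates whether the hybrid approach improves OOD detection performance across the MNIST inlier classes.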

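Proposed method

Trains reconstruction autoencoders on MNIST digits, using one inlier class per model.
Computes novelty as a weighted sum of the latent-space Mahalanobis distance and the reconstruction error.
Parameterizes the Mahalanobis distance with the mean and covariance of the latent encodings of the training data.
Tunes the mixing parameters alpha and beta on a validation set to balance the two components.
Compares the baseline reconstruction error against the hybrid score across several bottleneck sizes.
Uses standard OOD metrics (AUROC, AUPR, FPR at 95% TPR) to evaluate performance.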
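
The scoring steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, per-sample mean squared error stands in for the reconstruction loss (the review does not say which loss the paper uses), and the latent codes and reconstructions are assumed to come from an already trained autoencoder.

```python
import numpy as np


def fit_latent_gaussian(latents):
    """Estimate the mean and inverse covariance of training latent codes, shape (n, d)."""
    mean = latents.mean(axis=0)
    cov = np.atleast_2d(np.cov(latents, rowvar=False))
    # Assumption: a pseudo-inverse is used so a singular covariance does not fail.
    return mean, np.linalg.pinv(cov)


def mahalanobis(latents, mean, cov_inv):
    """Mahalanobis distance of each latent code to the training latent distribution."""
    diff = latents - mean
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))


def reconstruction_error(x, x_hat):
    """Per-sample mean squared error (an assumed choice of reconstruction loss)."""
    return ((x - x_hat) ** 2).reshape(len(x), -1).mean(axis=1)


def hybrid_novelty(latents, x, x_hat, mean, cov_inv, alpha, beta):
    """Weighted sum: alpha * Mahalanobis distance + beta * reconstruction error."""
    return (alpha * mahalanobis(latents, mean, cov_inv)
            + beta * reconstruction_error(x, x_hat))
```

In this sketch, `alpha` and `beta` are plain arguments; following the description above, they would be chosen on a validation set so that neither term dominates the score.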

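Experimental results

Research questions

RQ1: Can an autoencoder reconstruct OOD samples with low reconstruction error, thereby weakening OOD detection based on reconstruction error alone?
RQ2: Does introducing the latent-space Mahalanobis distance help detect such OOD samples?
RQ3: How does bottleneck size affect the effectiveness of the hybrid OOD score?
RQ4: Does the hybrid approach consistently improve common OOD metrics (AUROC, AUPR, FPR at 95% TPR) over reconstruction error alone?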

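Key findings

Combining the latent-space Mahalanobis distance with reconstruction error often improves OOD detection over reconstruction error alone.
The best bottleneck size for the hybrid approach falls roughly between 8 and 64 across digits.
For most inlier classes, the hybrid score achieves a lower FPR at 95% TPR and higher AUROC/AUPR than the baseline.
The normalization strategy for alpha and beta keeps either feature from dominating the novelty score.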
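
For readers unfamiliar with the metrics named in these findings, the sketch below shows one common way to compute AUROC and FPR at 95% TPR from novelty scores. It rests on assumptions the review does not confirm: OOD samples are treated as the positive class, a higher score means more novel, and the function names are hypothetical. AUPR is left out to keep the example short.

```python
import numpy as np


def auroc(inlier_scores, ood_scores):
    """Probability that a random OOD sample outscores a random inlier (ties count half)."""
    inl = np.asarray(inlier_scores, dtype=float)[None, :]
    ood = np.asarray(ood_scores, dtype=float)[:, None]
    return float((ood > inl).mean() + 0.5 * (ood == inl).mean())


def fpr_at_tpr(inlier_scores, ood_scores, tpr=0.95):
    """Fraction of inliers flagged when at least `tpr` of OOD samples are flagged."""
    ood = np.sort(np.asarray(ood_scores, dtype=float))
    # Number of OOD samples that must score at or above the threshold.
    needed = int(np.ceil(tpr * len(ood)))
    threshold = ood[len(ood) - needed]
    inl = np.asarray(inlier_scores, dtype=float)
    return float((inl >= threshold).mean())
```

A lower value from `fpr_at_tpr` and a higher value from `auroc` both indicate better separation between inlier and OOD scores, which is the direction of improvement the findings describe.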

This review was generated by AI and reviewed by a human editor.