QUICK REVIEW

[Paper Review] Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder

Zhisheng Xiao, Qing Yan|arXiv (Cornell University)|Mar 6, 2020

Adversarial Robustness in Machine Learning|References: 52|Cited by: 64

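One-sentence summary

This paper proposes Likelihood Regret (LR), an OOD detection score for VAEs that compares the ELBO under the best posterior configuration for each individual sample with the ELBO under the encoder learned on the training set, and shows that LR outperforms likelihood-based methods at OOD detection on several image datasets.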

ABSTRACT

Deep probabilistic generative models enable modeling the likelihoods of very high dimensional data. An important application of generative modeling should be the ability to detect out-of-distribution (OOD) samples by setting a threshold on the likelihood. However, some recent studies show that probabilistic generative models can, in some cases, assign higher likelihoods on certain types of OOD samples, making the OOD detection rules based on likelihood threshold problematic. To address this issue, several OOD detection methods have been proposed for deep generative models. In this paper, we make the observation that many of these methods fail when applied to generative models based on Variational Auto-encoders (VAE). As an alternative, we propose Likelihood Regret, an efficient OOD score for VAEs. We benchmark our proposed method over existing approaches, and empirical results suggest that our method obtains the best overall OOD detection performances when applied to VAEs.

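Motivation and objectives

Provide reliable OOD detection for VAEs, where the likelihood alone can be misleading.
Propose a score based on per-sample optimization (Likelihood Regret) to mitigate the mismatch between likelihood and OOD status.
Evaluate LR against existing OOD scores on a variety of image datasets.
Analyze the robustness of LR across VAE capacities and β-VAE settings.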

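Proposed method

Define Likelihood Regret (LR) as LR(x)=L(x;θ*,τ̂(x))−L(x;θ*,φ*), where L is the ELBO-based lower bound on the log-likelihood, θ* and φ* are the trained decoder and encoder parameters, and τ̂(x) is the variational configuration optimized for the single input x.
Compute L by estimating the VAE's IWELBO (K samples), then optimize the variational parameters τ on a single input, with θ* held fixed, to maximize L.
Obtain τ̂(x) either by optimizing the encoder φ or by optimizing τ(x) directly.
Regularize the optimization through the VAE's bottleneck by limiting how far the latent posterior parameters can change.
Compare LR against baselines (Likelihood, IC, Likelihood Ratio, LMD) on multiple OOD tasks.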
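
The per-sample optimization step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a toy linear-Gaussian decoder whose ELBO has a closed form (so no IWELBO sampling is needed), a hypothetical and deliberately imperfect encoder `encode`, and plain gradient ascent on the latent statistics (mean and log-variance), which corresponds most closely to optimizing τ(x) directly. The matrix `W` and all function names are made up for illustration.

```python
import numpy as np

# Toy decoder p(x|z) = N(W z, I) with prior p(z) = N(0, I).
# W, encode() and every name below are illustrative assumptions.
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.5, -0.5]])
COL_SQ = (W ** 2).sum(axis=0)  # squared column norms of W


def elbo(x, mu, logvar):
    """Closed-form ELBO for q(z|x) = N(mu, diag(exp(logvar)))."""
    s = np.exp(logvar)
    resid = x - W @ mu
    # Expected log-likelihood under q, then the KL term to the prior.
    recon = -0.5 * x.size * np.log(2 * np.pi) - 0.5 * (resid @ resid + COL_SQ @ s)
    kl = 0.5 * np.sum(s + mu ** 2 - 1.0 - logvar)
    return recon - kl


def encode(x):
    """Hypothetical amortized encoder standing in for phi*; imperfect on purpose."""
    return 0.3 * (W.T @ x), np.zeros(W.shape[1])


def likelihood_regret(x, steps=300, lr=0.1):
    """ELBO after per-sample ascent on (mu, logvar) minus ELBO at the encoder output."""
    mu, logvar = encode(x)
    base = elbo(x, mu, logvar)
    for _ in range(steps):
        s = np.exp(logvar)
        grad_mu = W.T @ (x - W @ mu) - mu                  # d ELBO / d mu
        grad_logvar = -0.5 * COL_SQ * s - 0.5 * (s - 1.0)  # d ELBO / d logvar
        mu = mu + lr * grad_mu
        logvar = logvar + lr * grad_logvar
    return elbo(x, mu, logvar) - base, mu, logvar


regret, mu_hat, logvar_hat = likelihood_regret(np.array([1.0, -2.0, 0.5, 3.0]))
print(f"likelihood regret: {regret:.4f}")
```

The decoder stays fixed throughout, matching the role of θ* above; only the per-sample posterior parameters move. With a real VAE the closed-form `elbo` would be replaced by a sampled estimate and the manual gradients by automatic differentiation.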

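Experimental results

Research questions

RQ1: Can LR reliably separate in-distribution samples from OOD samples for VAEs where the standard likelihood fails?
RQ2: How does LR compare with existing OOD scores across different in-distribution/OOD dataset pairs?
RQ3: Is LR robust to different VAE capacities and β-VAE settings?
RQ4: What are the computational trade-offs of LR compared with other OOD methods?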

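Key findings

LR corrects the likelihood mismatch observed in VAEs and achieves high AUC-ROC on most OOD tasks.
On Fashion MNIST vs MNIST, LR raises AUC-ROC from 0.165 (likelihood) to 0.999.
On CIFAR-10 vs SVHN, LR raises AUC-ROC from 0.161 (likelihood) to 0.876.
Both LR variants, optimizing the encoder (LR_E) and optimizing the latent statistics (LR_Z), perform well, with LR_E usually the better of the two.
LR is robust across β-VAE settings and VAE capacities, although very large capacities may slightly reduce performance on some tasks.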
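
To make the AUC-ROC figures above easier to read, here is a small sketch of how the metric can be computed from raw scores. It assumes the convention that a higher score means "more likely OOD"; the function name `auroc` and the sample scores are hypothetical and are not taken from the paper.

```python
import numpy as np


def auroc(ood_scores, in_scores):
    """Probability that a random OOD sample scores higher than a random
    in-distribution sample, counting ties as half (pairwise rank form)."""
    ood = np.asarray(ood_scores, dtype=float)[:, None]
    ind = np.asarray(in_scores, dtype=float)[None, :]
    return float(np.mean((ood > ind) + 0.5 * (ood == ind)))


# Made-up scores: one OOD sample outranks one in-distribution sample.
print(auroc([1.0, 3.0], [2.0, 4.0]))  # 0.25
```

Under this convention a value near 1 means the score ranks OOD samples above in-distribution ones, 0.5 is chance level, and a value well below 0.5 means the score systematically ranks the two groups in the wrong order.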

This review was generated by AI and checked by a human editor.