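[Paper Digest] Reducing Network Agnostophobia
This paper proposes the Entropic Open-Set and Objectosphere losses, which improve open-set recognition by producing high softmax entropy for unknown samples and by separating known from unknown samples through deep-feature magnitude. The methods are evaluated with OSCR curves, and the code is publicly available.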
Agnostophobia, the fear of the unknown, can be experienced by deep learning engineers while applying their networks to real-world applications. Unfortunately, network behavior is not well defined for inputs far from a network's training set. In an uncontrolled environment, networks face many instances that are not of interest to them and have to be rejected in order to avoid a false positive. This problem has previously been tackled by researchers by either a) thresholding softmax, which by construction cannot return "none of the known classes", or b) using an additional background or garbage class. In this paper, we show that both of these approaches help, but are generally insufficient when previously unseen classes are encountered. We also introduce a new evaluation metric that focuses on comparing the performance of multiple approaches in scenarios where such unseen classes or unknowns are encountered. Our major contributions are simple yet effective Entropic Open-Set and Objectosphere losses that train networks using negative samples from some classes. These novel losses are designed to maximize entropy for unknown inputs while increasing separation in deep feature space by modifying magnitudes of known and unknown samples. Experiments on networks trained to classify classes from MNIST and CIFAR-10 show that our novel loss functions are significantly better at dealing with unknown inputs from datasets such as Devanagari, NotMNIST, CIFAR-100, and SVHN.
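Motivation and Objectives
- Motivate the problem of unknown inputs (agnostophobia) in real-world deployments of deep networks.
- Develop loss functions that improve robustness to unseen classes without relying solely on a background class.
- Propose an evaluation metric designed specifically for open-set scenarios (OSCR).
- Demonstrate improvements over softmax-thresholding and background-class baselines on MNIST and CIFAR datasets.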
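Proposed Method
- Define the Entropic Open-Set loss, which maximizes the softmax entropy of background samples.
- Extend it to the Objectosphere loss, which additionally minimizes the deep-feature magnitude of unknown samples while keeping the magnitude of known samples large.
- Prove theoretically that, for unknown samples, the Entropic Open-Set loss reaches its minimum when their representations collapse to zero magnitude and the logits of all classes are equal.
- Show that Objectosphere enforces a margin in feature magnitude between known and unknown samples.
- Evaluate with Open-Set Classification Rate (OSCR) curves, which compare rejection and recognition performance.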
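The two losses described above can be sketched per sample in plain Python. This is a minimal illustration, not the authors' released implementation: it assumes the commonly cited formulation in which a background sample is penalized by the average negative log-probability over all classes, and in which the Objectosphere term adds a squared penalty on the feature norm. The function names, the margin `xi`, the weight `lam`, and their default values are assumptions chosen for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]


def entropic_open_set_loss(logits, label):
    """Per-sample loss. `label` is a class index, or None for a background sample."""
    log_p = log_softmax(logits)
    if label is None:
        # Background sample: mean negative log-probability over all classes.
        # This is smallest when the softmax output is uniform (maximum entropy).
        return -sum(log_p) / len(log_p)
    # Known sample: ordinary cross-entropy on the true class.
    return -log_p[label]


def objectosphere_loss(logits, features, label, xi=10.0, lam=0.01):
    """Entropic Open-Set loss plus a penalty on the deep-feature magnitude.

    xi (minimum norm for known samples) and lam (penalty weight) are
    hypothetical defaults, not values taken from the paper.
    """
    norm = math.sqrt(sum(f * f for f in features))
    if label is None:
        penalty = norm ** 2                 # push unknown features toward zero
    else:
        penalty = max(xi - norm, 0.0) ** 2  # push known features to norm >= xi
    return entropic_open_set_loss(logits, label) + lam * penalty
```

With equal logits and a zero feature vector, a background sample incurs a loss of log(C) for C classes, which is the minimum the third bullet above refers to. In a real training loop these functions would be written with a tensor library so that gradients flow through `logits` and `features`.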
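Experimental Results
Research Questions
- RQ1: Can losses that shape the feature representation outperform softmax thresholding and background-class approaches?
- RQ2: Do the Entropic Open-Set and Objectosphere losses yield higher entropy for unknown samples and a larger magnitude separation between known and unknown samples?
- RQ3: Is there an unbiased evaluation metric (OSCR) that allows open-set methods to be compared fairly across datasets?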
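The two quantities that RQ2 asks about, softmax entropy and deep-feature magnitude, are straightforward to measure for any sample. The sketch below is one way to compute them; the helper names are hypothetical and not taken from the paper's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_entropy(logits):
    """Shannon entropy (in nats) of the softmax output; at most log(C) for C classes."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)


def feature_magnitude(features):
    """Euclidean norm of a deep-feature vector."""
    return math.sqrt(sum(f * f for f in features))
```

If the losses behave as intended, unknown samples should show entropy close to log(C) and small feature magnitude, while known samples should show low entropy and large magnitude.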
Key Findings
| Experiment / Architecture | Unknowns dataset D_a | Algorithm | CCR at FPR 1e-4 | CCR at FPR 1e-3 | CCR at FPR 1e-2 | CCR at FPR 1e-1 |
|---|---|---|---|---|---|---|
| LeNet++ with MNIST-CIFAR setup | Devanagari (MNIST c. background) | Softmax | 0.0 | 0.0 | 0.0777 | 0.9007 |
| LeNet++ with MNIST-CIFAR setup | Devanagari | Background | 0.0 | 0.4402 | 0.7527 | 0.9313 |
| LeNet++ with MNIST-CIFAR setup | NotMNIST | Entropic Open-Set | 0.7142 | 0.8746 | 0.9580 | 0.9788 |
| LeNet++ with MNIST-CIFAR setup | NotMNIST | Objectosphere | 0.7350 | 0.9108 | 0.9658 | 0.9791 |
| NotMNIST experiment (MNIST baseline) | NotMNIST | Softmax | 0.0 | 0.3397 | 0.4954 | 0.8288 |
| NotMNIST experiment (MNIST baseline) | NotMNIST | Background | 0.3806 | 0.7179 | 0.9068 | 0.9624 |
| NotMNIST experiment (MNIST baseline) | NotMNIST | Entropic Open-Set | 0.4201 | 0.8578 | 0.9515 | 0.9780 |
| NotMNIST experiment (MNIST baseline) | NotMNIST | Objectosphere | 0.5120 | 0.8965 | 0.9563 | 0.9773 |
| ResNet-18 CIFAR-10 setup | SVHN (Unknowns) | Softmax | 0.1924 | 0.2949 | 0.4599 | 0.6473 |
| ResNet-18 CIFAR-10 setup | SVHN (Unknowns) | Background | 0.2012 | 0.3022 | 0.4803 | 0.6981 |
| ResNet-18 CIFAR-10 setup | SVHN (Unknowns) | Entropic Open-Set | 0.1071 | 0.2338 | 0.4277 | 0.6214 |
| ResNet-18 CIFAR-10 setup | SVHN (Unknowns) | Objectosphere | 0.1862 | 0.3387 | 0.5074 | 0.6886 |
| CIFAR-100 subset experiment | CIFAR-100 Subset 4500 | Scaled Objectosphere | N/A | N/A | N/A | N/A |
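- The Entropic Open-Set loss raises the entropy of unknown/background samples and improves the handling of unknown inputs.
- The Objectosphere loss further increases softmax entropy and strengthens the magnitude separation between known and unknown samples.
- Both losses largely outperform the softmax-thresholding and background-class baselines on tasks derived from MNIST and CIFAR-10, with unknown samples drawn from Devanagari, NotMNIST, CIFAR-100, and SVHN. The ResNet-18 rows of the table above are the exception: the margins there are small, and Entropic Open-Set scores below both baselines at every listed FPR.
- OSCR curves provide a threshold-based evaluation that focuses on rejecting unknown samples while correctly classifying known ones.
- Experiments show that Entropic Open-Set and Objectosphere achieve higher CCR at fixed FPR across multiple datasets.
- Code implementing the method is publicly available.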
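The "CCR at FPR" columns in the table come from sweeping a confidence threshold. The sketch below shows one way such values could be computed, assuming the usual reading of OSCR: FPR is the fraction of unknown samples whose top confidence reaches the threshold, and CCR is the fraction of known samples that are both correctly classified and reach the threshold. The function names and the exact tie handling (`>=`) are assumptions, not the authors' evaluation script.

```python
def oscr_point(known_scores, known_correct, unknown_scores, threshold):
    """Return (FPR, CCR) at a single confidence threshold.

    known_scores:   top confidence for each known test sample
    known_correct:  whether the predicted class of each known sample is right
    unknown_scores: top confidence for each unknown test sample
    """
    false_pos = sum(s >= threshold for s in unknown_scores)
    correct = sum(ok and s >= threshold
                  for s, ok in zip(known_scores, known_correct))
    return false_pos / len(unknown_scores), correct / len(known_scores)


def ccr_at_fpr(known_scores, known_correct, unknown_scores, max_fpr):
    """Highest CCR reachable while keeping FPR <= max_fpr."""
    best = 0.0
    # CCR only changes at known-sample scores, so those thresholds suffice.
    for t in sorted(set(known_scores)):
        fpr, ccr = oscr_point(known_scores, known_correct, unknown_scores, t)
        if fpr <= max_fpr:
            best = max(best, ccr)
    return best
```

Plotting the `(FPR, CCR)` pairs over all thresholds gives the OSCR curve. Because the metric depends only on confidences and correctness, it can compare a plain softmax network, a background-class network, and the proposed losses on the same footing.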
This digest was generated by AI and reviewed by human editors.