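[Paper Walkthrough] Open-Set Recognition: a Good Closed-Set Classifier is All You Need?
This paper shows a strong empirical correlation between closed-set accuracy and open-set recognition (OSR) performance, and demonstrates that raising closed-set accuracy with standard image classification techniques yields state-of-the-art OSR results, including on large-scale ImageNet splits. It also introduces the Semantic Shift Benchmark (SSB) to better evaluate semantic novelty in OSR.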
The ability to identify whether or not a test sample belongs to one of the semantic classes in a classifier's training set is critical to practical deployment of the model. This task is termed open-set recognition (OSR) and has received significant attention in recent years. In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. We find that this relationship holds across loss objectives and architectures, and further demonstrate the trend both on the standard OSR benchmarks as well as on a large-scale ImageNet evaluation. Second, we use this correlation to boost the performance of a maximum logit score OSR 'baseline' by improving its closed-set accuracy, and with this strong baseline achieve state-of-the-art on a number of OSR benchmarks. Similarly, we boost the performance of the existing state-of-the-art method by improving its closed-set accuracy, but the resulting discrepancy with the strong baseline is marginal. Our third contribution is to present the 'Semantic Shift Benchmark' (SSB), which better respects the task of detecting semantic novelty, in contrast to other forms of distribution shift also considered in related sub-fields, such as out-of-distribution detection. On this new evaluation, we again demonstrate that there is negligible difference between the strong baseline and the existing state-of-the-art. Project Page: https://www.robots.ox.ac.uk/~vgg/research/osr/
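Motivation and goals
- Show that closed-set performance and open-set detection performance are strongly correlated across datasets and architectures.
- Show that improving the closed-set accuracy of the MSP baseline is enough to match or exceed state-of-the-art OSR results.
- Propose a unified, semantically aware OSR evaluation framework (the Semantic Shift Benchmark) that goes beyond the traditional openness measure.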
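Proposed approach
- Compare the MSP baseline, ARPL, and ARPL+CS on the standard OSR benchmark datasets.
- Quantify the correlation between closed-set accuracy and open-set AUROC across datasets and architectures.
- Raise the closed-set accuracy of the MSP baseline through longer training, data augmentation, and label smoothing (MSP+).
- Propose the maximum logit score (MLS), rather than the softmax probability, as the open-set score.
- Evaluate MLS and the improved baselines on large-scale ImageNet splits, covering both easy and hard semantic open sets.
- Introduce and evaluate the Semantic Shift Benchmark (SSB), which comprises an ImageNet-scale evaluation and fine-grained (FGVC) datasets, to assess semantic novelty.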
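The difference between the MSP and MLS scoring rules listed above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names (`msp_score`, `mls_score`, `auroc`) and the logit values are hypothetical, and AUROC is computed here by brute-force pairwise comparison rather than with a library routine. The point it tries to make is that softmax is invariant to adding a constant to every logit, so MSP discards part of the logit magnitude that MLS keeps.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def msp_score(logits):
    """Maximum softmax probability: higher means 'more likely a known class'."""
    return softmax(logits).max(axis=1)

def mls_score(logits):
    """Maximum logit score: the largest raw logit, with no normalization."""
    return logits.max(axis=1)

def auroc(known_scores, unknown_scores):
    """Probability that a known sample outscores an unknown one (ties count 0.5)."""
    known = np.asarray(known_scores, dtype=float)[:, None]
    unknown = np.asarray(unknown_scores, dtype=float)[None, :]
    wins = (known > unknown).sum() + 0.5 * (known == unknown).sum()
    return float(wins / (known.size * unknown.size))

# Hypothetical logits from a 3-class closed-set classifier (illustrative only).
known_logits = np.array([[9.0, 1.0, 0.0], [8.0, 2.0, 1.0], [1.0, 7.5, 0.5]])
unknown_logits = np.array([[2.0, 0.5, 0.0], [1.0, 1.2, 0.8], [8.5, 1.0, 0.5]])

print("MSP AUROC:", auroc(msp_score(known_logits), msp_score(unknown_logits)))
print("MLS AUROC:", auroc(mls_score(known_logits), mls_score(unknown_logits)))
```

In both cases a test sample would be flagged as "unknown" when its score falls below a threshold; AUROC summarizes performance over all thresholds, which is why it is the metric reported for the benchmarks.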
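Experimental results
Research questions
- RQ1: Does closed-set accuracy correlate with open-set detection performance across datasets and model families?
- RQ2: Does raising the closed-set accuracy of a baseline OSR method yield OSR performance that is competitive with, or better than, state-of-the-art methods?
- RQ3: How does the maximum-logit open-set scoring rule (MLS) compare with the maximum softmax probability (MSP) in OSR performance?
- RQ4: Do semantically aware open-set splits give a more informative OSR evaluation than the openness measure alone?
- RQ5: Does the proposed Semantic Shift Benchmark provide a meaningful, semantics-centered OSR evaluation framework at large scale?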
Key findings
| Method (AUROC %; change vs. original method in parentheses) | MNIST | SVHN | CIFAR10 | CIFAR+10 | CIFAR+50 | TinyImageNet |
|---|---|---|---|---|---|---|
| Baseline (MSP) | 97.8 | 88.6 | 67.7 | 81.6 | 80.5 | 57.7 |
| OSRCI | 98.8 | 91.0 | 69.9 | 83.8 | 82.7 | 58.6 |
| OpenHybrid | 99.5 | 94.7 | 95.0 | 96.2 | 95.5 | 79.3 |
| ARPL + CS | 99.7 | 96.7 | 91.0 | 97.1 | 95.1 | 78.2 |
| OSRCI+ | 98.5 (-0.3) | 89.9 (-1.1) | 87.2 (+17.3) | 91.1 (+7.3) | 90.3 (+7.6) | 62.6 (+4.0) |
| (ARPL + CS)+ | 99.2 (-0.5) | 96.8 (+0.1) | 93.9 (+2.9) | 98.1 (+1.0) | 96.7 (+1.6) | 82.5 (+4.3) |
| Baseline (MSP+) | 98.6 (+0.8) | 96.0 (+7.4) | 90.1 (+22.4) | 95.6 (+14.0) | 94.0 (+13.5) | 82.7 (+25.0) |
| Baseline (MLS) | 99.3 (+1.5) | 97.1 (+8.5) | 93.6 (+25.9) | 97.9 (+16.3) | 96.5 (+16.0) | 83.0 (+25.3) |
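- Closed-set accuracy and open-set AUROC are strongly positively correlated across benchmarks (Pearson correlation of about 0.95 on the standard benchmarks, and about 0.88/0.63 on the ImageNet Easy/Hard splits).
- Strengthening the MSP baseline with standard image classification improvements gives state-of-the-art or competitive OSR results on most benchmarks (for example, MSP+ and MLS outperform several earlier methods).
- Using the maximum logit score (MLS) as the open-set score gives a clear improvement over the MSP baseline, with MLS reaching a higher average AUROC across datasets.
- On the Semantic Shift Benchmark, MLS and ARPL+ perform comparably, which underlines the importance of semantically aware splits for OSR evaluation.
- The Semantic Shift Benchmark shows that semantically harder splits degrade OSR performance more than the openness measure alone would predict.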
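The correlation finding above rests on a Pearson coefficient computed over (closed-set accuracy, open-set AUROC) pairs, one pair per trained model. A minimal sketch of that calculation follows; the helper name `pearson_r` and the five data points are hypothetical and chosen only for illustration, they are not results from the paper.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Hypothetical per-model measurements (NOT values from the paper).
closed_set_accuracy = [0.70, 0.78, 0.85, 0.91, 0.95]
open_set_auroc = [0.62, 0.70, 0.74, 0.83, 0.88]

r = pearson_r(closed_set_accuracy, open_set_auroc)
print(f"Pearson r = {r:.3f}")
```

A value near 1 would indicate that models with higher closed-set accuracy also tend to separate known from unknown classes better, which is the pattern the summary reports.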
This walkthrough was generated by AI and reviewed by a human editor.