QUICK REVIEW

[Paper Review] Open-Set Recognition: a Good Closed-Set Classifier is All You Need?

Sagar Vaze, Kai Han | arXiv (Cornell University) | Oct 12, 2021

Domain Adaptation and Few-Shot Learning | References: 61 | Citations: 40

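One-Line Summary

This paper shows a strong empirical link between closed-set accuracy and open-set recognition (OSR) performance, and demonstrates that raising closed-set accuracy with standard image-classification techniques yields state-of-the-art OSR results, including on large-scale ImageNet splits. It also introduces the Semantic Shift Benchmark (SSB) to better evaluate semantic novelty in OSR.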

ABSTRACT

The ability to identify whether or not a test sample belongs to one of the semantic classes in a classifier's training set is critical to practical deployment of the model. This task is termed open-set recognition (OSR) and has received significant attention in recent years. In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. We find that this relationship holds across loss objectives and architectures, and further demonstrate the trend both on the standard OSR benchmarks as well as on a large-scale ImageNet evaluation. Second, we use this correlation to boost the performance of a maximum logit score OSR 'baseline' by improving its closed-set accuracy, and with this strong baseline achieve state-of-the-art on a number of OSR benchmarks. Similarly, we boost the performance of the existing state-of-the-art method by improving its closed-set accuracy, but the resulting discrepancy with the strong baseline is marginal. Our third contribution is to present the 'Semantic Shift Benchmark' (SSB), which better respects the task of detecting semantic novelty, in contrast to other forms of distribution shift also considered in related sub-fields, such as out-of-distribution detection. On this new evaluation, we again demonstrate that there is negligible difference between the strong baseline and the existing state-of-the-art. Project Page: https://www.robots.ox.ac.uk/~vgg/research/osr/

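Motivation and Objectives

Show that closed-set performance correlates strongly with open-set detection across datasets and architectures.
Show that improving the closed-set accuracy of the MSP baseline yields state-of-the-art OSR results.
Propose a semantics-aware OSR evaluation suite (the Semantic Shift Benchmark) that goes beyond the conventional openness measure.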

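Proposed Method

Compare the MSP baseline, ARPL, and ARPL+CS on standard OSR benchmarks spanning multiple datasets.
Quantify the correlation between closed-set accuracy and open-set AUROC across datasets and architectures.
Improve the MSP baseline with longer training, data augmentation, and label smoothing to raise its closed-set accuracy (MSP+).
Propose the maximum logit score (MLS) as the open-set score in place of the maximum softmax probability.
Evaluate MLS and the strengthened baseline on large-scale ImageNet splits with easy and hard semantic open-set classes.
Introduce the Semantic Shift Benchmark (SSB), covering ImageNet-scale and fine-grained visual categorization (FGVC) datasets, to evaluate semantic novelty.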
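
The difference between the MSP and MLS scoring rules listed above can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names and the logit values are hypothetical, chosen so that two samples share the same softmax output while differing in logit magnitude.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def msp_score(logits):
    """Maximum softmax probability: higher means 'more likely a known class'."""
    return softmax(logits).max(axis=1)


def mls_score(logits):
    """Maximum logit score: the largest raw logit, with no normalization."""
    return logits.max(axis=1)


# Hypothetical logits for two test images over three known classes.
# The gaps between logits are identical; only the overall magnitude differs.
logits = np.array([
    [10.0, 0.0, 0.0],
    [3.0, -7.0, -7.0],
])

print("MSP:", msp_score(logits))
print("MLS:", mls_score(logits))
```

Both samples receive the same MSP because softmax is unchanged when a constant is added to every logit, whereas MLS preserves the magnitude difference, so the second sample looks more likely to be unknown under MLS.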

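Experimental Results

Research Questions

RQ1: Does closed-set accuracy correlate with open-set detection performance across datasets and model families?
RQ2: Can improving the closed-set accuracy of a baseline OSR method yield OSR performance that matches or exceeds state-of-the-art methods?
RQ3: How does an open-set scoring rule based on the maximum logit (MLS) compare with the maximum softmax probability (MSP) for OSR?
RQ4: How much do semantics-aware open-set splits affect OSR evaluation compared with openness alone?
RQ5: Is the proposed Semantic Shift Benchmark a meaningful large-scale, semantics-centered OSR evaluation framework?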

Key Findings

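Open-set AUROC (%) on the standard OSR benchmarks; values in parentheses are changes relative to the corresponding original method.
Method	MNIST	SVHN	CIFAR10	CIFAR+10	CIFAR+50	TinyImageNet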
Baseline (MSP)	97.8	88.6	67.7	81.6	80.5	57.7
OSRCI	98.8	91.0	69.9	83.8	82.7	58.6
OpenHybrid	99.5	94.7	95.0	96.2	95.5	79.3
ARPL + CS	99.7	96.7	91.0	97.1	95.1	78.2
OSRCI+	98.5 (-0.3)	89.9 (-1.1)	87.2 (+17.3)	91.1 (+7.3)	90.3 (+7.6)	62.6 (+4.0)
(ARPL + CS)+	99.2 (-0.5)	96.8 (+0.1)	93.9 (+2.9)	98.1 (+1.0)	96.7 (+1.6)	82.5 (+4.3)
Baseline (MSP+)	98.6 (+0.8)	96.0 (+7.4)	90.1 (+22.4)	95.6 (+14.0)	94.0 (+13.5)	82.7 (+25.0)
Baseline (MLS)	99.3 (+1.5)	97.1 (+8.5)	93.6 (+25.9)	97.9 (+16.3)	96.5 (+16.0)	83.0 (+25.3)
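
The table values are AUROC percentages: the probability that a randomly chosen known-class sample receives a higher open-set score than a randomly chosen unknown-class sample. A minimal pairwise sketch of that metric is given below; the function name and the score values are hypothetical and are not data from the paper.

```python
import numpy as np


def auroc(known_scores, unknown_scores):
    """Fraction of (known, unknown) pairs ranked correctly; ties count as half."""
    known = np.asarray(known_scores, dtype=float)[:, None]
    unknown = np.asarray(unknown_scores, dtype=float)[None, :]
    wins = (known > unknown).sum()
    ties = (known == unknown).sum()
    return (wins + 0.5 * ties) / (known.size * unknown.size)


# Hypothetical open-set scores (e.g. maximum logits) for illustration only.
known = [9.1, 7.4, 6.8, 5.0]
unknown = [6.9, 4.2, 3.3]

print(f"AUROC: {auroc(known, unknown):.3f}")
```

The pairwise form is quadratic in the number of samples, which is fine for a small illustration; a rank-based implementation would be the usual choice at benchmark scale.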

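Closed-set accuracy and open-set AUROC are strongly positively correlated on the standard benchmarks (Pearson ρ ≈ 0.95; ≈ 0.88/0.63 on the ImageNet Easy/Hard splits).
Strengthening the MSP baseline with standard image-classification improvements yields state-of-the-art or competitive OSR results on most benchmarks (e.g., Baseline (MLS) exceeds ARPL + CS on every benchmark in the table except MNIST).
The maximum logit score (MLS) is a substantially better open-set score than the MSP baseline, with a higher average AUROC across datasets.
On the Semantic Shift Benchmark, MLS and ARPL+ perform comparably, underscoring the importance of semantics-aware splits for OSR evaluation.
The Semantic Shift Benchmark shows that semantically hard splits degrade OSR performance more than a simple openness measure does.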
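
The correlation analysis behind the first finding can be sketched as a Pearson coefficient computed over (closed-set accuracy, open-set AUROC) pairs, one pair per trained model. The values below are HYPOTHETICAL placeholders for illustration, not results from the paper, and the function name is an assumption.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


# HYPOTHETICAL placeholder values: one entry per trained model.
closed_set_acc = np.array([0.62, 0.71, 0.78, 0.85, 0.91])
open_set_auroc = np.array([0.66, 0.70, 0.77, 0.80, 0.88])

rho = pearson(closed_set_acc, open_set_auroc)
print(f"Pearson correlation: {rho:.3f}")
```

A coefficient near 1 indicates that models with higher closed-set accuracy also tend to separate known from unknown classes better, which is the pattern the review describes.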

This review was generated by AI and checked by a human editor.