QUICK REVIEW

[Paper Review] Scaling Out-of-Distribution Detection for Real-World Settings

Dan Hendrycks, Steven Basart | arXiv (Cornell University) | Nov 25, 2019

Anomaly Detection Techniques and Applications | References: 35 | Citations: 179

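One-Line Summary

This paper shows that a simple MaxLogit detector outperforms MSP (maximum softmax probability) on large-scale multiclass, multi-label, and anomaly segmentation OOD tasks, and it introduces new benchmarks (Species and CAOS) for realistic OOD evaluation.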

ABSTRACT

Detecting out-of-distribution examples is important for safety-critical machine learning applications such as detecting novel biological phenomena and self-driving cars. However, existing research mainly focuses on simple small-scale settings. To set the stage for more realistic out-of-distribution detection, we depart from small-scale settings and explore large-scale multiclass and multi-label settings with high-resolution images and thousands of classes. To make future work in real-world settings possible, we create new benchmarks for three large-scale settings. To test ImageNet multiclass anomaly detectors, we introduce the Species dataset containing over 700,000 images and over a thousand anomalous species. We leverage ImageNet-21K to evaluate PASCAL VOC and COCO multilabel anomaly detectors. Third, we introduce a new benchmark for anomaly segmentation by introducing a segmentation benchmark with road anomalies. We conduct extensive experiments in these more realistic settings for out-of-distribution detection and find that a surprisingly simple detector based on the maximum logit outperforms prior methods in all the large-scale multi-class, multi-label, and segmentation tasks, establishing a simple new baseline for future work.

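Motivation and Objectives

Motivate OOD detection in realistic, large-scale settings that go beyond small-scale benchmarks.
Create benchmarks for large-scale multiclass (ImageNet-21K), multi-label, and segmentation OOD scenarios.
Evaluate existing baselines and establish a simple, strong baseline for real-world OOD detection.
Examine whether Vision Transformers pre-trained on ImageNet-21K essentially solve OOD detection in large-scale settings.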

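Proposed Method

Propose MaxLogit: use the negative of the maximum unnormalized logit as the OOD score, which avoids the bias introduced by the number of classes.
Build the Species dataset: a large, disjoint OOD set for testing, with over 700,000 images and more than 1,000 anomalous species and no train/test overlap.
Develop and evaluate a multi-label OOD setup on PASCAL VOC and MS-COCO using 20 ImageNet-21K OOD classes, comparing MSP, LogitAvg, and MaxLogit.
Create the CAOS benchmark, which combines StreetHazards (simulation-based anomalies) and BDD-Anomaly (real-world anomalies) to evaluate anomaly segmentation.
Compare MaxLogit against baselines (MSP, Background, Dropout, Reconstruction AE) on both StreetHazards and BDD-Anomaly.
Use ImageNet-21K-P representations with ResNet-50, ViT, and Mixer backbones to evaluate OOD detection performance.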
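
The difference between the MSP and MaxLogit scores described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the example logits are hypothetical, and the sign convention (higher score means more anomalous) follows the "negative of the maximum logit" wording in this review. The example shows one way the number of classes can bias MSP: when evidence is split between two near-identical classes, the softmax probability is roughly halved even though the logits are large.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """MSP anomaly score: negative maximum softmax probability."""
    return -max(softmax(logits))


def maxlogit_score(logits):
    """MaxLogit anomaly score: negative maximum unnormalized logit."""
    return -max(logits)


def pixelwise_maxlogit(logit_map):
    """Apply MaxLogit to every pixel of an H x W grid of per-class logit lists."""
    return [[maxlogit_score(pixel) for pixel in row] for row in logit_map]


# Hypothetical logits (assumed for illustration only).
# In-distribution input: strong evidence shared by two similar classes.
in_dist = [9.0, 9.0, 0.0, 0.0]
# Anomalous input: weak evidence for every class.
anomaly = [2.0, 0.0, 0.0, 0.0]

print(msp_score(in_dist), msp_score(anomaly))            # about -0.50 and -0.71
print(maxlogit_score(in_dist), maxlogit_score(anomaly))  # -9.0 and -2.0
```

With these assumed logits, MSP ranks the in-distribution input as the more anomalous of the two, while MaxLogit ranks the anomalous input higher. The `pixelwise_maxlogit` helper shows how the same score could be applied per pixel for anomaly segmentation.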

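Experimental Results

Research Questions

RQ1: Does MSP fail to scale to large-scale OOD detection with thousands of classes?
RQ2: Is MaxLogit a stronger and more general baseline for large-scale multiclass and multi-label OOD detection?
RQ3: Do Vision Transformers pre-trained on ImageNet-21K essentially solve OOD detection in large-scale settings?
RQ4: Can realistic benchmarks (Species, CAOS) be built to evaluate OOD detection under real-world conditions?
RQ5: How well do OOD detectors perform at anomaly segmentation in driving scenes?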

Key Findings

MaxLogit consistently outperforms MSP and the other baselines across large-scale multiclass, multi-label, and anomaly segmentation tasks.
The Species dataset shows that Vision Transformers pre-trained on ImageNet-21K do not simply solve OOD detection when the evaluation is carefully designed to avoid data leakage.
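MaxLogit also generalizes well to the multi-label setting, outperforming MSP, LogitAvg, and earlier detectors.
On the CAOS benchmark, MaxLogit achieves the best pixel-wise anomaly segmentation performance, ahead of the MSP, Background, Dropout, and AE baselines.
On both StreetHazards and BDD-Anomaly, MaxLogit delivers robust, consistent improvements, making it a solid baseline for real-world OOD detection.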
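
Claims that one detector "outperforms" another require a threshold-free comparison of their anomaly scores. This review does not state which metrics were used, so the sketch below assumes AUROC, a common choice for OOD detection. The function name and the example scores are hypothetical and do not reproduce any result from the paper.

```python
def auroc(anomaly_scores, in_dist_scores):
    """Probability that a random anomaly outscores a random
    in-distribution example, with ties counted as one half."""
    wins = 0.0
    for a in anomaly_scores:
        for b in in_dist_scores:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(in_dist_scores))


# Hypothetical anomaly scores from two detectors on the same examples
# (higher = more anomalous). These numbers are made up for illustration.
detector_a = auroc(anomaly_scores=[3.0, 2.0], in_dist_scores=[1.0, 0.0])
detector_b = auroc(anomaly_scores=[0.0, 2.0], in_dist_scores=[1.0, 3.0])

print(detector_a)  # 1.0: every anomaly outscores every in-distribution example
print(detector_b)  # 0.25: only one of the four pairs is ranked correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration. For datasets at the scale described here, a rank-based implementation would be the practical choice.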


This review was generated by AI and checked by a human editor.