QUICK REVIEW

[Paper Review] Leveraging Uncertainty Estimates for Predicting Segmentation Quality

Terrance DeVries, Graham W. Taylor|ArXiv.org|Jul 2, 2018

Explainable Artificial Intelligence (XAI)|References: 26|Citations: 58

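One-Line Summary

The authors propose a two-stage framework that uses pixel-level uncertainty maps to predict image-level segmentation quality, and they compare uncertainty estimation methods for skin lesion segmentation on the ISIC 2017 dataset.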

ABSTRACT

The use of deep learning for medical imaging has seen tremendous growth in the research community. One reason for the slow uptake of these systems in the clinical setting is that they are complex, opaque and tend to fail silently. Outside of the medical imaging domain, the machine learning community has recently proposed several techniques for quantifying model uncertainty (i.e., a model knowing when it has failed). This is important in practical settings, as we can refer such cases to manual inspection or correction by humans. In this paper, we aim to bring these recent results on estimating uncertainty to bear on two important outputs in deep learning-based segmentation. The first is producing spatial uncertainty maps, from which a clinician can observe where and why a system thinks it is failing. The second is quantifying an image-level prediction of failure, which is useful for isolating specific cases and removing them from automated pipelines. We also show that reasoning about spatial uncertainty, the first output, is a useful intermediate representation for generating segmentation quality predictions, the second output. We propose a two-stage architecture for producing these measures of uncertainty, which can accommodate any deep learning-based medical segmentation pipeline.

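Motivation and Objectives

Motivate the use of uncertainty estimation to support human-in-the-loop decision-making in medical image segmentation.
Develop a modular two-stage architecture that jointly produces spatial uncertainty maps and image-level segmentation quality predictions.
Evaluate several uncertainty estimation methods to determine how useful each is for segmentation quality prediction.
Show that explicit spatial uncertainty improves segmentation quality prediction over the baselines.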

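Proposed Method

Train a semantic segmentation model f that outputs per-pixel logits and an uncertainty map z.
Compute the per-pixel predicted segmentation yhat from f.
Train a second network g to predict a segmentation quality measure v (e.g., the Jaccard index) from (x, yhat, z), where x is the input image.
Extract the uncertainty map with one of four methods: maximum softmax probability, MC-dropout, heteroscedastic classification neural networks (HCNN), or learned confidence estimates (LCE).
Evaluate the uncertainty-based quality predictions against baselines using RMSE, detection error, AUROC, and AUPR.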
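
The first stage can be illustrated with a minimal sketch in plain Python. It covers only the maximum softmax probability variant, and it assumes that uncertainty is taken as 1 minus the maximum softmax probability; that convention, the function names, and the toy logits are illustrative assumptions, not details confirmed by the paper. The sketch also computes the Jaccard index that would serve as the regression target v for g.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def segment_with_uncertainty(logit_map):
    """Return (yhat, z) for an H x W grid of per-pixel class logits.

    yhat: predicted class index per pixel (argmax over classes).
    z:    per-pixel uncertainty, assumed here to be 1 - max softmax probability.
    """
    yhat, z = [], []
    for row in logit_map:
        probs_row = [softmax(px) for px in row]
        yhat.append([p.index(max(p)) for p in probs_row])
        z.append([1.0 - max(p) for p in probs_row])
    return yhat, z

def jaccard(pred, target):
    """Jaccard index (intersection over union) of the foreground label 1."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    inter = sum(p == 1 and t == 1 for p, t in pairs)
    union = sum(p == 1 or t == 1 for p, t in pairs)
    return 1.0 if union == 0 else inter / union

# Toy 2x2 "image" with two classes (0 = background, 1 = lesion); logits are made up.
logit_map = [
    [[2.0, -2.0], [0.1, 0.0]],
    [[-3.0, 3.0], [0.0, 0.2]],
]
ground_truth = [[0, 1], [1, 1]]

yhat, z = segment_with_uncertainty(logit_map)
v = jaccard(yhat, ground_truth)  # target that the second-stage network g would regress
```

In this toy case the only misclassified pixel (top right) is also one of the most uncertain, which is the kind of signal g is meant to pick up when it receives (x, yhat, z) as input.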

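Experimental Results

Research Questions

RQ1: Do pixel-level uncertainty maps improve the accuracy of image-level segmentation quality prediction?
RQ2: Which uncertainty estimation technique yields the best segmentation quality predictions in a medical imaging setting?
RQ3: How does incorporating z (uncertainty) into g change segmentation quality prediction compared with using (x, yhat) alone?
RQ4: Is the two-stage approach robust to the choice of uncertainty estimation method?
RQ5: How does the proposed method perform relative to RCA and QualityNet on ISIC 2017 skin lesion segmentation?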

Key Findings

Method	RMSE	Detection Error (%)	AUROC (%)	AUPR-Pass (%)	AUPR-Fail (%)
RCA	0.438 ± 0.007	43.8 ± 1.0	53.7 ± 1.4	74.4 ± 1.4	30.7 ± 0.9
QualityNet	0.213 ± 0.009	25.7 ± 2.7	80.9 ± 3.1	89.0 ± 2.3	69.1 ± 4.7
No Uncertainty	0.198 ± 0.011	27.3 ± 3.3	79.8 ± 3.8	88.5 ± 1.9	66.4 ± 7.9
Max Probability	0.168 ± 0.014	18.4 ± 3.0	88.4 ± 2.2	93.2 ± 1.6	80.5 ± 3.2
MC-dropout	0.163 ± 0.010	18.8 ± 1.4	88.1 ± 0.8	93.5 ± 1.3	78.1 ± 3.0
HCNN	0.196 ± 0.023	21.3 ± 1.8	85.5 ± 1.5	91.6 ± 1.4	76.2 ± 4.5
LCE	0.167 ± 0.019	19.3 ± 1.1	88.3 ± 1.4	93.6 ± 1.5	79.1 ± 3.9

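Using explicit uncertainty information improves segmentation quality estimation over the No Uncertainty baseline (RMSE 0.198, AUROC 79.8).
Maximum softmax probability, MC-dropout, and learned confidence estimates perform comparably to one another (RMSE 0.163 to 0.168, AUROC 88.1 to 88.4), and each improves on the baseline in this setting.
HCNN shows the smallest improvement among the uncertainty methods (RMSE 0.196, AUROC 85.5).
RCA performs poorly on ISIC 2017 data because the appearance of skin lesions varies widely.
QualityNet performs roughly on par with the No Uncertainty baseline in this study.
Relative to the No Uncertainty baseline, the two-stage framework with uncertainty consistently reduces RMSE and detection error.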
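
To make two of the table's metrics concrete, here is a small sketch of how RMSE and a detection error could be computed from predicted and actual quality scores. The detection error follows one common definition (the minimum, over decision thresholds, of the class-balanced misclassification rate); whether the paper uses exactly this definition is an assumption, and the `fail_below` cut-off and the example scores are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and true quality scores."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def detection_error(predicted, actual, fail_below=0.5):
    """Minimum balanced error when thresholding predicted quality to flag failures.

    A case counts as a true failure when its actual score is below `fail_below`
    (a hypothetical cut-off). The result is a fraction in [0, 0.5].
    """
    passes = [p for p, a in zip(predicted, actual) if a >= fail_below]
    fails = [p for p, a in zip(predicted, actual) if a < fail_below]
    if not passes or not fails:
        raise ValueError("need at least one passing and one failing case")
    best = 1.0
    # Candidate thresholds: every predicted score, plus one above the maximum.
    for t in sorted(set(predicted)) + [max(predicted) + 1.0]:
        missed_pass = sum(p < t for p in passes) / len(passes)  # good cases flagged as failures
        missed_fail = sum(p >= t for p in fails) / len(fails)   # failures let through
        best = min(best, 0.5 * missed_pass + 0.5 * missed_fail)
    return best

# Made-up scores for four images: predicted quality from g vs. actual Jaccard index.
predicted = [0.9, 0.4, 0.3, 0.6]
actual = [0.85, 0.75, 0.2, 0.4]

quality_rmse = rmse(predicted, actual)
quality_det_err = detection_error(predicted, actual)
```

The function returns a fraction, so multiplying by 100 would put it on the same scale as the detection error column in the table.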

This review was generated by AI and checked by a human editor.