QUICK REVIEW

[Paper Review] Leveraging Uncertainty Estimates for Predicting Segmentation Quality

Terrance DeVries, Graham W. Taylor|ArXiv.org|Jul 2, 2018

Explainable Artificial Intelligence (XAI) | References: 26 | Cited by: 58

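One-Sentence Summary

The authors propose a two-stage framework that uses pixel-level uncertainty maps to predict image-level segmentation quality, and they compare several uncertainty estimation methods on skin lesion segmentation.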

ABSTRACT

The use of deep learning for medical imaging has seen tremendous growth in the research community. One reason for the slow uptake of these systems in the clinical setting is that they are complex, opaque and tend to fail silently. Outside of the medical imaging domain, the machine learning community has recently proposed several techniques for quantifying model uncertainty (i.e., a model knowing when it has failed). This is important in practical settings, as we can refer such cases to manual inspection or correction by humans. In this paper, we aim to bring these recent results on estimating uncertainty to bear on two important outputs in deep learning-based segmentation. The first is producing spatial uncertainty maps, from which a clinician can observe where and why a system thinks it is failing. The second is quantifying an image-level prediction of failure, which is useful for isolating specific cases and removing them from automated pipelines. We also show that reasoning about spatial uncertainty, the first output, is a useful intermediate representation for generating segmentation quality predictions, the second output. We propose a two-stage architecture for producing these measures of uncertainty, which can accommodate any deep learning-based medical segmentation pipeline.

Motivation and Objectives

Motivate the use of uncertainty estimates in medical image segmentation to support human-in-the-loop decision making.
Develop a modular two-stage architecture that jointly produces spatial uncertainty maps and an image-level segmentation quality prediction.
Evaluate multiple uncertainty estimation methods to determine their effectiveness for predicting segmentation quality.
Demonstrate that explicit spatial uncertainty improves segmentation quality prediction over baselines.

Proposed Method

Train a semantic segmentation model f to output per-pixel logits and an uncertainty map z.
Compute a per-pixel predicted segmentation yhat from f.
Train a second network g to predict a segmentation quality metric v (e.g., Jaccard index) from (x, yhat, z).
Extract uncertainty maps using one of four methods: Maximum Softmax Probability, MC-Dropout, Heteroscedastic Classifier Neural Network (HCNN), or Learned Confidence Estimates (LCE).
Evaluate uncertainty-derived quality predictions against baselines using RMSE, detection error, AUROC, and AUPR.
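
The steps above can be sketched in NumPy to show how the first stage's outputs feed the second. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, defining uncertainty as one minus the maximum softmax probability is an assumption, and using the predictive entropy of the averaged softmax for MC-Dropout is likewise an assumed choice (the paper may use a different statistic). The segmentation model f and the quality network g are not implemented here. Only the tensors passed between them are shown.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def max_prob_uncertainty(logits):
    """Return (yhat, z) from per-pixel logits of shape (C, H, W).

    Assumption: z = 1 - max softmax probability, so larger z means
    the model is less confident at that pixel.
    """
    probs = softmax(logits, axis=0)
    yhat = probs.argmax(axis=0)
    z = 1.0 - probs.max(axis=0)
    return yhat, z

def mc_dropout_uncertainty(logit_samples):
    """Return (yhat, z) from T stochastic passes of shape (T, C, H, W).

    Assumption: z is the predictive entropy of the mean softmax.
    """
    probs = softmax(logit_samples, axis=1).mean(axis=0)  # (C, H, W)
    yhat = probs.argmax(axis=0)
    z = -(probs * np.log(probs + 1e-12)).sum(axis=0)
    return yhat, z

def build_quality_input(x, yhat, z):
    """Stack image x (C_img, H, W), mask yhat, and map z as channels for g."""
    extra = np.stack([yhat, z]).astype(x.dtype)  # (2, H, W)
    return np.concatenate([x, extra], axis=0)

# Toy example: 2 classes on a 4x4 image; the top two rows are confident.
logits = np.zeros((2, 4, 4))
logits[1, :2] = 10.0
yhat, z = max_prob_uncertainty(logits)
g_input = build_quality_input(np.zeros((3, 4, 4)), yhat, z)
```

In this sketch the second network would receive a five-channel array (three image channels, the predicted mask, and the uncertainty map). Dropping the last channel corresponds to the no-uncertainty setting that uses only (x, yhat).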

Experimental Results

Research Questions

RQ1: Can pixel-level uncertainty maps improve the accuracy of image-level segmentation quality prediction?
RQ2: Which uncertainty estimation techniques yield the best segmentation quality predictions in a medical imaging setting?
RQ3: How does incorporating z (uncertainty) into g compare to using only (x, yhat) for predicting segmentation quality?
RQ4: Is the two-stage approach robust across different uncertainty estimation methods?
RQ5: What is the relative performance of the proposed method versus RCA and QualityNet for ISIC 2017 skin lesion segmentation?

Key Findings

Method	RMSE	Detection Error	AUROC	AUPR-Pass	AUPR-Fail
RCA	0.438 ± 0.007	43.8 ± 1.0	53.7 ± 1.4	74.4 ± 1.4	30.7 ± 0.9
QualityNet	0.213 ± 0.009	25.7 ± 2.7	80.9 ± 3.1	89.0 ± 2.3	69.1 ± 4.7
No Uncertainty	0.198 ± 0.011	27.3 ± 3.3	79.8 ± 3.8	88.5 ± 1.9	66.4 ± 7.9
Max Probability	0.168 ± 0.014	18.4 ± 3.0	88.4 ± 2.2	93.2 ± 1.6	80.5 ± 3.2
MC-dropout	0.163 ± 0.010	18.8 ± 1.4	88.1 ± 0.8	93.5 ± 1.3	78.1 ± 3.0
HCNN	0.196 ± 0.023	21.3 ± 1.8	85.5 ± 1.5	91.6 ± 1.4	76.2 ± 4.5
LCE	0.167 ± 0.019	19.3 ± 1.1	88.3 ± 1.4	93.6 ± 1.5	79.1 ± 3.9

Using explicit uncertainty information improves segmentation quality estimation over a no-uncertainty baseline.
Maximum softmax probability, MC-Dropout, and Learned Confidence Estimates perform similarly in this setting (RMSE 0.163 to 0.168, AUROC 88.1 to 88.4), and all three improve on the RCA, QualityNet, and No Uncertainty baselines.
HCNN shows the least improvement among the uncertainty methods for segmentation quality prediction.
RCA (Reverse Classification Accuracy) performs poorly on ISIC 2017 data, which is attributed to high variability in lesion appearance.
QualityNet performs roughly on par with the no-uncertainty baseline in this study.
All four uncertainty methods lower RMSE and detection error relative to the No Uncertainty baseline, although HCNN's RMSE gain (0.196 vs. 0.198) is within the reported spread.
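
To make the RMSE column concrete, the sketch below computes the Jaccard index of a predicted mask against ground truth, then the RMSE between predicted and true quality scores. It assumes, as the method summary suggests, that the quality target is the Jaccard index. The function names and the 0.65 pass/fail threshold are hypothetical and chosen only for illustration. The detection error, AUROC, and AUPR columns are not reproduced here.

```python
import numpy as np

def jaccard_index(pred_mask, true_mask):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(np.logical_and(pred, true).sum() / union)

def rmse(predicted_quality, true_quality):
    """Root mean squared error between predicted and true quality scores."""
    p = np.asarray(predicted_quality, dtype=float)
    t = np.asarray(true_quality, dtype=float)
    return float(np.sqrt(np.mean((p - t) ** 2)))

def flag_failures(predicted_quality, threshold=0.65):
    """Mark cases whose predicted quality falls below a hypothetical threshold."""
    return np.asarray(predicted_quality, dtype=float) < threshold

# Toy example: one predicted mask vs. ground truth, then two quality scores.
score = jaccard_index([[1, 1], [0, 0]], [[1, 0], [0, 0]])
error = rmse([0.5, 0.9], [0.5, 0.7])
flags = flag_failures([0.5, 0.9])
```

Cases flagged this way are the ones a pipeline would refer for manual inspection, which is the human-in-the-loop use the paper motivates.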


This review was generated by AI and reviewed by a human editor.