QUICK REVIEW

[Paper Review] Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective

Jing Zhang, Tong Zhang|arXiv (Cornell University)|Mar 29, 2018

Visual Attention and Saliency Detection|References: 42|Cited by: 65

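One-sentence summary

The paper proposes an end-to-end deep saliency detector that trains without human annotation by learning from multiple noisy saliency maps produced by unsupervised methods, jointly performing latent saliency prediction and explicit noise modeling.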

ABSTRACT

The success of current deep saliency detection methods heavily depends on the availability of large-scale supervision in the form of per-pixel labeling. Such supervision, while labor-intensive and not always possible, tends to hinder the generalization ability of the learned models. By contrast, traditional handcrafted features based unsupervised saliency detection methods, even though have been surpassed by the deep supervised methods, are generally dataset-independent and could be applied in the wild. This raises a natural question that "Is it possible to learn saliency maps without using labeled data while improving the generalization ability?". To this end, we present a novel perspective to unsupervised saliency detection through learning from multiple noisy labeling generated by "weak" and "noisy" unsupervised handcrafted saliency methods. Our end-to-end deep learning framework for unsupervised saliency detection consists of a latent saliency prediction module and a noise modeling module that work collaboratively and are optimized jointly. Explicit noise modeling enables us to deal with noisy saliency maps in a probabilistic way. Extensive experimental results on various benchmarking datasets show that our model not only outperforms all the unsupervised saliency methods with a large margin but also achieves comparable performance with the recent state-of-the-art supervised deep saliency methods.

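Motivation and objectives

Improve generalization without per-pixel labels, which motivates learning saliency in an unsupervised way.
Use the saliency maps produced by multiple unsupervised methods as noisy labels to train a deep model.
Jointly optimize a latent saliency predictor and a noise model within an end-to-end framework.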

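Proposed method

Two-module architecture: a latent saliency prediction module (based on FCN/DeepLab) and a noise modeling module.
Each handcrafted unsupervised label is modeled as y_i^j = y_bar_i + n_i^j, where y_bar_i is the latent saliency map and n_i^j is drawn from a per-pixel zero-mean Gaussian distribution q_i(Σ).
The loss combines a saliency prediction loss (cross-entropy between the predicted labels and the noisy labels) with a noise loss (KL divergence between q_i and the empirical noise).
The noise variance is updated per image through a KL-based update, so the model improves progressively over multiple iterations.
Training uses DeepLab/ResNet-101 with end-to-end optimization; at test time only the latent saliency prediction is used, without the noise module.
Theoretical and practical design choices include truncating outputs to [0,1], updating the noise once per epoch, and SGD with momentum.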
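The interplay between the noisy-label model, the two loss terms, and the variance update can be sketched on a toy example. This is a minimal illustration in plain Python, not the paper's implementation: the function names, the `noise_weight`, `step`, and `floor` parameters, the direction of the KL term, and the interpolation-style variance update are all assumptions made for the sketch, and a flat list of pixel values stands in for a saliency map.

```python
import math
import random


def clip01(x):
    """Truncate a value to [0, 1], mirroring the output truncation in the summary."""
    return min(1.0, max(0.0, x))


def cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy; targets may be soft values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(1.0 - eps, max(eps, p))  # avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def kl_zero_mean_gaussians(var_p, var_q):
    """Closed-form KL(N(0, var_p) || N(0, var_q)) for scalar variances."""
    return 0.5 * (math.log(var_q / var_p) + var_p / var_q - 1.0)


def empirical_noise_variance(latent, noisy_maps, floor=1e-6):
    """Per-pixel mean squared residual n_i^j = y_i^j - y_bar_i (floored, assumption)."""
    m = len(noisy_maps)
    return [max(floor, sum((y[k] - latent[k]) ** 2 for y in noisy_maps) / m)
            for k in range(len(latent))]


def total_loss(latent, noisy_maps, noise_var, rng, noise_weight=0.01):
    """Prediction loss on noise-perturbed outputs plus a weighted KL noise loss."""
    pred_loss = 0.0
    for y in noisy_maps:
        # Perturb the latent map with noise sampled from the current noise model.
        noisy_pred = [clip01(p + rng.gauss(0.0, math.sqrt(v)))
                      for p, v in zip(latent, noise_var)]
        pred_loss += cross_entropy(noisy_pred, y)
    pred_loss /= len(noisy_maps)

    emp_var = empirical_noise_variance(latent, noisy_maps)
    # KL direction (model || empirical) is an assumption of this sketch.
    noise_loss = sum(kl_zero_mean_gaussians(q, e)
                     for q, e in zip(noise_var, emp_var)) / len(latent)
    return pred_loss + noise_weight * noise_loss


def update_noise_variance(noise_var, emp_var, step=0.5):
    """Move each variance toward its empirical estimate (illustrative rule only)."""
    return [v + step * (e - v) for v, e in zip(noise_var, emp_var)]


# Toy example: a 4-pixel "image" with three noisy labels from hypothetical methods.
rng = random.Random(0)
latent = [0.9, 0.8, 0.1, 0.2]
noisy_maps = [[1.0, 0.6, 0.0, 0.4], [0.7, 0.9, 0.3, 0.1], [0.8, 1.0, 0.2, 0.0]]
noise_var = [0.05] * 4

loss = total_loss(latent, noisy_maps, noise_var, rng)
emp_var = empirical_noise_variance(latent, noisy_maps)
noise_var = update_noise_variance(noise_var, emp_var)
```

In a real training loop the latent map would come from the network and the prediction loss would be backpropagated through it; here the latent values are fixed so that only the bookkeeping of the noise model is visible.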

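Experimental results

Research questions

RQ1: Can saliency maps be learned from multiple noisy unsupervised labels without any human annotation?
RQ2: Does explicit noise modeling improve the quality of unsupervised deep saliency detection compared with simple fusion or weak supervision?
RQ3: How many iterations between the latent saliency predictor and the noise model are needed for convergence?
RQ4: How does the proposed unsupervised method perform on benchmark datasets compared with supervised deep saliency methods and traditional unsupervised methods?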

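Key findings

The method outperforms existing unsupervised saliency methods by a large margin.
On benchmark datasets, its performance is competitive with state-of-the-art supervised saliency detectors.
Ablation studies show that alternating updates of the latent predictor and the noise model across epochs improve performance, which converges after a few rounds.
It achieves strong results on seven benchmark datasets across several evaluation metrics (MAE, F-measure, PR).
Qualitative results show robust recovery of salient objects in challenging scenes (low contrast, complex backgrounds).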
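For readers unfamiliar with the metrics named above, MAE and F-measure can be computed as in the following sketch. It assumes a flat list of predicted saliency values in [0, 1] and a binary ground-truth mask; the fixed `threshold` of 0.5 and `beta_sq` of 0.3 (a value commonly used in saliency evaluation) are assumptions of this example and are not stated in the review.

```python
def mean_absolute_error(pred, gt):
    """Average absolute per-pixel difference between prediction and ground truth."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """Weighted F-measure of the thresholded prediction against a binary mask."""
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(1 for b, g in zip(binary, gt) if b == 1 and g == 1)
    if true_pos == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = true_pos / sum(binary)
    recall = true_pos / sum(gt)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy example: four pixels, two of which are salient in the ground truth.
pred = [0.9, 0.8, 0.2, 0.1]
gt = [1, 0, 1, 0]
mae = mean_absolute_error(pred, gt)
fm = f_measure(pred, gt)
```

Sweeping `threshold` over a range of values and recording precision and recall at each step would give the points of a PR curve.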


This review was generated by AI and reviewed by a human editor.