QUICK REVIEW

[Paper Review] Fast Camouflaged Object Detection via Edge-based Reversible Re-calibration Network

Ge-Peng Ji, Lei Zhu|arXiv (Cornell University)|Nov 5, 2021
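Visual Attention and Saliency Detection | References: 98 | Cited by: 23
One-Sentence Summary

This paper proposes ERRNet, a novel edge-based reversible re-calibration network for fast and accurate camouflaged object detection (COD). By combining Selective Edge Aggregation (SEA) with a Reversible Re-calibration Unit (RRU), ERRNet models the biological visual perception process and uses edge and global priors to improve boundary detection. On the COD10K dataset, ERRNet reaches a mean E-measure of 86.7% at an inference speed of 79.3 FPS, a state-of-the-art result that improves on prior methods such as SINet by about 6% in E-measure.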

ABSTRACT

Camouflaged Object Detection (COD) aims to detect objects with similar patterns (e.g., texture, intensity, colour, etc.) to their surroundings, and recently has attracted growing research interest. As camouflaged objects often present very ambiguous boundaries, how to determine object locations as well as their weak boundaries is challenging and also the key to this task. Inspired by the biological visual perception process when a human observer discovers camouflaged objects, this paper proposes a novel edge-based reversible re-calibration network called ERRNet. Our model is characterized by two innovative designs, namely Selective Edge Aggregation (SEA) and Reversible Re-calibration Unit (RRU), which aim to model the visual perception behaviour and achieve effective edge prior and cross-comparison between potential camouflaged regions and background. More importantly, RRU incorporates diverse priors with more comprehensive information compared with existing COD models. Experimental results show that ERRNet outperforms existing cutting-edge baselines on three COD datasets and five medical image segmentation datasets. Especially, compared with the existing top-1 model SINet, ERRNet significantly improves the performance by ~6% (mean E-measure) with notably high speed (79.3 FPS), showing that ERRNet could be a general and robust solution for the COD task.

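Research Motivation and Objectives

  • Address the detection difficulty caused by camouflaged objects whose boundaries are ambiguous and whose texture closely resembles the background.
  • Integrate global and edge priors through a biologically inspired mechanism that mimics human visual perception, in order to improve detection performance.
  • Improve detection accuracy and inference speed by explicitly modelling the cross-comparison between potential objects and their surroundings.
  • Build a general and robust framework that applies to COD in natural scenes and also extends to medical image segmentation.
  • Overcome the limitations of existing COD models in exploiting edge cues and contextual contrast.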

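Proposed Method

  • Introduce a Selective Edge Aggregation (SEA) module that strengthens edge feature learning and prevents edge information in low-level features from degrading.
  • Design a Reversible Re-calibration Unit (RRU) that re-calibrates feature maps with multiple priors (neighbour, global, edge, semantic) in a reversible, parameter-efficient way.
  • Apply the NEGS priors (neighbour, global, edge, semantic) to both low-level and high-level feature maps to guide detection and improve boundary localization.
  • Use a dual-path architecture in which RRU modules perform edge-aware cross-comparison to refine the global prior proposal.
  • Train end to end with standard segmentation losses, jointly optimizing feature learning and re-calibration.
  • Adapt the model to medical image segmentation by fine-tuning on polyp and lung infection datasets, with consistent performance gains.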
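To make the idea of re-calibrating a feature map with several priors more concrete, here is a minimal pure-Python sketch. It is not the paper's RRU: this summary gives no formula, so the sketch assumes a reverse-attention-style weighting in which each prior map is turned into `1 - sigmoid(prior)` and the feature is scaled by the mean of those weights. The names `reverse_weight` and `recalibrate`, the averaging step, and the toy inputs are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reverse_weight(prior_logits):
    """Assumed reverse-attention weight: 1 - sigmoid(prior).

    The weight is large where the prior is unsure or predicts background,
    and small where the prior is already confident about foreground.
    """
    return [[1.0 - sigmoid(v) for v in row] for row in prior_logits]


def recalibrate(feature, priors):
    """Scale one feature channel by the mean reversed weight of all priors.

    feature: 2D list of floats (a single hypothetical feature channel).
    priors:  dict mapping a prior name to a 2D list of logits, same shape.
    """
    weights = [reverse_weight(p) for p in priors.values()]
    out = []
    for i, row in enumerate(feature):
        out_row = []
        for j, value in enumerate(row):
            mean_w = sum(w[i][j] for w in weights) / len(weights)
            out_row.append(value * mean_w)
        out.append(out_row)
    return out


if __name__ == "__main__":
    # Toy inputs: a constant feature channel and two illustrative priors.
    feature = [[2.0, 2.0], [2.0, 2.0]]
    priors = {
        "global": [[0.0, 8.0], [-8.0, 0.0]],
        "edge": [[0.0, 8.0], [-8.0, 0.0]],
    }
    for row in recalibrate(feature, priors):
        print([round(v, 3) for v in row])
```

In this toy setup, positions where the priors are already confident about foreground are suppressed, while uncertain or background positions keep more of their response, which is one way to push later layers towards regions that still need refinement. A real implementation would operate on multi-channel tensors and learn how to combine the priors.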
Figure 2 : The overall pipeline of the proposed ERRNet that contains three main cooperative components, including Atrous Spatial Pyramid Pooling (ASPP) for initiating global prior, Selective Edge Aggregation (SEA) for generating edge prior, and Reversible Re-calibration Unit (RRU) for modulating and

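Experimental Results

Research Questions

  • RQ1: Can a biologically inspired visual perception mechanism that emphasizes edge and global context effectively improve camouflaged object detection?
  • RQ2: How can edge priors be aggregated and preserved in deep networks to strengthen weak-boundary detection?
  • RQ3: Can a reversible re-calibration unit that integrates multiple priors outperform standard attention or re-calibration modules on COD?
  • RQ4: Does the proposed framework generalize across diverse domains, including natural-scene COD and medical image segmentation?
  • RQ5: By how much does the model exceed existing SOTA methods in inference speed and accuracy on COD and medical segmentation benchmarks?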

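Key Findings

  • ERRNet reaches a mean E-measure of 0.867 on COD10K, about 6 percentage points above the previous SOTA method SINet.
  • The model runs at 79.3 FPS, fast enough for real-time processing in practical deployment.
  • On medical image segmentation, ERRNet outperforms all six SOTA baselines on both the polyp and lung infection segmentation datasets, with a 12% gain in sensitivity (Sen.) over Inf-Net on the COVID-19 dataset.
  • Ablation studies confirm that both SEA and RRU are indispensable, with RRU's multi-prior re-calibration contributing a significant gain.
  • ERRNet ranks first on all three COD datasets and consistently outperforms existing models on five medical image segmentation benchmarks.
  • Even with a standard ResNet-50 backbone, ERRNet outperforms Inf-Net with the stronger Res2Net-50 backbone, particularly on sensitivity and E-measure.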
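The findings above are reported as E-measure and sensitivity, so a small sketch of how such scores can be computed on binary masks may help. This summary does not define either metric. The code assumes the commonly used definitions: sensitivity as TP / (TP + FN), and the enhanced-alignment measure computed from mean-centred prediction and ground-truth maps. The handling of degenerate masks and the averaging over binarization thresholds used for a "mean" E-measure are left out, and the function names are hypothetical.

```python
def sensitivity(pred, gt):
    """Sensitivity (recall) = TP / (TP + FN) for binary 0/1 masks."""
    tp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g == 1 and p == 1:
                tp += 1
            elif g == 1 and p == 0:
                fn += 1
    # Convention assumed here: return 0.0 when the mask has no positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


def e_measure(pred, gt, eps=1e-12):
    """Enhanced-alignment score for one binary prediction (simplified).

    Both maps are centred by their global means, an alignment value in
    [-1, 1] is computed per pixel, mapped to [0, 1] by (1 + xi)^2 / 4,
    and averaged over all pixels.
    """
    flat_p = [float(v) for row in pred for v in row]
    flat_g = [float(v) for row in gt for v in row]
    n = len(flat_g)
    mu_p = sum(flat_p) / n
    mu_g = sum(flat_g) / n
    total = 0.0
    for p, g in zip(flat_p, flat_g):
        dp, dg = p - mu_p, g - mu_g
        xi = 2.0 * dp * dg / (dp * dp + dg * dg + eps)
        total += (1.0 + xi) ** 2 / 4.0
    return total / n


if __name__ == "__main__":
    gt = [[0, 1], [0, 0]]
    pred = [[0, 1], [1, 0]]
    print("sensitivity:", sensitivity(pred, gt))
    print("E-measure:", round(e_measure(pred, gt), 4))
```

A prediction identical to the ground truth scores close to 1 on this E-measure and a fully inverted one scores close to 0, which is why the measure rewards both pixel-level matches and agreement with the image-level foreground ratio.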
Figure 3 : Visualization of each component in the NEGS priors, i.e. , edge prior in (c), global prior in (d), and neighbour prior in (e) & (f). Specifically, the re-calibration stage treats the intermediate outputs of the network as the prior cues to enhance the reliability and stability of the lear


This review was generated by AI and reviewed by a human editor.