QUICK REVIEW

[Paper Review] Rule-Based Spatial Mixture-of-Experts U-Net for Explainable Edge Detection

Bharadwaj Dogga, Kaaustaaub Shankar|arXiv (Cornell University)|Feb 4, 2026
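Explainable Artificial Intelligence (XAI) | Cited by 0
One-Sentence Summary

The paper proposes sMoE U-Net, a hybrid explainable edge detector that combines Spatially-Adaptive MoE blocks with a Takagi-Sugeno-Kang fuzzy head to achieve competitive accuracy together with pixel-level explainability.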

ABSTRACT

Deep learning models like U-Net and its variants have established state-of-the-art performance in edge detection tasks and are used by Generative AI services worldwide for their image generation models. However, their decision-making processes remain opaque, operating as "black boxes" that obscure the rationale behind specific boundary predictions. This lack of transparency is a critical barrier in safety-critical applications where verification is mandatory. To bridge the gap between high-performance deep learning and interpretable logic, we propose the Rule-Based Spatial Mixture-of-Experts U-Net (sMoE U-Net). Our architecture introduces two key innovations: (1) Spatially-Adaptive Mixture-of-Experts (sMoE) blocks integrated into the decoder skip connections, which dynamically gate between "Context" (smooth) and "Boundary" (sharp) experts based on local feature statistics; and (2) a Takagi-Sugeno-Kang (TSK) Fuzzy Head that replaces the standard classification layer. This fuzzy head fuses deep semantic features with heuristic edge signals using explicit IF-THEN rules. We evaluate our method on the BSDS500 benchmark, achieving an Optimal Dataset Scale (ODS) F-score of 0.7628, effectively matching purely deep baselines like HED (0.7688) while outperforming the standard U-Net (0.7437). Crucially, our model provides pixel-level explainability through "Rule Firing Maps" and "Strategy Maps," allowing users to visualize whether an edge was detected due to strong gradients, high semantic confidence, or specific logical rule combinations.

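Research Motivation and Objectives

  • Bridge accuracy and interpretability in edge detection by integrating interpretable fuzzy logic with a high-performance CNN backbone.
  • Preserve edge sharpness while suppressing texture noise through spatially adaptive gating between context and boundary processing.
  • Provide pixel-level explanations through Rule Firing Maps and Strategy Maps that visualize the rationale behind each decision.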

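Proposed Method

  • Introduce Spatially-Adaptive Mixture-of-Experts (sMoE) blocks at the skip connections of the U-Net decoder; a gating network driven by a Sobel edge map gates between a smooth (Context) expert and a sharp (Boundary) expert.
  • Replace the standard classifier with a First-Order Takagi-Sugeno-Kang (TSK) fuzzy head that fuses edge strength and semantic confidence through 4 learnable fuzzy rules.
  • Use a differentiable, Gaussian-based rule-firing mechanism, with the final edge map computed as the weighted average of the rule consequents.
  • Train with a combined loss of Binary Cross Entropy and Dice loss; distill the fuzzy head with a mean squared error (MSE) term so that it mimics the main logits.
  • Provide explainability visualizations, Strategy Maps and Rule Firing Maps, that show the per-pixel decision process.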
Figure 1: Compact architecture of the proposed explainable sMoE U-Net with Sobel pre-processing and a TSK fuzzy head.
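To make the TSK fuzzy head more concrete, here is a hedged NumPy sketch of a first-order TSK inference step over two per-pixel inputs, edge strength and semantic confidence. The parameter layout (`centers`, `sigmas`, `coeffs`, `biases`), the product of Gaussian memberships as firing strength, and the example rule centers are assumptions chosen for illustration; the review only states that there are 4 learnable rules, Gaussian-based firing, and a weighted average of rule consequents.

```python
import numpy as np

def tsk_head(edge, sem, centers, sigmas, coeffs, biases, eps=1e-8):
    """First-order TSK inference per pixel.

    edge, sem: (H, W) inputs; centers, sigmas, coeffs: (R, 2); biases: (R,).
    Returns the fused map (H, W) and normalized rule firings (H, W, R).
    """
    x = np.stack([edge, sem], axis=-1)              # (H, W, 2)
    diff = x[..., None, :] - centers                # (H, W, R, 2)
    memb = np.exp(-0.5 * (diff / sigmas) ** 2)      # Gaussian memberships
    firing = memb.prod(axis=-1)                     # rule firing strength
    norm = firing / (firing.sum(axis=-1, keepdims=True) + eps)
    conseq = x @ coeffs.T + biases                  # linear consequents (H, W, R)
    return (norm * conseq).sum(axis=-1), norm

# Hypothetical rule base: one rule per low/high combination of the two inputs.
centers = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
sigmas = np.full((4, 2), 0.3)
coeffs = np.zeros((4, 2))                 # zero slopes keep the example easy to read
biases = np.array([0.0, 0.5, 0.5, 1.0])

edge = np.array([[0.0, 1.0]])
sem = np.array([[0.0, 1.0]])
out, fire = tsk_head(edge, sem, centers, sigmas, coeffs, biases)
print(out)
print(fire.argmax(axis=-1))  # index of the dominant rule at each pixel
```

Under this reading, the normalized firing array plays the role of a per-pixel rule firing map: each channel shows how strongly one IF-THEN rule contributed at each location.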

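Experimental Results

Research Questions

  • RQ1: Can the hybrid sMoE U-Net maintain or exceed state-of-the-art edge detection performance while providing explainability?
  • RQ2: How do spatially adaptive gating and the fuzzy head contribute to boundary delineation and to reducing false positives?
  • RQ3: Which forms of visual explanation (Strategy Maps, Rule Firing Maps) reveal the model's decision logic for edges?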

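Key Findings

  • sMoE U-Net achieves an ODS F-score of 0.7628 on BSDS500, close to HED (0.7688) and notably higher than the standard U-Net (0.7437).
  • The model also reaches an OIS of 0.7458 and an AP of 0.7222, surpassing both U-Net and HED on AP.
  • The precision-recall curves show that sMoE gating yields higher precision at a fixed recall.
  • Qualitative analysis shows that the Strategy Maps highlight Boundary-expert activation along edges and explicitly show Context-expert activation in homogeneous regions.
  • The Rule Firing Maps reveal distinct IF-THEN rules governing strong edges, moderately strong edges, and noisy edges, which improves interpretability.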
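For readers unfamiliar with the ODS and OIS metrics quoted above, the sketch below shows how the two F-scores are commonly derived from per-image, per-threshold match counts. It is a simplified illustration under stated assumptions: the counts are taken as given (the BSDS500 benchmark obtains them through tolerance-based matching of edge pixels, which is not reproduced here), the same true-positive count is used for precision and recall, and the toy numbers are invented for the example rather than taken from the paper.

```python
import numpy as np

def f_score(tp, n_pred, n_gt, eps=1e-12):
    # Harmonic mean of precision and recall from raw counts.
    p = tp / np.maximum(n_pred, eps)
    r = tp / np.maximum(n_gt, eps)
    return 2 * p * r / np.maximum(p + r, eps)

def ods_ois(tp, n_pred, n_gt):
    """tp, n_pred: (n_images, n_thresholds) counts; n_gt: (n_images,) counts."""
    # ODS: one threshold shared by the whole dataset, chosen to maximize F.
    ods = f_score(tp.sum(axis=0), n_pred.sum(axis=0), n_gt.sum()).max()
    # OIS: each image picks its own best threshold, then counts are pooled.
    best = f_score(tp, n_pred, n_gt[:, None]).argmax(axis=1)
    rows = np.arange(tp.shape[0])
    ois = f_score(tp[rows, best].sum(), n_pred[rows, best].sum(), n_gt.sum())
    return ods, ois

# Toy counts for 2 images and 2 thresholds (illustrative only).
tp = np.array([[8.0, 5.0], [3.0, 6.0]])
n_pred = np.array([[10.0, 5.0], [10.0, 6.0]])
n_gt = np.array([10.0, 10.0])
ods, ois = ods_ois(tp, n_pred, n_gt)
print(ods, ois)
```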
Figure 2: Architecture of Spatially-Adaptive Mixture-of-Experts with Sobel Edge signal
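The gating idea behind the sMoE block can be illustrated with a small NumPy sketch. It is not the paper's architecture: the real experts and gating network are presumably learned, whereas here the Context expert is a fixed box blur, the Boundary expert is a fixed sharpening kernel, and the gate is a sigmoid of the normalized Sobel magnitude with hypothetical `gain` and `bias` parameters. The sketch only shows how a Sobel-driven gate can blend a smooth and a sharp branch per pixel.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
BOX = np.full((3, 3), 1.0 / 9.0)                     # stand-in Context expert
SHARPEN = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)  # stand-in Boundary expert

def conv3x3(img, kernel):
    """Same-size 3x3 cross-correlation with edge padding."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def smoe_block(feat, image, gain=10.0, bias=0.5):
    # Sobel gradient magnitude, normalized to [0, 1], drives the gate.
    mag = np.hypot(conv3x3(image, SOBEL_X), conv3x3(image, SOBEL_Y))
    mag = mag / (mag.max() + 1e-8)
    gate = 1.0 / (1.0 + np.exp(-gain * (mag - bias)))   # near 1 on edges
    context = conv3x3(feat, BOX)
    boundary = conv3x3(feat, SHARPEN)
    return gate * boundary + (1.0 - gate) * context, gate

# Vertical step edge: left half 0, right half 1.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
out, gate = smoe_block(image, image)
print(np.round(gate[4], 2))  # gate values along one row
```

In this toy setting the gate array acts like a strategy map: values near 1 mark pixels routed to the Boundary branch, and values near 0 mark pixels handled by the Context branch.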

This review was generated by AI and reviewed by human editors.