QUICK REVIEW

[Paper Review] Rule-Based Spatial Mixture-of-Experts U-Net for Explainable Edge Detection

Bharadwaj Dogga, Kaaustaaub Shankar | arXiv (Cornell University) | 2026-02-04

Explainable Artificial Intelligence (XAI) | Citations: 0

One-Line Summary

The paper introduces sMoE U-Net, a hybrid explainable edge detector that combines Spatially-Adaptive MoE blocks with a Takagi-Sugeno-Kang fuzzy head to achieve competitive accuracy together with pixel-level interpretability.

ABSTRACT

Deep learning models like U-Net and its variants, have established state-of-the-art performance in edge detection tasks and are used by Generative AI services world-wide for their image generation models. However, their decision-making processes remain opaque, operating as "black boxes" that obscure the rationale behind specific boundary predictions. This lack of transparency is a critical barrier in safety-critical applications where verification is mandatory. To bridge the gap between high-performance deep learning and interpretable logic, we propose the Rule-Based Spatial Mixture-of-Experts U-Net (sMoE U-Net). Our architecture introduces two key innovations: (1) Spatially-Adaptive Mixture-of-Experts (sMoE) blocks integrated into the decoder skip connections, which dynamically gate between "Context" (smooth) and "Boundary" (sharp) experts based on local feature statistics; and (2) a Takagi-Sugeno-Kang (TSK) Fuzzy Head that replaces the standard classification layer. This fuzzy head fuses deep semantic features with heuristic edge signals using explicit IF-THEN rules. We evaluate our method on the BSDS500 benchmark, achieving an Optimal Dataset Scale (ODS) F-score of 0.7628, effectively matching purely deep baselines like HED (0.7688) while outperforming the standard U-Net (0.7437). Crucially, our model provides pixel-level explainability through "Rule Firing Maps" and "Strategy Maps," allowing users to visualize whether an edge was detected due to strong gradients, high semantic confidence, or specific logical rule combinations.

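Motivation and Objectives

Integrate explainable fuzzy logic with a high-performance CNN backbone to bridge accuracy and interpretability in edge detection.
Suppress texture noise while preserving edge sharpness through spatially adaptive gating between context and boundary processing.
Provide Rule Firing Maps and Strategy Maps to visualize pixel-level explanations.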

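Proposed Method

Introduces Spatially-Adaptive Mixture-of-Experts (sMoE) blocks into the U-Net decoder skip connections; a gating network driven by the Sobel edge map gates between the Smooth (context) and Sharp (boundary) experts.
Replaces the standard classifier with a First-Order Takagi-Sugeno-Kang (TSK) fuzzy head that fuses Edge Strength and Semantic Confidence through four learnable fuzzy rules.
Computes the final edge map as the firing-strength-weighted average of the rule consequents, using a differentiable, Gaussian-based rule firing mechanism.
Trains with a composite loss combining Binary Cross Entropy and Dice loss; the fuzzy head is distilled via MSE to mimic the main logits.
Provides explainability visualizations, namely Strategy Maps and Rule Firing Maps, that show the decision at every pixel.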
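The training objective in the list above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the equal weighting of BCE and Dice, the hypothetical `distill_weight` parameter, and the choice to compare the fuzzy head's output against the sigmoid of the main logits (rather than the raw logits) are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(prob, target, eps=1e-7):
    """Mean binary cross entropy between probabilities and a 0/1 target."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(prob)
                          + (1.0 - target) * np.log(1.0 - prob)))

def dice_loss(prob, target, smooth=1.0):
    """Soft Dice loss; `smooth` avoids division by zero on empty maps."""
    inter = np.sum(prob * target)
    return float(1.0 - (2.0 * inter + smooth)
                 / (np.sum(prob) + np.sum(target) + smooth))

def composite_loss(main_logits, fuzzy_prob, target, distill_weight=1.0):
    """BCE + Dice on the main branch, plus an MSE distillation term.

    Assumption: the distillation term pulls the fuzzy head's edge
    probability toward the main branch's probability.
    """
    main_prob = sigmoid(main_logits)
    segmentation = bce_loss(main_prob, target) + dice_loss(main_prob, target)
    distillation = float(np.mean((fuzzy_prob - main_prob) ** 2))
    return segmentation + distill_weight * distillation
```

With a perfect prediction, all three terms approach zero; with an inverted prediction, each term grows, so the sum penalizes both pixel-wise errors (BCE), poor overlap (Dice), and disagreement between the two heads (MSE).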

Figure 1: Compact architecture of the proposed explainable sMoE U-Net with Sobel pre-processing and a TSK fuzzy head.
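To make the TSK fuzzy head more concrete, the following is a minimal NumPy sketch of a first-order TSK inference step with four rules over two inputs (Edge Strength and Semantic Confidence). The rule layout (a Low/High grid with centres at 0 and 1), the shared width `SIGMA`, and the consequent coefficients in `CONSEQ` are hypothetical values chosen for illustration; in the paper these parameters are learned, and the review does not list them.

```python
import numpy as np

# Hypothetical Gaussian centres per rule: (edge strength, semantic confidence).
CENTERS = np.array([[0.0, 0.0],   # R0: edge Low,  semantic Low
                    [0.0, 1.0],   # R1: edge Low,  semantic High
                    [1.0, 0.0],   # R2: edge High, semantic Low
                    [1.0, 1.0]])  # R3: edge High, semantic High
SIGMA = 0.5  # hypothetical shared membership width

# First-order consequents y_r = a_r * edge + b_r * sem + c_r (hypothetical).
CONSEQ = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.5, 0.0, 0.0],
                   [0.5, 0.5, 0.0]])

def tsk_head(edge, sem):
    """Return (edge_map, normalized_firing) for two (H, W) input maps."""
    x = np.stack([edge, sem], axis=-1)                 # (H, W, 2)
    diff = x[..., None, :] - CENTERS                   # (H, W, 4, 2)
    # Product of two Gaussian memberships = exp of the summed exponents.
    firing = np.exp(-0.5 * np.sum((diff / SIGMA) ** 2, axis=-1))
    norm = firing / (np.sum(firing, axis=-1, keepdims=True) + 1e-8)
    conseq = x @ CONSEQ[:, :2].T + CONSEQ[:, 2]        # (H, W, 4)
    edge_map = np.sum(norm * conseq, axis=-1)          # weighted average
    return edge_map, norm

def rule_firing_map(norm):
    """Index of the dominant rule at each pixel."""
    return np.argmax(norm, axis=-1)
```

Because every step is a smooth function of the inputs and parameters, the same computation can be trained by gradient descent in a deep learning framework. The per-pixel dominant rule returned by `rule_firing_map` is one plausible way to render a Rule Firing Map.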

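Experimental Results

Research Questions

RQ1: Can the hybrid sMoE U-Net maintain or exceed state-of-the-art edge detection performance while enabling explainability?
RQ2: How do spatially adaptive gating and the fuzzy head contribute to edge discrimination and false-positive reduction?
RQ3: In what form do visual explanations such as Strategy Maps and Rule Firing Maps reveal the model's decision logic for edges?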

Key Results

Method	OIS	ODS	AP
U-Net	0.7260	0.7437	0.6946
sMoE U-Net (Ours)	0.7458	0.7628	0.7222
Canny	0.4836	0.5450	0.4125
Sobel	0.5303	0.5769	0.4743
HED	0.7514	0.7688	0.7126

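On BSDS500, sMoE U-Net reaches an ODS F-score of 0.7628, close to HED (0.7688) and above the standard U-Net (0.7437).
The model also achieves an OIS of 0.7458 and an AP of 0.7222, surpassing both U-Net and HED on AP.
The precision-recall curves show that sMoE gating yields higher precision at a fixed recall.
Qualitative analysis shows that the Strategy Maps highlight Boundary Expert activation along boundaries and Context Expert activation in homogeneous regions.
Rule Firing Maps reveal which explicit IF-THEN rules govern strong edges versus weak or noisy edges, which makes the decisions interpretable.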

Figure 2: Architecture of the Spatially-Adaptive Mixture-of-Experts block with the Sobel edge signal.
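The gating idea behind the sMoE block can be sketched in NumPy as follows. This is an illustrative stand-in, not the paper's architecture: the real experts and gating network are learned, whereas here a fixed box blur plays the Context (smooth) expert, a fixed sharpening kernel plays the Boundary (sharp) expert, and the gate is a plain sigmoid of the normalized Sobel magnitude with hypothetical `threshold` and `steepness` parameters.

```python
import numpy as np

SOBEL_X = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T
BOX_BLUR = np.full((3, 3), 1.0 / 9.0)                  # stand-in Context expert
SHARPEN = np.array([[0.0, -1.0, 0.0],
                    [-1.0, 5.0, -1.0],
                    [0.0, -1.0, 0.0]])                 # stand-in Boundary expert

def filter3x3(img, kernel):
    """3x3 cross-correlation with edge-replicated padding."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_edge_map(img):
    """Sobel gradient magnitude scaled to roughly [0, 1]."""
    gx = filter3x3(img, SOBEL_X)
    gy = filter3x3(img, SOBEL_Y)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    return mag / (mag.max() + 1e-8)

def smoe_block(feature, edge_map, threshold=0.5, steepness=10.0):
    """Blend the two experts per pixel using an edge-driven soft gate."""
    gate = 1.0 / (1.0 + np.exp(-steepness * (edge_map - threshold)))
    context = filter3x3(feature, BOX_BLUR)
    boundary = filter3x3(feature, SHARPEN)
    mixed = gate * boundary + (1.0 - gate) * context
    strategy_map = gate > 0.5   # True where the Boundary expert dominates
    return mixed, strategy_map
```

On an image with a single vertical step, the gate opens only on the two columns adjacent to the step, so the boolean `strategy_map` marks the boundary while flat regions are routed to the smoothing expert. This mirrors the qualitative behaviour the review attributes to the Strategy Maps.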


This review was generated by AI and reviewed by a human editor.