QUICK REVIEW

[Paper Digest] W-DUALMINE: Reliability-Weighted Dual-Expert Fusion With Residual Correlation Preservation for Medical Image Fusion

Md. Jahidul Islam|arXiv (Cornell University)|Jan 13, 2026
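Advanced Image Fusion Techniques | Cited by 0
One-Sentence Summary

W-DUALMINE combines reliability-weighted dual experts with residual-to-average fusion to preserve global statistics (MI/CC) while enhancing local detail on CT–MRI, PET–MRI, and SPECT–MRI datasets, and it outperforms AdaFuse and ASFE-Fusion.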

ABSTRACT

Medical image fusion integrates complementary information from multiple imaging modalities to improve clinical interpretation. However, existing deep learning-based methods, including recent spatial-frequency frameworks such as AdaFuse and ASFE-Fusion, often suffer from a fundamental trade-off between global statistical similarity, measured by correlation coefficient (CC) and mutual information (MI), and local structural fidelity. This paper proposes W-DUALMINE, a reliability-weighted dual-expert fusion framework designed to explicitly resolve this trade-off through architectural constraints and a theoretically grounded loss design. The proposed method introduces dense reliability maps for adaptive modality weighting, a dual-expert fusion strategy combining a global-context spatial expert and a wavelet-domain frequency expert, and a soft gradient-based arbitration mechanism. Furthermore, we employ a residual-to-average fusion paradigm that guarantees the preservation of global correlation while enhancing local details. Extensive experiments on CT-MRI, PET-MRI, and SPECT-MRI datasets demonstrate that W-DUALMINE consistently outperforms AdaFuse and ASFE-Fusion in CC and MI metrics while

Motivation and Objectives

  • Address the trade-off between global statistical similarity and local structural fidelity in medical image fusion.
  • Introduce dense reliability maps to suppress artifacts from unreliable regions before fusion.
  • Develop a dual-expert fusion architecture (spatial and wavelet frequency) with a soft gradient arbitration mechanism.
  • Adopt a residual-to-average fusion paradigm to theoretically preserve high CC and MI with source inputs.
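The residual-to-average idea in the last objective can be illustrated with a minimal NumPy sketch. This is not the paper's decoder: the function name `residual_to_average`, the `tanh` bounding, and the `scale` parameter are assumptions made for illustration. The sketch only shows why anchoring the output to the pixel-wise mean keeps it correlated with both sources as long as the residual stays small.

```python
import numpy as np

def residual_to_average(src_a, src_b, residual, scale=0.1):
    """Fuse two images as their pixel-wise mean plus a bounded residual.

    `scale` and the tanh bounding are illustrative assumptions; the digest
    does not say how the paper constrains the residual.
    """
    base = 0.5 * (src_a + src_b)            # the average carries the global statistics
    return base + scale * np.tanh(residual)  # the residual adds local detail only

def corrcoef(x, y):
    """Pearson correlation coefficient between two images."""
    x = x.ravel() - x.mean()
    y = y.ravel() - y.mean()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))

rng = np.random.default_rng(0)
ct = rng.random((64, 64))    # stand-in for a CT slice
mri = rng.random((64, 64))   # stand-in for an MRI slice

fused_zero = residual_to_average(ct, mri, np.zeros((64, 64)))
fused_small = residual_to_average(ct, mri, rng.standard_normal((64, 64)))

print("CC(fused, CT), zero residual:", corrcoef(fused_zero, ct))
print("CC(fused, CT), small residual:", corrcoef(fused_small, ct))
```

With a zero residual the output is exactly the mean of the inputs, so its correlation with each source is fixed by the inputs themselves; a bounded residual can only perturb that baseline by a limited amount.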

Proposed Method

  • Siamese multi-scale encoders extract hierarchical features from each modality.
  • Dense reliability maps predict pixel-wise reliability scores to weight feature fusion adaptively.
  • Dual-expert fusion at each scale: a Global Context Spatial Expert and a Wavelet Frequency Expert.
  • Soft Gradient Mixer dynamically arbitrates between spatial and wavelet outputs based on edge strength.
  • Residual-to-Average Decoder reconstructs a fused image by adding a residual to the mean of inputs, ensuring global statistical preservation.
  • Compound loss with five terms (L_avg, L_grad, L_cc, L_mi, L_rec) balances content fidelity, edge preservation, correlation, information, and reconstruction.
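Two of the steps listed above, reliability weighting and soft gradient arbitration, can be sketched in a few lines of NumPy. Everything here is a hypothetical stand-in rather than the paper's implementation: the function names, the per-pixel softmax over two reliability logits, the `sharpness` parameter, and the choice to favor the wavelet branch where gradients are strong are all assumptions (the digest only states that the mixer depends on edge strength).

```python
import numpy as np

def reliability_weighted_fusion(feat_a, feat_b, logit_a, logit_b):
    """Blend two feature maps with per-pixel weights from a two-way softmax.

    The logits stand in for the output of a reliability estimator (assumed).
    """
    m = np.maximum(logit_a, logit_b)             # subtract the max for numerical stability
    ea, eb = np.exp(logit_a - m), np.exp(logit_b - m)
    w_a = ea / (ea + eb)                         # weight of modality A, in [0, 1]
    return w_a * feat_a + (1.0 - w_a) * feat_b

def gradient_magnitude(img):
    """Edge strength from finite differences along both image axes."""
    gy, gx = np.gradient(img)
    return np.sqrt(gx ** 2 + gy ** 2)

def soft_gradient_mix(spatial_out, wavelet_out, guide, sharpness=10.0):
    """Soft arbitration: a sigmoid of edge strength blends the two experts.

    Assumption: strong edges lean on the wavelet expert, flat regions on the
    spatial expert. `sharpness` is a hypothetical hyperparameter.
    """
    g = gradient_magnitude(guide)
    alpha = 1.0 / (1.0 + np.exp(-sharpness * (g - g.mean())))
    return alpha * wavelet_out + (1.0 - alpha) * spatial_out, alpha

# Toy usage: a vertical step edge as the guide image.
guide = np.zeros((8, 8))
guide[:, 4:] = 1.0
mixed, alpha = soft_gradient_mix(np.zeros((8, 8)), np.ones((8, 8)), guide)
print(alpha[0].round(2))  # weights rise near the edge columns
```

The blend is soft rather than a hard switch, so both experts contribute everywhere and the mixing weight stays differentiable, which is what allows such a module to be trained end to end.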
Figure 1: Architecture of the Reliability-Weighted Dual-Expert Fusion Network. The framework processes multi-modal inputs (e.g., CT/PET and MRI) through a Siamese encoder composed of ResBlocks. Feature maps are weighted by a Reliability Estimation module before entering two parallel expert branches:

Experimental Results

Research Questions

  • RQ1: Can reliability-weighted feature modeling suppress artifacts from unreliable regions and improve fusion quality?
  • RQ2: Do dual-expert (spatial and wavelet) fusion pathways with a soft gradient arbitration preserve global statistics while enhancing local details?
  • RQ3: Does the residual-to-average fusion scheme guarantee high Mutual Information and Correlation Coefficient with the source modalities?

Key Findings

| Method | EN | MI | CC | PSNR | FMI |
| AdaFuse (CT–MRI) | 5.0592 ± 0.2346 | 3.3570 ± 0.1978 | 0.8306 ± 0.0238 | 64.0004 ± 0.7757 | 0.4343 ± 0.0170 |
| ASFE-Fusion (CT–MRI) | 5.4855 ± 0.2734 | 3.1463 ± 0.1605 | 0.8302 ± 0.0238 | 63.9884 ± 0.7845 | 0.4066 ± 0.0180 |
| W-DUALMINE (CT–MRI) | 4.3394 ± 0.2502 | 3.6059 ± 0.2419 | 0.8308 ± 0.0238 | 64.0891 ± 0.7917 | 0.4746 ± 0.0210 |
  • On CT–MRI, W-DUALMINE achieves MI = 3.6059 and CC = 0.8308, the highest values among the three methods, although the CC margin (0.8308 vs. 0.8306 and 0.8302) is well within one standard deviation.
  • W-DUALMINE also yields the highest PSNR (64.0891) and FMI (0.4746) on CT–MRI, indicating strong edge preservation and feature fidelity.
  • On PET–MRI, W-DUALMINE attains MI = 4.3068 and FMI = 0.5064, with CC = 0.8686, demonstrating improved functional information transfer and texture preservation.
  • On SPECT–MRI, W-DUALMINE records MI = 4.0016, CC = 0.9116, and PSNR = 64.9084, highlighting robust performance under resolution disparity and noise.
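For readers who want to see how the two global-similarity metrics reported above are commonly computed, here is a minimal NumPy sketch of CC and a histogram-based MI estimate. It is an assumption-laden illustration, not the paper's evaluation code: the bin count, the base-2 logarithm, and the hypothetical `score_fusion` helper (which reports per-source values rather than any aggregate the paper may use) are choices made here, so absolute numbers will not match the table.

```python
import numpy as np

def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two images."""
    x = x.ravel() - x.mean()
    y = y.ravel() - y.mean()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))

def mutual_information(x, y, bins=32):
    """Histogram estimate of mutual information in bits (bin count assumed)."""
    hist, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = hist / hist.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    nz = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def score_fusion(fused, src_a, src_b):
    """Hypothetical helper: CC and MI of a fused image against each source."""
    return {
        "cc": (correlation_coefficient(fused, src_a), correlation_coefficient(fused, src_b)),
        "mi": (mutual_information(fused, src_a), mutual_information(fused, src_b)),
    }

rng = np.random.default_rng(1)
x = rng.random((64, 64))
y = rng.random((64, 64))
print(score_fusion(0.5 * (x + y), x, y))
```

Histogram MI estimates are biased upward on small images, so comparisons are only meaningful when every method is scored with the same bin count and image size.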
Figure 2: The overall framework of W-DUALMINE. The architecture consists of Siamese encoders extracting multi-scale features, which are fused and projected for contrastive learning. The network is optimized via a composite loss function comprising: (1) Average Content Loss ( $\mathcal{L}_{avg}$ ) fo

This digest was generated by AI and reviewed by human editors.