QUICK REVIEW

[Paper Review] W-DUALMINE: Reliability-Weighted Dual-Expert Fusion With Residual Correlation Preservation for Medical Image Fusion

Md. Jahidul Islam|arXiv (Cornell University)|Jan 13, 2026

Advanced Image Fusion Techniques | Cited by 0

ONE-SENTENCE SUMMARY

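W-DUALMINE uses reliability-weighted dual experts and residual-to-average fusion to preserve global statistics (MI/CC) while enhancing local detail on CT–MRI, PET–MRI, and SPECT–MRI datasets, outperforming AdaFuse and ASFE-Fusion.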

ABSTRACT

Medical image fusion integrates complementary information from multiple imaging modalities to improve clinical interpretation. However, existing deep learning-based methods, including recent spatial-frequency frameworks such as AdaFuse and ASFE-Fusion, often suffer from a fundamental trade-off between global statistical similarity, measured by correlation coefficient (CC) and mutual information (MI), and local structural fidelity. This paper proposes W-DUALMINE, a reliability-weighted dual-expert fusion framework designed to explicitly resolve this trade-off through architectural constraints and a theoretically grounded loss design. The proposed method introduces dense reliability maps for adaptive modality weighting, a dual-expert fusion strategy combining a global-context spatial expert and a wavelet-domain frequency expert, and a soft gradient-based arbitration mechanism. Furthermore, we employ a residual-to-average fusion paradigm that guarantees the preservation of global correlation while enhancing local details. Extensive experiments on CT-MRI, PET-MRI, and SPECT-MRI datasets demonstrate that W-DUALMINE consistently outperforms AdaFuse and ASFE-Fusion in CC and MI metrics while

MOTIVATION AND OBJECTIVES

Address the trade-off between global statistical similarity and local structural fidelity in medical image fusion.
Introduce dense reliability maps to suppress artifacts from unreliable regions before fusion.
Develop a dual-expert fusion architecture (spatial and wavelet frequency) with a soft gradient arbitration mechanism.
Adopt a residual-to-average fusion paradigm to theoretically preserve high CC and MI with source inputs.

PROPOSED METHOD

Siamese multi-scale encoders extract hierarchical features from each modality.
Dense reliability maps predict pixel-wise reliability scores to weight feature fusion adaptively.
Dual-expert fusion at each scale: a Global Context Spatial Expert and a Wavelet Frequency Expert.
Soft Gradient Mixer dynamically arbitrates between spatial and wavelet outputs based on edge strength.
Residual-to-Average Decoder reconstructs a fused image by adding a residual to the mean of inputs, ensuring global statistical preservation.
Compound loss with five terms (L_avg, L_grad, L_cc, L_mi, L_rec) balances content fidelity, edge preservation, correlation, information, and reconstruction.
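
The last two fusion steps above (soft arbitration and residual-to-average decoding) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the sigmoid gate, the `sharpness` and `scale` parameters, and the `tanh` bound on the residual are all hypothetical, and plain arrays stand in for the learned expert outputs.

```python
import numpy as np

def gradient_magnitude(img):
    """Finite-difference gradient magnitude, used here as an edge-strength map."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def soft_mix(spatial_out, wavelet_out, edge_strength, sharpness=10.0):
    """Blend two expert outputs with a sigmoid gate driven by edge strength.

    Assumption: stronger edges favor the frequency expert. The actual
    gating rule is not given in this review.
    """
    centered = edge_strength - edge_strength.mean()
    gate = 1.0 / (1.0 + np.exp(-sharpness * centered))  # values in (0, 1)
    return gate * wavelet_out + (1.0 - gate) * spatial_out

def residual_to_average(src_a, src_b, residual, scale=0.1):
    """Fused image = mean of the sources + a bounded residual.

    Assumption: the residual is squashed with tanh and scaled so the
    fused image can never drift more than `scale` from the average.
    """
    base = 0.5 * (src_a + src_b)
    return np.clip(base + scale * np.tanh(residual), 0.0, 1.0)

# Toy usage with random images in [0, 1] standing in for two modalities.
rng = np.random.default_rng(0)
img_a = rng.random((32, 32))
img_b = rng.random((32, 32))
edges = gradient_magnitude(img_a) + gradient_magnitude(img_b)
mixed = soft_mix(img_a, img_b, edges)            # stand-in for fused expert features
fused = residual_to_average(img_a, img_b, mixed - 0.5 * (img_a + img_b))
```

Bounding the residual is one simple way to make the "global statistics stay close to the average" property hold by construction; a trained network could equally enforce it through the loss terms instead.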

Figure 1: Architecture of the Reliability-Weighted Dual-Expert Fusion Network. The framework processes multi-modal inputs (e.g., CT/PET and MRI) through a Siamese encoder composed of ResBlocks. Feature maps are weighted by a Reliability Estimation module before entering two parallel expert branches:

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can reliability-weighted feature modeling suppress artifacts from unreliable regions and improve fusion quality?
RQ2: Do dual-expert (spatial and wavelet) fusion pathways with a soft gradient arbitration preserve global statistics while enhancing local details?
RQ3: Does the residual-to-average fusion scheme guarantee high Mutual Information and Correlation Coefficient with the source modalities?
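
One hedged way to read RQ3, which is not the paper's own proof (this review does not reproduce it), is to write the fused image as the source mean plus a residual and expand the correlation. The symbols $A$, $B$ (sources), $M$ (their mean), $R$ (residual), and $F$ (fused image) are notation introduced here for illustration.

$$
M = \tfrac{1}{2}(A + B), \qquad F = M + R
$$

$$
\mathrm{CC}(F, A) = \frac{\mathrm{Cov}(M, A) + \mathrm{Cov}(R, A)}{\sigma_F \, \sigma_A},
\qquad |\mathrm{Cov}(R, A)| \le \sigma_R \, \sigma_A,
\qquad |\sigma_F - \sigma_M| \le \sigma_R
$$

When $\sigma_R$ is small relative to $\sigma_M$, both the numerator and $\sigma_F$ stay close to their values for the plain average, so $\mathrm{CC}(F, A)$ stays close to $\mathrm{CC}(M, A)$. The residual can add local detail without moving the global correlation far from that of the mean image. This argument covers CC only; MI has no equally simple bound.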

KEY FINDINGS

Method	EN	MI	CC	PSNR	FMI
AdaFuse (CT–MRI)	5.0592±0.2346	3.3570±0.1978	0.8306±0.0238	64.0004±0.7757	0.4343±0.0170
ASFE-Fusion (CT–MRI)	5.4855±0.2734	3.1463±0.1605	0.8302±0.0238	63.9884±0.7845	0.4066±0.0180
W-DUALMINE (CT–MRI)	4.3394±0.2502	3.6059±0.2419	0.8308±0.0238	64.0891±0.7917	0.4746±0.0210

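On CT–MRI, W-DUALMINE achieves the highest MI (3.6059 vs. 3.3570 for AdaFuse and 3.1463 for ASFE-Fusion). Its CC of 0.8308 is only marginally above the competitors (0.8306 and 0.8302), a gap well within the reported ±0.0238.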
W-DUALMINE yields PSNR = 64.0891 and FMI = 0.4746 on CT–MRI, indicating strong edge preservation and feature fidelity.
On PET–MRI, W-DUALMINE attains MI = 4.3068 and FMI = 0.5064, with CC = 0.8686, demonstrating improved functional information transfer and texture preservation.
On SPECT–MRI, W-DUALMINE records MI = 4.0016, CC = 0.9116, and PSNR = 64.9084, highlighting robust performance under resolution disparity and noise.
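
For readers who want to sanity-check numbers like these, the sketch below computes EN, MI, CC, and PSNR for a fused image against its two sources. It rests on assumptions: images are floats in [0, 1], MI is summed over the two sources, and CC and PSNR are averaged over them. These aggregation conventions are common in the fusion literature, but the paper's exact definitions (including its PSNR scaling) are not given in this review, so absolute values will not match the table. FMI is omitted because it needs a feature extractor that is not described here. All function names are hypothetical.

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (bits) of the intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(x, y, bins=256):
    """Mutual information (bits) from the joint intensity histogram."""
    joint, _, _ = np.histogram2d(
        x.ravel(), y.ravel(), bins=bins, range=[[0.0, 1.0], [0.0, 1.0]]
    )
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def correlation_coefficient(x, y):
    """Pearson correlation between two images."""
    return float(np.corrcoef(x.ravel(), y.ravel())[0, 1])

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = np.mean((ref - test) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak**2 / mse))

def fusion_scores(fused, src_a, src_b):
    """Aggregate metrics over both sources (assumed conventions, see lead-in)."""
    return {
        "EN": entropy(fused),
        "MI": mutual_information(fused, src_a) + mutual_information(fused, src_b),
        "CC": 0.5 * (correlation_coefficient(fused, src_a)
                     + correlation_coefficient(fused, src_b)),
        "PSNR": 0.5 * (psnr(src_a, fused) + psnr(src_b, fused)),
    }

# Toy usage: score the plain average of two random images.
rng = np.random.default_rng(0)
src_a = rng.random((64, 64))
src_b = rng.random((64, 64))
scores = fusion_scores(0.5 * (src_a + src_b), src_a, src_b)
```

Scoring the plain average first gives a useful baseline: a residual-to-average method should keep MI and CC near these values while improving detail-oriented metrics.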

Figure 2: The overall framework of W-DUALMINE. The architecture consists of Siamese encoders extracting multi-scale features, which are fused and projected for contrastive learning. The network is optimized via a composite loss function comprising: (1) Average Content Loss ($\mathcal{L}_{avg}$) fo

This review was generated by AI and reviewed by human editors.