QUICK REVIEW

[Paper Review] UKAN-EP: Enhancing U-KAN with Efficient Attention and Pyramid Aggregation for 3D Multi-Modal MRI Brain Tumor Segmentation

Yanbing Chen, Tianze Tang | arXiv (Cornell University) | Aug 1, 2024

Medical Image Segmentation Techniques | Cited by 6

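ONE-SENTENCE SUMMARY

This paper adapts U-KAN to 3D multi-modal MRI brain tumor segmentation and introduces UKAN-SE (U-KAN with SE attention). Compared against U-Net, Attention U-Net, and Swin UNETR on BraTS 2024, the models show high accuracy, markedly shorter training time, and a parameter count of about 10.6M.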

ABSTRACT

Background: Gliomas are among the most common malignant brain tumors and exhibit substantial heterogeneity, complicating accurate detection and segmentation. Although multi-modal MRI is the clinical standard for glioma imaging, variability across modalities and high computational demands hamper effective automated segmentation. Methods: We propose UKAN-EP, a novel 3D extension of the original 2D U-KAN model for multi-modal MRI brain tumor segmentation. While U-KAN integrates Kolmogorov-Arnold Network (KAN) layers into a U-Net backbone, UKAN-EP further incorporates Efficient Channel Attention (ECA) and Pyramid Feature Aggregation (PFA) modules to enhance inter-modality feature fusion and multi-scale feature representation. We also introduce a dynamic loss weighting strategy that adaptively balances cross-entropy and Dice losses during training. Results: On the 2024 BraTS-GLI dataset, UKAN-EP achieves superior segmentation performance (e.g., Dice = 0.9001 $\pm$ 0.0127 and IoU = 0.8257 $\pm$ 0.0186 for the whole tumor) while requiring substantially fewer computational resources (223.57 GFLOPs and 11.30M parameters) compared to strong baselines including U-Net, Attention U-Net, Swin UNETR, VT-Unet, TransBTS, and 3D U-KAN. An extensive ablation study further confirms the effectiveness of ECA and PFA and shows the limited utility of self-attention and spatial attention alternatives. Conclusion: UKAN-EP demonstrates that combining the expressive power of KAN layers with lightweight channel-wise attention and multi-scale feature aggregation improves the accuracy and efficiency of brain tumor segmentation.
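
The abstract reports Dice and IoU for the whole tumor. As a reminder of how these two overlap metrics are usually defined, here is a minimal sketch in plain Python over flattened binary masks. The function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the paper, whose evaluation code may differ.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two flattened binary masks of equal length.

    Illustrative helper (hypothetical name), not the paper's evaluation code.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as tumor in both the prediction and the ground truth.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_count = sum(1 for p in pred if p)
    truth_count = sum(1 for t in truth if t)
    union = pred_count + truth_count - overlap

    # Assumed convention: two empty masks count as a perfect match.
    if pred_count + truth_count == 0:
        return 1.0, 1.0

    dice = 2.0 * overlap / (pred_count + truth_count)
    iou = overlap / union
    return dice, iou


# Toy example: 3 predicted voxels, 3 true voxels, 2 shared.
dice, iou = dice_and_iou([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

For a single mask pair the two metrics are tied by IoU = Dice / (2 - Dice), so Dice is always at least as large as IoU.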

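RESEARCH MOTIVATION AND OBJECTIVES

Motivate the use of Kolmogorov-Arnold Networks (KAN) within a 3D U-Net framework for multi-modal MRI brain tumor segmentation.
Adapt the 2D U-KAN model to 3D and propose UKAN-SE by adding Squeeze-and-Excitation (SE) modules for global attention.
Evaluate U-KAN and UKAN-SE against U-Net, Attention U-Net, and Swin UNETR on the BraTS 2024 Task 1 dataset.
Highlight efficiency in parameter count and training time while maintaining or improving segmentation performance.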

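PROPOSED METHOD

Use a 3D version of U-KAN consisting of convolutional stages followed by Tokenized KAN (Tok-KAN) stages.
Introduce KAN layers with learnable activation functions to improve pattern modeling.
Introduce UKAN-SE, which adds an SE module after each convolutional block to provide global attention.
Train and evaluate five models (U-Net, Attention U-Net, Swin UNETR, U-KAN, UKAN-SE) on the four MRI modalities of BraTS 2024 Task 1 (T1, T1Gd, T2, FLAIR).
The loss function combines cross-entropy and Soft Dice loss: L_total = (1 - α) L_CE + α (1 - SoftDice), with α = 0.5.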
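
The combined loss above can be sketched in a few lines. The version below is a simplified illustration, not the authors' implementation: it assumes a single binary class over flattened voxel probabilities (the paper's setting is multi-class and 3D), and the function names, the `eps` clamp, and the `smooth` term are all assumptions added for numerical stability.

```python
import math


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened voxels (illustrative)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def soft_dice(probs, targets, smooth=1e-5):
    """Soft Dice coefficient computed on probabilities, not hard labels."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return (2.0 * intersection + smooth) / (denominator + smooth)


def total_loss(probs, targets, alpha=0.5):
    """L_total = (1 - alpha) * L_CE + alpha * (1 - SoftDice)."""
    ce = cross_entropy(probs, targets)
    dice = soft_dice(probs, targets)
    return (1.0 - alpha) * ce + alpha * (1.0 - dice)


# Toy usage: a confident, mostly correct prediction on four voxels.
loss = total_loss([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0])
```

With α = 0.5 the two terms contribute equally; setting α to 0 recovers pure cross-entropy and setting it to 1 recovers pure Soft Dice loss.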

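EXPERIMENTAL RESULTS

Research Questions

RQ1: How do 3D U-KAN and UKAN-SE perform on BraTS 2024 Task 1 compared with established baselines (U-Net, Attention U-Net, Swin UNETR)?
RQ2: What efficiency gains in parameter count and training time do U-KAN and UKAN-SE offer over conventional U-Net-based models?
RQ3: Does adding SE attention (UKAN-SE) yield an appreciable improvement in segmentation accuracy and boundary delineation?
RQ4: How does a medium-sized KAN configuration ([128, 160, 256]) affect 3D multi-modal MRI brain tumor segmentation?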

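Key Findings

UKAN-SE achieves the best lesion-wise and whole-image Dice scores in most tumor subregions, particularly the enhancing tumor (ET) and resection cavity (RC).
UKAN-SE generally outperforms U-KAN, indicating the added value of SE-based global attention.
U-KAN and UKAN-SE deliver strong performance with about 10.6M parameters and train markedly faster than U-Net, Attention U-Net, and Swin UNETR.
U-KAN's training time per epoch (about 803 s) is far shorter than U-Net's (about 3322 s), while its accuracy remains competitive.
UKAN-SE adds only a small increase in parameters and training time over U-KAN while delivering higher segmentation accuracy.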
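
To put the per-epoch timings above in perspective, the short calculation below derives the implied speed-up. It uses only the two approximate figures quoted in the findings (803 s and 3322 s); the variable names are illustrative, and the result inherits the rounding of those figures.

```python
# Approximate per-epoch training times quoted in the findings (seconds).
ukan_epoch_s = 803
unet_epoch_s = 3322

# How many times faster U-KAN completes one epoch than U-Net.
speedup = unet_epoch_s / ukan_epoch_s

# Fraction of U-Net's per-epoch time that U-KAN saves.
reduction = 1 - ukan_epoch_s / unet_epoch_s

print(f"speed-up: {speedup:.2f}x, time saved per epoch: {reduction:.1%}")
```

On these numbers U-KAN is roughly four times faster per epoch, which corresponds to saving about three quarters of U-Net's per-epoch training time.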


This review was generated by AI and reviewed by human editors.