QUICK REVIEW

[Paper Review] BiSe-Unet: A Lightweight Dual-path U-Net with Attention-refined Context for Real-time Medical Image Segmentation

M. Aftab Hossain, Laura J. Brattain|arXiv (Cornell University)|Feb 22, 2026

Advanced Neural Network Applications | Cited by 0

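One-sentence summary

BiSe-UNet is a lightweight dual-path U-Net that pairs an attention-refined context path with a depthwise separable decoder to enable real-time medical image segmentation on edge devices, demonstrated on Kvasir-SEG.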

ABSTRACT

During image-guided procedures, real-time image segmentation is often required. This demands lightweight AI models that can operate on resource-constrained devices. One important use case is endoscopy-guided colonoscopy, where polyps must be detected in real time. The Kvasir-Seg dataset, a publicly available benchmark for this task, contains 1,000 high-resolution endoscopic images of polyps with corresponding pixel-level segmentation masks. Achieving real-time inference speed for clinical deployment in constrained environments requires highly efficient and lightweight network architectures. However, many existing models remain too computationally intensive for embedded deployment. Lightweight architectures, although faster, often suffer from reduced spatial precision and weaker contextual understanding, leading to degraded boundary quality and reduced diagnostic reliability. To address these challenges, we introduce BiSe-UNet, a lightweight dual-path U-Net that integrates an attention-refined context path with a shallow spatial path for detailed feature preservation, followed by a depthwise separable decoder for efficient reconstruction. Evaluated on the Kvasir-Seg dataset, BiSe-UNet achieves competitive Dice and IoU scores while sustaining real-time throughput exceeding 30 FPS on Raspberry Pi 5, demonstrating its effectiveness for accurate, lightweight, and deployable medical image segmentation on edge hardware.

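Motivation and Objectives

Advance real-time medical image segmentation on resource-constrained hardware.
Develop a lightweight dual-path architecture that preserves both boundary detail and context.
Fuse spatial and context features efficiently, improving boundary accuracy without heavy computation.
Demonstrate real-time performance (30+ FPS) on an edge device while keeping accuracy competitive.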

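Proposed Method

Propose a dual-path architecture that combines a Context Path (CP) with attention refinement and a shallow Spatial Path (SP).
Use a decoder built from depthwise separable convolutions (DSConv) to reduce MACs and parameter count.
Fuse CP and SP features for reconstruction through 1x1 projections and DSConv blocks.
Insert attention refinement modules at multiple CP scales to refine global context.
Evaluate on Kvasir-SEG with Dice and IoU, and measure FPS on a CUDA GPU and on a Raspberry Pi 5.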
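
To make the DSConv saving concrete, here is a minimal sketch that counts parameters and MACs for a standard convolution and for a depthwise separable one. The layer shape (64 channels, 3x3 kernel, 56x56 output) is a hypothetical example, not a layer taken from the paper, and bias terms are ignored.

```python
def conv_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a standard k x k convolution (bias ignored)."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out  # one MAC per weight per output position
    return params, macs


def dsconv_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    params = depthwise + pointwise
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer shape, chosen only for illustration.
std_params, std_macs = conv_cost(64, 64, 3, 56, 56)
ds_params, ds_macs = dsconv_cost(64, 64, 3, 56, 56)
ratio = ds_params / std_params  # equals 1/c_out + 1/k**2, about 0.127 here

print(f"standard: {std_params} params, {std_macs} MACs")
print(f"DSConv:   {ds_params} params, {ds_macs} MACs")
print(f"DSConv / standard = {ratio:.3f}")
```

For a 3x3 kernel the ratio approaches 1/9 as the channel count grows, which is why moving a decoder to DSConv blocks can cut its cost substantially.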

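Experimental Results

Research Questions

RQ1: Can BiSe-UNet achieve real-time (30+ FPS) segmentation on embedded hardware without sacrificing accuracy?
RQ2: Compared with single-path lightweight models, does dual-path context-spatial fusion improve boundary quality and overall Dice/IoU performance?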
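
Both research questions are judged by Dice and IoU. The sketch below computes the two metrics for a single pair of binary masks. The small smoothing term `eps` and the flat-list mask representation are assumptions made for illustration; the summary does not describe the paper's exact evaluation protocol (thresholding, per-image versus dataset-level averaging).

```python
def dice_and_iou(pred, target, eps=1e-7):
    """Dice and IoU for two binary masks given as flat sequences of 0/1."""
    inter = sum(p * t for p, t in zip(pred, target))  # pixels set in both masks
    pred_sum = sum(pred)
    target_sum = sum(target)
    union = pred_sum + target_sum - inter
    dice = (2 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


# Toy masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two metrics are tied by Dice = 2 IoU / (1 + IoU). That identity does not generally hold for scores averaged over many images.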

Key Findings

Model	Params (M)	MACs (G)	Dice	IoU	CUDA (GTX 1080 Ti) FPS	CUDA memory (MB)	Raspberry Pi 5 FPS	Raspberry Pi 5 memory (MB)
U-Net (baseline)	7.813	11.67	0.7900	0.7000	217.91	420	2.65	300
BiSeNet	2.533	1.07	0.7501	0.6595	397.37	210	30.06	160
HarDNet	3.809	4.46	0.7775	0.6959	232.62	360	7.17	200
BiSe-UNet (this work)	2.509	0.97	0.7809	0.6961	358.34	240	30.48	170

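BiSe-UNet reaches a Dice of 0.7809 and an IoU of 0.6961 with 2.509M parameters and 0.97G MACs.
It runs at 358.34 FPS with 240 MB of memory on CUDA (GTX 1080 Ti), and at 30.48 FPS with 170 MB of memory on a Raspberry Pi 5.
Compared with U-Net, BiSe-UNet cuts MACs by more than 90% (0.97G vs 11.67G) while keeping Dice (0.7809 vs 0.7900) and IoU (0.6961 vs 0.7000) competitive.
Compared with BiSeNet at a similar parameter count (2.509M vs 2.533M), BiSe-UNet improves Dice by about 3.1 percentage points (0.7501 to 0.7809, roughly 4.1% relative) and IoU by about 3.7 percentage points (0.6595 to 0.6961, roughly 5.5% relative).
Ablation experiments indicate that dual-path fusion, with fusion at the /8 level, gives the best trade-off between accuracy and speed.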
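
The comparisons in this section can be rechecked directly from the table. The sketch below copies the reported values into a dictionary (the key names are illustrative) and derives the MAC reduction against U-Net, the Dice and IoU gaps against BiSeNet, and the per-frame latency on the Raspberry Pi 5 implied by each FPS figure. Treating 30 FPS as the real-time threshold follows the summary; the latency is simply 1000 / FPS and is not a separately reported measurement.

```python
# Values copied from the results table; dictionary keys are illustrative names.
results = {
    "U-Net":     {"macs_g": 11.67, "dice": 0.7900, "iou": 0.7000, "pi_fps": 2.65},
    "BiSeNet":   {"macs_g": 1.07,  "dice": 0.7501, "iou": 0.6595, "pi_fps": 30.06},
    "HarDNet":   {"macs_g": 4.46,  "dice": 0.7775, "iou": 0.6959, "pi_fps": 7.17},
    "BiSe-UNet": {"macs_g": 0.97,  "dice": 0.7809, "iou": 0.6961, "pi_fps": 30.48},
}

ours = results["BiSe-UNet"]
unet = results["U-Net"]
bisenet = results["BiSeNet"]

# Fraction of U-Net's MACs that BiSe-UNet removes.
mac_reduction = 1 - ours["macs_g"] / unet["macs_g"]

# Absolute gaps against BiSeNet, in percentage points.
dice_gain_pts = (ours["dice"] - bisenet["dice"]) * 100
iou_gain_pts = (ours["iou"] - bisenet["iou"]) * 100

# Relative gaps against BiSeNet, in percent of BiSeNet's score.
dice_gain_rel = dice_gain_pts / bisenet["dice"]
iou_gain_rel = iou_gain_pts / bisenet["iou"]

# Per-frame latency on the Raspberry Pi 5 implied by the reported FPS.
pi_latency_ms = {name: 1000 / r["pi_fps"] for name, r in results.items()}
realtime_budget_ms = 1000 / 30  # 30 FPS threshold, about 33.3 ms per frame

print(f"MAC reduction vs U-Net: {mac_reduction:.1%}")
print(f"Dice gain vs BiSeNet: {dice_gain_pts:.2f} points ({dice_gain_rel:.1f}% relative)")
print(f"IoU gain vs BiSeNet:  {iou_gain_pts:.2f} points ({iou_gain_rel:.1f}% relative)")
for name, ms in pi_latency_ms.items():
    status = "within" if ms <= realtime_budget_ms else "over"
    print(f"{name}: {ms:.1f} ms per frame ({status} the 30 FPS budget)")
```

On these figures only BiSeNet and BiSe-UNet fit inside the 30 FPS budget on the Raspberry Pi 5, and both do so with little headroom.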

This review was generated by AI and reviewed by a human editor.