QUICK REVIEW

[Paper Review] BiSe-Unet: A Lightweight Dual-path U-Net with Attention-refined Context for Real-time Medical Image Segmentation

M. Aftab Hossain, Laura J. Brattain|arXiv (Cornell University)|Feb 22, 2026

Advanced Neural Network Applications | Cited by: 0

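One-line summary

BiSe-UNet is a lightweight dual-path U-Net with an attention-refined context path and a depthwise separable decoder, designed for real-time medical image segmentation on edge devices and demonstrated on Kvasir-SEG.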

ABSTRACT

During image-guided procedures, real-time image segmentation is often required. This demands lightweight AI models that can operate on resource-constrained devices. One important use case is endoscopy-guided colonoscopy, where polyps must be detected in real time. The Kvasir-Seg dataset, a publicly available benchmark for this task, contains 1,000 high-resolution endoscopic images of polyps with corresponding pixel-level segmentation masks. Achieving real-time inference speed for clinical deployment in constrained environments requires highly efficient and lightweight network architectures. However, many existing models remain too computationally intensive for embedded deployment. Lightweight architectures, although faster, often suffer from reduced spatial precision and weaker contextual understanding, leading to degraded boundary quality and reduced diagnostic reliability. To address these challenges, we introduce BiSe-UNet, a lightweight dual-path U-Net that integrates an attention-refined context path with a shallow spatial path for detailed feature preservation, followed by a depthwise separable decoder for efficient reconstruction. Evaluated on the Kvasir-Seg dataset, BiSe-UNet achieves competitive Dice and IoU scores while sustaining real-time throughput exceeding 30 FPS on Raspberry Pi 5, demonstrating its effectiveness for accurate, lightweight, and deployable medical image segmentation on edge hardware.

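Motivation and objectives

Address the need for real-time medical image segmentation on resource-constrained hardware.
Develop a lightweight dual-path architecture that preserves both boundary detail and context.
Fuse spatial and contextual features efficiently, without heavy computation, to improve boundary accuracy.
Achieve real-time performance (30 FPS or more) on edge devices while maintaining competitive accuracy.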

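Proposed method

Proposes a dual-path architecture that pairs an attention-refined Context Path (CP) with a shallow Spatial Path (SP).
Uses a decoder built on depthwise separable convolutions (DSConv) to reduce MACs and parameters.
Fuses CP and SP features through a 1x1 projection and DSConv blocks for reconstruction.
Incorporates attention refinement modules at multiple CP scales to refine global context.
Evaluates on Kvasir-SEG using Dice and IoU, and measures FPS on a CUDA GPU and a Raspberry Pi 5.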
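
The saving a DSConv decoder offers over ordinary convolutions can be seen with a short count of weights and multiply-accumulates. The sketch below is only an illustration: the layer shape (128 to 64 channels, 3x3 kernel, 64x64 output map) and the function names are hypothetical and are not taken from the paper, and biases and normalization layers are ignored.

```python
def conv_cost(c_in, c_out, k, h_out, w_out):
    """Weights and MACs of a standard k x k convolution (bias ignored)."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out  # every weight is applied once per output position
    return params, macs


def dsconv_cost(c_in, c_out, k, h_out, w_out):
    """Weights and MACs of a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    params = depthwise + pointwise
    macs = params * h_out * w_out
    return params, macs


# Hypothetical decoder layer, chosen only for illustration.
std_params, std_macs = conv_cost(128, 64, 3, 64, 64)
ds_params, ds_macs = dsconv_cost(128, 64, 3, 64, 64)

print(std_params, ds_params)              # 73728 9344
print(round(std_params / ds_params, 2))   # 7.89
```

For this shape the separable layer needs roughly one eighth of the weights and MACs. In general the ratio is 1 / (1/c_out + 1/k^2), so the saving grows with the kernel size and the number of output channels.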

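Experimental results

Research questions

RQ1: Can BiSe-UNet achieve real-time (30 FPS or more) segmentation on embedded hardware without sacrificing accuracy?
RQ2: Does dual-path context-spatial fusion improve boundary quality and overall Dice/IoU compared with single-path lightweight models?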
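
Both research questions are judged with Dice and IoU, so a minimal sketch of how these overlap metrics are computed for one pair of binary masks may help. The function name, the nested-list mask format, and the small smoothing term `eps` are assumptions made for illustration; the review does not state how the authors threshold predictions or average scores over the dataset.

```python
def dice_and_iou(pred, target, eps=1e-7):
    """Dice and IoU for two binary masks given as equally sized nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_true = [t for row in target for t in row]

    intersection = sum(p * t for p, t in zip(flat_pred, flat_true))
    pred_area = sum(flat_pred)
    true_area = sum(flat_true)
    union = pred_area + true_area - intersection

    # eps avoids division by zero when both masks are empty (assumed convention).
    dice = (2 * intersection + eps) / (pred_area + true_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou


# Toy 3x3 masks: 3 predicted pixels, 3 true pixels, 2 of them overlapping.
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
target = [[1, 1, 0],
          [0, 0, 0],
          [0, 1, 0]]

dice, iou = dice_and_iou(pred, target)
print(round(dice, 4), round(iou, 4))  # 0.6667 0.5
```

Dice weights the overlap twice, so for a single image it is never lower than IoU. Scores averaged over a dataset, as in a results table, need not satisfy the per-image relation between the two exactly.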

Key findings

Model	Params (M)	MACs (G)	Dice	IoU	CUDA (GTX 1080 Ti) FPS	CUDA Memory (MB)	Raspberry Pi 5 FPS	Raspberry Pi Memory (MB)
U-Net (baseline)	7.813	11.67	0.7900	0.7000	217.91	420	2.65	300
BiSeNet	2.533	1.07	0.7501	0.6595	397.37	210	30.06	160
HarDNet	3.809	4.46	0.7775	0.6959	232.62	360	7.17	200
BiSe-UNet (Ours)	2.509	0.97	0.7809	0.6961	358.34	240	30.48	170

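BiSe-UNet achieves a Dice of 0.7809 and an IoU of 0.6961 with 2.509M parameters and 0.97G MACs.
On CUDA (GTX 1080 Ti) it runs at 358.34 FPS using 240 MB of memory; on Raspberry Pi 5 it runs at 30.48 FPS using 170 MB.
Compared with U-Net, BiSe-UNet cuts MACs by more than 90% (0.97G vs 11.67G) while remaining competitive in Dice (0.7809 vs 0.7900) and IoU (0.6961 vs 0.7000).
Compared with BiSeNet, BiSe-UNet improves Dice by about 4.1% and IoU by about 5.5% in relative terms (0.7809 vs 0.7501 and 0.6961 vs 0.6595, roughly 3.1 and 3.7 points in absolute terms) at a similar parameter count (2.509M vs 2.533M).
Ablations show that dual-path fusion and fusion at the /8 scale give the best trade-off between accuracy and speed.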
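
The comparative figures in these findings can be rechecked directly from the results table. The snippet below is a small arithmetic sketch: the numbers are copied from the table, while the dictionary layout and variable names are illustrative only.

```python
# Values copied from the results table (MACs in G, Raspberry Pi 5 FPS).
results = {
    "U-Net":     {"macs_g": 11.67, "dice": 0.7900, "iou": 0.7000, "pi_fps": 2.65},
    "BiSeNet":   {"macs_g": 1.07,  "dice": 0.7501, "iou": 0.6595, "pi_fps": 30.06},
    "BiSe-UNet": {"macs_g": 0.97,  "dice": 0.7809, "iou": 0.6961, "pi_fps": 30.48},
}


def relative_change(new, old):
    """Change from old to new, as a percentage of old."""
    return (new - old) / old * 100


ours, unet, bisenet = results["BiSe-UNet"], results["U-Net"], results["BiSeNet"]

mac_reduction = -relative_change(ours["macs_g"], unet["macs_g"])
dice_gain_rel = relative_change(ours["dice"], bisenet["dice"])
iou_gain_rel = relative_change(ours["iou"], bisenet["iou"])
dice_gain_abs = (ours["dice"] - bisenet["dice"]) * 100  # percentage points
iou_gain_abs = (ours["iou"] - bisenet["iou"]) * 100     # percentage points
latency_ms = 1000 / ours["pi_fps"]                      # time per frame on Raspberry Pi 5

print(f"MAC reduction vs U-Net: {mac_reduction:.1f}%")
print(f"Dice gain vs BiSeNet: {dice_gain_rel:.1f}% relative, {dice_gain_abs:.2f} points")
print(f"IoU gain vs BiSeNet: {iou_gain_rel:.1f}% relative, {iou_gain_abs:.2f} points")
print(f"Raspberry Pi 5 latency: {latency_ms:.1f} ms per frame")
```

The 4.1 and 5.5 figures for the BiSeNet comparison come out as relative percentages; the absolute gaps are closer to 3.1 and 3.7 points. The per-frame latency on Raspberry Pi 5 lands just under the 33.3 ms budget that 30 FPS implies.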


This review was generated by AI and checked by a human editor.