QUICK REVIEW

[Paper Review] Multi-View Deformable Convolution Meets Visual Mamba for Coronary Artery Segmentation

Xiaochan Yuan, Pai Zeng|arXiv (Cornell University)|Mar 23, 2026

Retinal Imaging and Analysis | Citations: 0

One-line summary

The paper proposes MDSVM-UNet, a two-stage coronary artery segmentation model that combines Multidirectional Snake Convolution (MDSConv) with Residual Visual Mamba (RVM), and reports state-of-the-art results on ImageCAS.

ABSTRACT

Accurate segmentation of coronary arteries from computed tomography angiography (CTA) images is of paramount clinical importance for the diagnosis and treatment planning of cardiovascular diseases. However, coronary artery segmentation remains challenging due to the inherent multi-branching and slender tubular morphology of the vasculature, compounded by severe class imbalance between foreground vessels and background tissue. Conventional convolutional neural network (CNN)-based approaches struggle to capture long-range dependencies among spatially distant vascular structures, while Vision Transformer (ViT)-based methods incur prohibitive computational overhead that hinders deployment in resource-constrained clinical settings. Motivated by the recent success of state space models (SSMs) in efficiently modeling long-range sequential dependencies with linear complexity, we propose MDSVM-UNet, a novel two-stage coronary artery segmentation framework that synergistically integrates multidirectional snake convolution (MDSConv) with residual visual Mamba (RVM). In the encoding stage, we introduce MDSConv, a deformable convolution module that learns adaptive offsets along three orthogonal anatomical planes -- sagittal, coronal, and axial -- thereby enabling comprehensive multi-view feature fusion that faithfully captures the elongated and tortuous geometry of coronary vessels. In the decoding stage, we design an RVM-based upsampling decoder block that leverages selective state space mechanisms to model inter-slice long-range dependencies while preserving linear computational complexity. Furthermore, we propose a progressive two-stage segmentation strategy: the first stage performs coarse whole-image segmentation to guide intelligent block extraction, while the second stage conducts fine-grained block-level segmentation to recover vascular details and suppress false positives.

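Motivation and objectives

Motivate automated, accurate coronary artery segmentation from CTA as an aid to diagnosis and treatment planning.
Address the slender, tubular vessel morphology and the severe foreground-background class imbalance.
Propose a two-stage, coarse-to-fine framework that balances global context with local vascular detail.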

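Proposed method

Introduce MDSConv, which applies deformable convolution along the three orthogonal anatomical planes (sagittal, coronal, and axial) and fuses the resulting multi-view features to capture tubular structures.
Use an RVM decoder to model inter-slice long-range dependencies with linear computational complexity.
Adopt a progressive two-stage segmentation: coarse whole-image segmentation guides block extraction, followed by fine-grained block-level segmentation.
Adopt UNet++-style dense skip connections to enrich multi-level feature propagation.
Train with Dice loss to handle the severe class imbalance between vessels and background.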
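
The coarse-to-fine flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `extract_blocks` and `two_stage_segment`, the fixed 32-voxel block size, the 0.5 threshold, and the toy threshold "models" are all assumptions made for the example. The paper's block extraction rule is not specified in this review.

```python
import numpy as np

def extract_blocks(coarse_mask, block=(32, 32, 32)):
    """Return corner indices of non-overlapping blocks that contain
    at least one coarse foreground voxel (hypothetical selection rule).
    Assumes each volume dimension is a multiple of the block size."""
    corners = []
    depth, height, width = coarse_mask.shape
    for z in range(0, depth - block[0] + 1, block[0]):
        for y in range(0, height - block[1] + 1, block[1]):
            for x in range(0, width - block[2] + 1, block[2]):
                patch = coarse_mask[z:z + block[0], y:y + block[1], x:x + block[2]]
                if patch.any():
                    corners.append((z, y, x))
    return corners

def two_stage_segment(volume, coarse_model, fine_model, block=(32, 32, 32)):
    """Stage 1: coarse whole-volume mask. Stage 2: refine only selected blocks."""
    coarse = coarse_model(volume) > 0.5
    fine = np.zeros(volume.shape, dtype=bool)
    for z, y, x in extract_blocks(coarse, block):
        sl = (slice(z, z + block[0]), slice(y, y + block[1]), slice(x, x + block[2]))
        fine[sl] |= fine_model(volume[sl]) > 0.5
    return fine

# Toy data: a bright "vessel" segment plus one faint spurious voxel.
volume = np.zeros((64, 64, 64), dtype=np.float32)
volume[10:20, 40, 40] = 1.0   # vessel
volume[50, 5, 5] = 0.5        # would-be false positive

# Stand-in models (plain thresholds), purely for illustration.
coarse_model = lambda v: (v > 0.3).astype(np.float32)  # permissive
fine_model = lambda v: (v > 0.8).astype(np.float32)    # stricter

corners = extract_blocks(coarse_model(volume) > 0.5)
fine_mask = two_stage_segment(volume, coarse_model, fine_model)
```

In this toy setup the permissive first stage flags both the vessel and the spurious voxel, so both enclosing blocks are revisited, and the stricter second stage keeps only the vessel. In practice the two stages would be trained networks rather than thresholds.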

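Experimental results

Research questions

RQ1 Does multidirectional deformable convolution capture long, tortuous coronary vessels better than standard CNNs?
RQ2 Does incorporating a Residual Visual Mamba decoder improve long-range dependency modeling and segmentation quality at linear computational complexity?
RQ3 Does the two-stage coarse-to-fine strategy improve vessel continuity and reduce false positives compared with single-stage methods?
RQ4 How does MDSVM-UNet compare with state-of-the-art methods on the ImageCAS benchmark in terms of DSC, HD, and AHD?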
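
For readers unfamiliar with the three metrics named in RQ4, the sketch below shows one way to compute them on binary masks. It rests on several assumptions: distances are in voxel units (no physical spacing), all foreground voxels are used rather than surface voxels only, and AHD is taken as the mean of the two directed average distances, which is one common definition and may differ from the one used in the paper. The brute-force pairwise distance matrix is only practical for small masks.

```python
import numpy as np

def dsc(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom

def _pairwise_distances(pred, gt):
    """Euclidean distances between every pred voxel and every gt voxel."""
    p = np.argwhere(pred)
    g = np.argwhere(gt)
    return np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)

def hausdorff(pred, gt):
    """Symmetric Hausdorff distance (HD): worst-case nearest-neighbor gap."""
    d = _pairwise_distances(pred, gt)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def average_hausdorff(pred, gt):
    """Average Hausdorff distance (AHD), assumed here to be the mean of
    the two directed average nearest-neighbor distances."""
    d = _pairwise_distances(pred, gt)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

# Toy example: a 4-voxel vessel segment, predicted one voxel too far along z.
gt = np.zeros((8, 8, 8), dtype=bool)
gt[2:6, 4, 4] = True
pred = np.zeros((8, 8, 8), dtype=bool)
pred[3:7, 4, 4] = True

dsc_value = dsc(pred, gt)                 # 0.75 (3 of 4 voxels overlap)
hd_value = hausdorff(pred, gt)            # 1.0
ahd_value = average_hausdorff(pred, gt)   # 0.25
```

Higher DSC is better, while lower HD and AHD are better. HD is driven by the single worst outlier, whereas AHD averages over all voxels and is therefore less sensitive to an isolated false positive.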

Key findings

Method	Params (M)	Loss	DSC ↑	HD ↓	AHD ↓
MDSVM-UNet (Ours)	26.7	Dice	0.6860	27.8430	0.9023
MDSVM-UNet (Ours)	26.7	Focal	0.6814	27.5249	0.9260
ImageCAS, Zeng et al. (2023)	27.6	Dice	0.6600	29.1486	0.9129
Two-Stage MDSVM-UNet (Ours)	26.7	Dice	0.8365	27.8430	0.9023
Two-Stage MDSVM-UNet (Ours)	26.7	Focal	0.8210	27.5249	0.9199

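On ImageCAS, MDSVM-UNet achieves 0.6860 DSC in the single-stage setting and 0.8365 DSC in the two-stage configuration (with Dice loss in Stage 2).
The two-stage MDSVM-UNet improves over the ImageCAS baseline by 5.41% in DSC, 8.5456 in HD, and 0.8093 in AHD.
Stage 1 (single-stage) with Dice loss: DSC 0.6860, HD 27.8430, AHD 0.9023 (Table 1).
Two-stage MDSVM-UNet with Dice loss: DSC 0.8365, HD 27.8430, AHD 0.9023 (Table 2).
The model has 26.7M parameters, showing competitive efficiency compared with Transformer-based approaches.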
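
The results above compare models trained with Dice loss and with focal loss. As a rough illustration of what the two objectives compute, here is a minimal NumPy sketch for the binary case. The smoothing term `eps` and the focal parameters `gamma=2.0` and `alpha=0.25` are commonly used defaults chosen for this example, not values reported for the paper.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-6):
    """1 - soft Dice overlap between predicted probabilities and a binary target."""
    intersection = (prob * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (prob.sum() + target.sum() + eps)

def focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: cross-entropy down-weighted on easy voxels."""
    prob = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(target == 1, prob, 1.0 - prob)        # prob. of the true class
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)  # class weighting
    return float((-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)).mean())

# Toy imbalanced target: 10 vessel voxels out of 1000 (1% foreground).
target = np.zeros(1000)
target[:10] = 1.0

# A prediction that is almost entirely "background" everywhere.
all_background = np.full(1000, 0.01)

dice_bg = soft_dice_loss(all_background, target)
dice_perfect = soft_dice_loss(target, target)
focal_bg = focal_loss(all_background, target)
focal_perfect = focal_loss(target, target)
```

With only 1% foreground, the near-all-background prediction classifies 99% of voxels correctly yet still receives a Dice loss close to 1, because Dice depends on overlap with the foreground rather than on per-voxel accuracy. Focal loss takes a different route to the same problem by shrinking the contribution of confidently correct voxels.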

This review was generated by AI and checked by a human editor.