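[Paper Review] Physics-Driven Autoregressive State Space Models for Medical Image Reconstruction
This paper introduces MambaRoll, a physics-driven autoregressive state space model that reconstructs medical images from undersampled data, using autoregressive next-scale prediction to progressively integrate multi-scale contextual features. On MRI and sparse-view CT, it outperforms convolutional, transformer, and conventional SSM-based PD methods.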
Medical image reconstruction from undersampled acquisitions is an ill-posed inverse problem requiring accurate recovery of anatomical structures from incomplete measurements. Physics-driven (PD) network models have gained prominence for this task by integrating data-consistency mechanisms with learned priors, enabling improved performance over purely data-driven approaches. However, reconstruction quality still hinges on the network's ability to disentangle artifacts from true anatomical signals, both of which exhibit complex, multi-scale contextual structure. Convolutional neural networks (CNNs) capture local correlations but often struggle with non-local dependencies. While transformers aim to alleviate this limitation, practical implementations involve design compromises to reduce computational cost by balancing local and non-local sensitivity, occasionally resulting in performance comparable to CNNs. To address these challenges, we propose MambaRoll, a novel physics-driven autoregressive state space model (SSM) for high-fidelity and efficient image reconstruction. MambaRoll employs an unrolled architecture where each cascade autoregressively predicts finer-scale feature maps conditioned on coarser-scale representations, enabling consistent multi-scale context propagation. Each stage is built on a hierarchy of scale-specific PD-SSM modules that capture spatial dependencies while enforcing data consistency through residual correction. To further improve scale-aware learning, we introduce a Deep Multi-Scale Decoding (DMSD) loss, which provides supervision at intermediate spatial scales in alignment with the autoregressive design. Demonstrations on accelerated MRI and sparse-view CT reconstructions show that MambaRoll consistently outperforms state-of-the-art CNN-, transformer-, and SSM-based methods.
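For readers unfamiliar with the setting, the inverse problem mentioned in the abstract is commonly written as below. This is the standard textbook formulation, not notation taken from the paper; all symbols are assumptions introduced here for illustration.

```latex
% Assumed notation (not from the review): x is the unknown image, y the
% undersampled measurements, A the imaging operator, n measurement noise,
% R a regularizer (prior) and \lambda its weight.
\begin{align}
  y &= A x + n, \\
  \hat{x} &= \arg\min_{x} \; \tfrac{1}{2}\,\lVert A x - y \rVert_2^2 + \lambda\, R(x).
\end{align}
```

In accelerated MRI, A is typically an undersampled Fourier transform; in sparse-view CT, it is a projection operator evaluated at a reduced set of view angles. Physics-driven networks generally replace the hand-crafted prior R with a learned network while keeping the data term tied to A.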
Research Motivation and Objectives
- Improve reconstruction from undersampled measurements by leveraging physics-driven priors.
- Develop a multi-scale autoregressive framework that fuses context across spatial scales while enforcing data fidelity.
- Introduce novel physics-driven state-space modules (PSSMs) that operate across scales within an unrolled architecture.
- Demonstrate superior reconstruction quality on accelerated MRI and sparse-view CT datasets compared to existing PD methods.
Proposed Method
- Propose MambaRoll, an unrolled PD architecture with K cascades that progressively reconstructs high-resolution feature maps across S spatial scales.
- Within each cascade, employ PSSM modules that include an encoder, a shuffled SSM, a decoder, and a residual data-consistency block to enforce fidelity to the imaging operator.
- Use autoregressive prediction across scales by feeding concatenated features from earlier scales to process subsequent scales.
- Train with a multi-term objective that includes the cascade output error and scale-specific decoded feature errors to encourage faithful scale-wise reconstructions.
- Evaluate on accelerated MRI (IXI, fastMRI) and sparse-view CT (LoDoPaB-CT) against PD-CNN, PD-TransUNet, PD-UNetMHA, PD-UNetMamba, and PD-UMamba.
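The residual data-consistency step and the multi-term objective listed above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single-coil Cartesian MRI operator (masked orthonormal FFT), a plain gradient-style correction, L1 errors, and average pooling to build the per-scale targets. All function names (`forward_op`, `adjoint_op`, `residual_data_consistency`, `avg_pool`, `multi_scale_loss`) are hypothetical.

```python
import numpy as np

def forward_op(x, mask):
    """Assumed imaging operator A: orthonormal 2D FFT followed by a binary sampling mask."""
    return mask * np.fft.fft2(x, norm="ortho")

def adjoint_op(k, mask):
    """Adjoint A^H of the assumed operator: mask, then inverse orthonormal 2D FFT."""
    return np.fft.ifft2(mask * k, norm="ortho")

def residual_data_consistency(x, y, mask, step=1.0):
    """Generic residual correction x <- x - step * A^H (A x - y).

    Illustrative stand-in for a data-consistency block; the paper's exact
    block may differ.
    """
    residual = forward_op(x, mask) - y
    return x - step * adjoint_op(residual, mask)

def avg_pool(img, factor):
    """Downsample a 2D array by averaging non-overlapping factor x factor tiles.

    Assumes both dimensions are divisible by `factor`.
    """
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multi_scale_loss(final_output, scale_outputs, reference, weight=1.0):
    """L1 error of the final output plus weighted L1 errors of each decoded scale.

    Each entry of `scale_outputs` is compared with the reference downsampled
    to the same size (assumed choice of target; same factor on both axes).
    """
    loss = np.mean(np.abs(final_output - reference))
    for decoded in scale_outputs:
        factor = reference.shape[0] // decoded.shape[0]
        target = avg_pool(reference, factor) if factor > 1 else reference
        loss += weight * np.mean(np.abs(decoded - target))
    return loss
```

With a binary mask and `step=1.0`, one correction step makes the estimate agree exactly with the measured k-space samples while leaving unsampled frequencies untouched, which is why this form is a common choice for Cartesian sampling.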
Experimental Results
Research Questions
- RQ1: Can a physics-driven autoregressive SSM framework improve reconstruction fidelity for undersampled medical imaging tasks?
- RQ2: Do multi-scale PSSMs with autoregressive scale predictions better capture long-range contextual information while maintaining data consistency?
- RQ3: How does MambaRoll compare to convolutional, transformer, and conventional SSM PD approaches in MRI and CT reconstructions?
- RQ4: What is the contribution of PSSM, autoregression, and data-consistency blocks to overall performance?
Key Findings
- MambaRoll consistently outperforms competing methods across MRI and CT tasks at R=4 and R=8 (e.g., MRI IXI/fastMRI; CT LoDoPaB-CT).
- Across evaluated tasks, MambaRoll yields substantial PSNR/SSIM gains over PD-CNN, PD-TransUNet, PD-UNetMHA, PD-UNetMamba, and PD-UMamba.
- Ablation studies show that removing PSSM, autoregression, or data-consistency blocks degrades reconstruction performance, highlighting the importance of each component.
- In MRI, MambaRoll achieves higher PSNR/SSIM and better tissue delineation with reduced artifacts and noise compared to baselines.
- In sparse-view CT, MambaRoll achieves larger PSNR/SSIM gains relative to baselines, indicating robust performance across modalities.
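The findings above are reported in terms of PSNR and SSIM. As a reminder of what the first metric measures, here is a minimal sketch of PSNR using its standard definition; the function name `psnr` and the default choice of `data_range` are assumptions, and the paper's exact normalization is not stated in this review.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=None):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE).

    If `data_range` is not given, it is taken as max - min of the reference
    (an assumed convention; other normalizations are also in use).
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

For example, a reconstruction that is uniformly 0.1 below a reference with `data_range=1.0` has an MSE of 0.01 and therefore a PSNR of 20 dB.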
This review was generated by AI and checked by a human editor.