QUICK REVIEW

[Paper Review] T-Mamba: A unified framework with Long-Range Dependency in dual-domain for 2D & 3D Tooth Segmentation

Jing Hao, Zhu, Yonghui|arXiv (Cornell University)|Apr 1, 2024
Dental Radiography and Imaging | Cited by 8
One-Sentence Summary

T-Mamba introduces the Tooth Vision Mamba (Tim) block and integrates it with DenseVNet to model both global and local context for tooth CBCT segmentation, achieving state-of-the-art results on a public tooth CBCT dataset.

ABSTRACT

Tooth segmentation is a pivotal step in modern digital dentistry, essential for applications across orthodontic diagnosis and treatment planning. Despite its importance, this process is fraught with challenges due to the high noise and low contrast inherent in 2D and 3D tooth data. Both Convolutional Neural Networks (CNNs) and Transformers have shown promise in medical image segmentation, yet each method has limitations in handling long-range dependencies and computational complexity. To address this issue, this paper introduces T-Mamba, integrating frequency-based features and shared bi-positional encoding into vision mamba to address limitations in efficient global feature modeling. Besides, we design a gate selection unit to adaptively integrate two features in the spatial domain and one feature in the frequency domain. T-Mamba is the first work to introduce frequency-based features into vision mamba, and its flexibility allows it to process both 2D and 3D tooth data without the need for separate modules. This paper also presents TED3, a large-scale public 2D dental X-ray dataset. Extensive experiments demonstrate that T-Mamba achieves new SOTA results on a public tooth CBCT dataset and outperforms previous SOTA methods on the TED3 dataset. The code and models are publicly available at: https://github.com/isbrycee/T-Mamba.

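Motivation and Objectives

  • Advance accurate tooth segmentation in 3D CBCT in the presence of noise and artifacts.
  • Develop a framework that models long-range dependencies while preserving spatial position information.
  • Introduce frequency-domain features to make feature representations in medical imaging more robust.
  • Propose a gate-based fusion mechanism that adaptively combines spatial and frequency-domain features.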

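Proposed Method

  • Extend vision Mamba into the Tim block, which processes 2D/3D features as 1-D sequences.
  • Use a shared bi-positional encoding to preserve spatial information during reshaping.
  • Introduce frequency-domain features through band-pass filtering in the Fourier domain.
  • Introduce a gate selection unit that adaptively fuses the forward and backward spatial features with the frequency-domain feature.
  • Integrate a Tim block after each CNN layer of DenseVNet for multi-scale feature modeling.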
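
The band-pass filtering and gated fusion steps listed above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `bandpass_filter` and `gate_fuse`, the cutoff values `low` and `high`, the radial mask shape, and the softmax form of the gate with parameters `w` and `b` are all assumptions made for illustration. The forward and backward scan outputs are replaced by simple stand-ins, since the Mamba state-space scan itself is out of scope here.

```python
import numpy as np

def bandpass_filter(feat, low=0.05, high=0.35):
    """Keep only frequencies whose radial magnitude lies in [low, high].

    feat: real-valued array of any dimensionality (2D or 3D feature map).
    low/high: hypothetical cutoffs in cycles per sample (assumed values).
    """
    spectrum = np.fft.fftn(feat)
    # Per-axis frequency grids in cycles per sample, range [-0.5, 0.5).
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in feat.shape], indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    mask = (radius >= low) & (radius <= high)
    # The mask is symmetric in frequency, so the inverse transform is real
    # up to floating-point error; .real drops the residual imaginary part.
    return np.fft.ifftn(spectrum * mask).real

def gate_fuse(forward, backward, freq, w, b):
    """Fuse three (L, C) feature sequences with data-dependent softmax gates.

    w: (3*C, 3) and b: (3,) are hypothetical gate parameters; a learned
    layer would supply them in a real model.
    """
    stacked = np.stack([forward, backward, freq], axis=0)            # (3, L, C)
    logits = np.concatenate([forward, backward, freq], axis=-1) @ w + b  # (L, 3)
    logits -= logits.max(axis=-1, keepdims=True)                     # stable softmax
    gates = np.exp(logits)
    gates /= gates.sum(axis=-1, keepdims=True)                       # rows sum to 1
    return np.einsum("lk,klc->lc", gates, stacked)                   # (L, C)

rng = np.random.default_rng(0)
vol = rng.standard_normal((4, 8, 8))             # toy 3D feature map (D, H, W)
freq_feat = bandpass_filter(vol).reshape(-1, 1)  # flatten to a 1-D sequence
fwd = vol.reshape(-1, 1)                         # stand-in for the forward scan
bwd = vol.reshape(-1, 1)[::-1]                   # stand-in for the backward scan
fused = gate_fuse(fwd, bwd, freq_feat, np.zeros((3, 3)), np.zeros(3))
print(fused.shape)  # (256, 1)
```

With all-zero gate parameters the three gates are equal, so the output is the plain average of the three inputs; non-zero parameters let each sequence position weight the branches differently. The same `bandpass_filter` call works unchanged on a 2D array, which mirrors the summary's point that one module can serve both 2D and 3D data.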

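Experimental Results

Research Questions

  • RQ1: How can long-range dependencies be modeled efficiently for 2D/3D tooth CBCT segmentation without excessive computation?
  • RQ2: Does adding frequency-domain features improve robustness to noise and artifacts in CBCT images?
  • RQ3: Can a data-dependent gating mechanism robustly fuse spatial and frequency-domain features for accurate tooth segmentation?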

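Key Findings

  • T-Mamba achieves state-of-the-art results across multiple metrics on a public tooth CBCT dataset.
  • IoU improves by 3.63 percentage points, SO by 2.43 percentage points, and DSC by 2.30 percentage points over the previous SOTA.
  • Hausdorff distance (HD) decreases by 4.39 mm and ASSD by 0.37 mm.
  • Ablation experiments show that the shared bi-positional encoding and the Gate Selection Unit both contribute significantly to performance.
  • The Tim block with frequency-domain features outperforms the baseline DenseVNet and Vim variants on key metrics.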
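
For readers less familiar with the overlap metrics quoted above, the following sketch shows how DSC and IoU are conventionally computed for a binary mask. It uses the standard textbook definitions, not the paper's evaluation code; the helper name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions. The surface-based metrics (SO, HD, ASSD) need surface extraction and are not covered here.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two equal-length flat sequences of 0/1 labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks empty: treated as a perfect match (assumed convention).
        return 1.0, 1.0
    dsc = 2 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dsc, iou

# Toy flattened masks: one voxel overlaps out of two predicted and two true.
dsc, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dsc, iou)  # 0.5 and one third
```

The two metrics are tied by DSC = 2·IoU / (1 + IoU), so they rank methods identically on a single case, although averaged gains (such as the IoU and DSC improvements reported above) need not be equal.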

This review was generated by AI and reviewed by human editors.