QUICK REVIEW

[Paper Review] T-Mamba: A unified framework with Long-Range Dependency in dual-domain for 2D & 3D Tooth Segmentation

Jing Hao, Zhu, Yonghui|arXiv (Cornell University)|Apr 1, 2024

Dental Radiography and Imaging | Cited by 8

One-Line Summary

T-Mamba integrates Tooth Vision Mamba (Tim) blocks into DenseVNet to model both global and local context in tooth CBCT, achieving state-of-the-art results on a public tooth CBCT dataset.

ABSTRACT

Tooth segmentation is a pivotal step in modern digital dentistry, essential for applications across orthodontic diagnosis and treatment planning. Despite its importance, this process is fraught with challenges due to the high noise and low contrast inherent in 2D and 3D tooth data. Both Convolutional Neural Networks (CNNs) and Transformers have shown promise in medical image segmentation, yet each method has limitations in handling long-range dependencies and computational complexity. To address this issue, this paper introduces T-Mamba, integrating frequency-based features and shared bi-positional encoding into vision mamba to address limitations in efficient global feature modeling. Besides, we design a gate selection unit to adaptively integrate two features in the spatial domain and one feature in the frequency domain. T-Mamba is the first work to introduce frequency-based features into vision mamba, and its flexibility allows it to process both 2D and 3D tooth data without the need for separate modules. Also, TED3, a large-scale public 2D dental X-ray dataset, is presented in this paper. Extensive experiments demonstrate that T-Mamba achieves new SOTA results on a public tooth CBCT dataset and outperforms previous SOTA methods on the TED3 dataset. The code and models are publicly available at: https://github.com/isbrycee/T-Mamba.

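Motivation and Objectives

Achieve accurate tooth segmentation in 3D CBCT despite noise and artifacts.
Develop a framework that models long-range dependencies while preserving spatial position.
Incorporate frequency-domain features to strengthen robust feature representation in medical images.
Propose a gate-based fusion mechanism that adaptively combines spatial and frequency features.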

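Proposed Method

Extend Vision Mamba with the Tim block so that 2D/3D features are processed as 1-D sequences.
Use shared bi-positional encoding to preserve spatial information across reshaping.
Incorporate frequency-domain features through band-pass filtering in the Fourier domain.
Introduce a Gate Selection Unit that adaptively fuses forward and backward spatial features with frequency features.
Insert a Tim block after each CNN layer of DenseVNet for multi-stage feature modeling.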
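
The frequency and fusion steps listed above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `bandpass_filter` and `gated_fusion`, the radial cutoffs `low` and `high`, and the single softmax weight per branch are all assumptions made for illustration. In a trained model the gate logits would be predicted from the features themselves. Here they are passed in by hand.

```python
import numpy as np

def bandpass_filter(features, low, high):
    """Keep only frequencies whose radius (cycles per sample) lies in [low, high].

    Works for 2D or 3D arrays, since np.fft.fftn transforms over all axes.
    The cutoffs are hypothetical parameters, not values from the paper.
    """
    spectrum = np.fft.fftn(features)
    # One frequency grid per axis, each in the range [-0.5, 0.5).
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in features.shape], indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    mask = (radius >= low) & (radius <= high)
    # The mask is symmetric about the origin, so the inverse transform of a
    # real input is real up to floating-point error.
    return np.fft.ifftn(spectrum * mask).real

def gated_fusion(forward, backward, freq, gate_logits):
    """Blend three same-shaped feature maps with softmax weights (assumed gate form)."""
    gate_logits = np.asarray(gate_logits, dtype=float)
    weights = np.exp(gate_logits - gate_logits.max())
    weights = weights / weights.sum()
    return weights[0] * forward + weights[1] * backward + weights[2] * freq

# Usage sketch: the same code path handles a 2D image and a 3D volume.
rng = np.random.default_rng(0)
for shape in [(16, 16), (8, 8, 8)]:
    feat = rng.normal(size=shape)
    freq_feat = bandpass_filter(feat, low=0.05, high=0.3)
    # Stand-ins for the forward and backward scan outputs of a sequence model.
    forward_feat, backward_feat = feat, feat[::-1]
    fused = gated_fusion(forward_feat, backward_feat, freq_feat, [0.0, 0.0, 0.0])
    sequence = fused.reshape(-1)  # flatten to a 1-D sequence of length prod(shape)
```

Flattening with `reshape(-1)` discards the explicit spatial layout, which is why the method pairs it with a positional encoding.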

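Experimental Results

Research Questions

RQ1: How can long-range dependencies be modeled efficiently, without excessive computation, for 2D/3D tooth CBCT segmentation in the presence of noise and artifacts?
RQ2: Does adding frequency-domain features improve robustness to noise and artifacts in CBCT images?
RQ3: Can a data-dependent gating mechanism robustly fuse spatial and frequency features for accurate tooth segmentation?

Key Findings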

Method	FLOPs(G)	Params(M)	HD(mm)	ASSD(mm)	IoU(%)	SO(%)	DSC(%)
T-Mamba (Ours)	-	-	1.18	0.42	88.31	97.53	93.60

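T-Mamba achieves state-of-the-art results on several metrics on a public tooth CBCT dataset.
IoU improves by 3.63 points over the previous SOTA, SO by 2.43 points, and DSC by 2.30 points.
Hausdorff Distance (HD) decreases by 4.39 mm and ASSD by 0.37 mm.
Ablations show that the shared bi-positional encoding and the Gate Selection Unit contribute substantially to performance.
Combining the Tim block with frequency features outperforms the DenseVNet baseline and Vim-based variants on the main metrics.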
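
For readers unfamiliar with the overlap metrics in the table, the following sketch shows the standard definitions of DSC (Dice similarity coefficient) and IoU for a pair of binary masks. It is a generic illustration, not the paper's evaluation code. The function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dsc = 2.0 * intersection / total
    iou = intersection / union
    return float(dsc), float(iou)

# Usage sketch: one of two predicted voxels overlaps the single target voxel.
dsc, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 0, 0])
```

The values in the table are percentages, so a reported DSC of 93.60 corresponds to 0.9360 on this 0-to-1 scale.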


This review was generated by AI and checked by a human editor.
