QUICK REVIEW

[Paper Review] Class-Aware Adversarial Transformers for Medical Image Segmentation

Chenyu You, Ruihan Zhao|PubMed|Jan 26, 2022

Radiomics and Machine Learning in Medical Imaging|References: 24|Citations: 78

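One-sentence summary

CASTformer is a GAN-based transformer framework for 2D medical image segmentation. It pairs a pyramid-structured, multi-scale generator containing a class-aware transformer module with a transformer-based discriminator to improve segmentation accuracy. It achieves state-of-the-art results on the Synapse and LiTS datasets, with notable gains in Dice and Jaccard scores.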

ABSTRACT

Transformers have made remarkable progress towards modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: (1) existing methods fail to capture the important features of the images due to the naive tokenization scheme; (2) the models suffer from information loss because they only consider single-scale feature representations; and (3) the segmentation label maps generated by the models are not accurate enough without considering rich semantic contexts and anatomical textures. In this work, we present CASTformer, a novel type of adversarial transformers, for 2D medical image segmentation. First, we take advantage of the pyramid structure to construct multi-scale representations and handle multi-scale variations. We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structures. Lastly, we utilize an adversarial training strategy that boosts segmentation accuracy and correspondingly allows a transformer-based discriminator to capture high-level semantically correlated contents and low-level anatomical features. Our experiments demonstrate that CASTformer dramatically outperforms previous state-of-the-art transformer-based approaches on three benchmarks, obtaining 2.54%-5.88% absolute improvements in Dice over previous models. Further qualitative experiments provide a more detailed picture of the model's inner workings, shed light on the challenges in improved transparency, and demonstrate that transfer learning can greatly improve performance and reduce the size of medical image datasets in training, making CASTformer a strong starting point for downstream medical image analysis tasks.

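Motivation and objectives

Improve segmentation by addressing the limitations of existing transformer-based medical segmentation models in multi-scale representation and in semantic and contextual modeling.
Propose CASTformer, which combines a pyramid-structured generator, a class-aware transformer module, and GAN-based training to strengthen global and local feature learning.
Demonstrate performance gains on multiple medical imaging benchmarks, and analyze transfer learning and the contribution of each component.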

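Proposed method

Introduces a GAN framework with a transformer-based generator (CATformer) and a discriminator built on pre-trained Vision Transformer weights.
Incorporates a feature pyramid to learn multi-scale representations for segmentation.
Develops a class-aware transformer (CAT) module that iteratively samples discriminative anatomical regions.
Adopts a transformer encoder module (TEM) to capture long-range context.
Uses a lightweight all-MLP decoder for efficient multi-scale fusion and mask prediction.
Trains with the WGAN-GP objective plus segmentation losses (Dice and cross-entropy) to balance realism and accuracy.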
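The segmentation part of the training objective can be illustrated with a minimal NumPy sketch. It combines pixel-wise cross-entropy with a soft Dice loss for a single 2D slice. The function names, the equal 0.5/0.5 weighting, and the smoothing constant are assumptions made for illustration, not values taken from the paper, and the WGAN-GP adversarial term is deliberately left out because it requires automatic differentiation.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_entropy_loss(probs, onehot, eps=1e-7):
    """Mean pixel-wise cross-entropy; both inputs have shape (C, H, W)."""
    return float(-(onehot * np.log(probs + eps)).sum(axis=0).mean())


def soft_dice_loss(probs, onehot, eps=1e-7):
    """One minus the soft Dice coefficient, averaged over classes."""
    inter = (probs * onehot).sum(axis=(1, 2))
    denom = probs.sum(axis=(1, 2)) + onehot.sum(axis=(1, 2))
    dice = (2.0 * inter + eps) / (denom + eps)
    return float(1.0 - dice.mean())


def segmentation_loss(logits, labels, ce_weight=0.5, dice_weight=0.5):
    """Weighted sum of cross-entropy and soft Dice.

    logits: float array of shape (C, H, W); labels: int array of shape (H, W).
    The weights are illustrative assumptions, not the paper's settings.
    """
    num_classes = logits.shape[0]
    probs = softmax(logits, axis=0)
    # One-hot encode: (H, W) -> (H, W, C) -> (C, H, W)
    onehot = np.eye(num_classes)[labels].transpose(2, 0, 1)
    return (ce_weight * cross_entropy_loss(probs, onehot)
            + dice_weight * soft_dice_loss(probs, onehot))


# Toy 2x2 slice with three classes.
labels = np.array([[0, 1], [1, 2]])
confident_logits = 10.0 * np.eye(3)[labels].transpose(2, 0, 1)
uniform_logits = np.zeros((3, 2, 2))

print("confident prediction:", segmentation_loss(confident_logits, labels))
print("uniform prediction:  ", segmentation_loss(uniform_logits, labels))
```

A prediction that is confidently correct yields a loss close to zero, while a uniform prediction is penalized by both terms. In a full training loop this value would be added to the generator's adversarial loss.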

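Experimental results

Research questions

RQ1 Can a pyramid-structured, transformer-based generator improve multi-scale medical image segmentation over single-scale methods?
RQ2 Does a class-aware sampling strategy within the transformer improve the localization of anatomically meaningful regions?
RQ3 Do adversarial training and a transformer-based discriminator improve segmentation fidelity and semantic consistency in medical images?
RQ4 How do transfer learning and pre-trained CV backbones affect performance on limited medical datasets?
RQ5 How much does each CASTformer component (CAT module, TEM, GAN training) contribute to overall performance?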

Key findings

Framework	DSC (avg)	Jaccard (avg)	95HD (avg)	ASD (avg)	Aorta	Gallbladder	Left kidney	Right kidney	Liver	Pancreas	Spleen	Stomach
CASTformer (ours)	82.55	74.69	22.73	5.81	89.05	67.48	86.05	82.17	95.61	67.49	91.00	81.55
CATformer (ours)	82.17	73.22	16.20	4.28	88.98	67.16	85.72	81.69	95.34	66.53	90.74	81.20
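For readers unfamiliar with the DSC and Jaccard columns, the following minimal sketch shows how the two overlap scores are commonly computed for one class from a predicted and a reference label map. The function name and the toy arrays are hypothetical. How the paper aggregates scores across cases and organs is not stated in this review, so the sketch covers only the per-class definition.

```python
import numpy as np


def dice_and_jaccard(pred, target, label):
    """Dice and Jaccard overlap for one class in two integer label maps."""
    p = pred == label
    t = target == label
    inter = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    total = p.sum() + t.sum()
    if total == 0:
        # Convention assumed here: a class absent from both maps counts as a perfect match.
        return 1.0, 1.0
    return float(2.0 * inter / total), float(inter / union)


# Toy example: class 1 covers three pixels in each map, two of which overlap.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

dice, jaccard = dice_and_jaccard(pred, target, label=1)
print(f"Dice = {dice:.4f}, Jaccard = {jaccard:.4f}")
```

For a single mask pair the two scores are linked by J = D / (2 - D), so Jaccard is never larger than Dice. That identity does not carry over to averaged scores, which is why an average Jaccard cannot be derived from an average Dice.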

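CASTformer achieves a Dice of 82.55 and a Jaccard of 74.69 on Synapse, the state-of-the-art result (CASTformer row).
On LiTS, it achieves 73.82% Dice and 64.91% Jaccard, outperforming TransUNet by 5.88% in Dice and 4.66% in Jaccard.
CATformer (without the GAN) also outperforms prior methods, with a Dice of 82.17 and a Jaccard of 73.22 on Synapse.
Transfer learning from CV pre-trained backbones substantially improves performance, especially on small datasets.
Ablations show that both the class-aware transformer and the TEM contribute significantly; removing either one reduces the Dice improvement.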
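The LiTS gains over TransUNet are easy to misread, so a short arithmetic sketch may help. The abstract describes the Dice gains as absolute improvements. Treating the Jaccard gain the same way is an assumption. Under that reading, the TransUNet baselines below are implied by the stated numbers rather than quoted from the paper, and the helper names are hypothetical.

```python
def implied_baseline(score, absolute_gain):
    """Baseline score implied by a reported score and its absolute gain (in points)."""
    return round(score - absolute_gain, 2)


def relative_gain_percent(score, absolute_gain):
    """The same gain expressed relative to the implied baseline, in percent."""
    baseline = score - absolute_gain
    return round(100.0 * absolute_gain / baseline, 2)


# LiTS numbers as stated in the findings above.
lits_dice, dice_gain = 73.82, 5.88
lits_jaccard, jaccard_gain = 64.91, 4.66

print("implied TransUNet Dice:   ", implied_baseline(lits_dice, dice_gain))
print("implied TransUNet Jaccard:", implied_baseline(lits_jaccard, jaccard_gain))
print("relative Dice gain (%):   ", relative_gain_percent(lits_dice, dice_gain))
print("relative Jaccard gain (%):", relative_gain_percent(lits_jaccard, jaccard_gain))
```

The distinction matters when comparing papers: a gain quoted in absolute points is smaller in number than the same gain quoted relative to the baseline.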

This review was generated by AI and checked by a human editor.