QUICK REVIEW

[Paper Review] Deep learning and its application to medical image segmentation

Holger R. Roth, Chen Shen|arXiv (Cornell University)|Mar 23, 2018

Radiomics and Machine Learning in Medical Imaging | References: 32 | Citations: 61

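One-sentence summary

The paper develops a 3D fully convolutional network (3D U-Net-style) for automatic multi-organ segmentation in abdominal CT and reports near state-of-the-art Dice scores on a gastric cancer CT dataset.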

ABSTRACT

One of the most common tasks in medical imaging is semantic segmentation. Achieving this segmentation automatically has been an active area of research, but the task has been proven very challenging due to the large variation of anatomy across different patients. However, recent advances in deep learning have made it possible to significantly improve the performance of image recognition and semantic segmentation methods in the field of computer vision. Due to the data driven approaches of hierarchical feature learning in deep learning frameworks, these advances can be translated to medical images without much difficulty. Several variations of deep convolutional neural networks have been successfully applied to medical images. Especially fully convolutional architectures have been proven efficient for segmentation of 3D medical images. In this article, we describe how to build a 3D fully convolutional network (FCN) that can process 3D images in order to produce automatic semantic segmentations. The model is trained and evaluated on a clinical computed tomography (CT) dataset and shows state-of-the-art performance in multi-organ segmentation.

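Motivation and objectives

Address the challenge of automatic semantic segmentation in abdominal CT despite large anatomical variation across patients.
Propose a U-Net-inspired 3D fully convolutional network architecture for volumetric medical image segmentation.
Demonstrate end-to-end training with data augmentation and a Dice-based loss to optimize segmentation performance.
Evaluate the model on multiple organs in a gastric cancer CT dataset and compare it with state-of-the-art methods.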

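Proposed method

Adopts a symmetric encoder-decoder 3D FCN (similar to 3D U-Net): the encoder uses 3x3x3 convolution kernels and 2x2x2 max pooling, and the decoder uses transposed convolutions.
Skip connections link the encoder and decoder at matching resolutions to preserve high-resolution features.
Trains on a single GPU with randomly cropped subvolumes (64x64x64), using batch normalization, the Adam optimizer, and a Dice-based loss for multi-class segmentation.
Applies data augmentation, including smooth spline deformations and random rotations and translations, to improve robustness and reduce overfitting.
At inference, processes the whole volume with overlapping tiles so that the output matches the input size.
Generates class probabilities for each voxel with a voxel-wise softmax and computes the total loss as a weighted sum of per-class Dice losses (all weights are set to 1 in this study).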
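
The softmax and Dice-loss step described above can be sketched in a few lines. This is a minimal, dependency-free illustration that works on flattened voxel lists. It assumes one common soft Dice formulation (2 x overlap / sum of predicted and true mass); the paper's exact form may differ, and the function names and the `eps` smoothing term are hypothetical choices made for this example. A real implementation would use a tensor framework on 3D subvolumes.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one voxel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multiclass_dice_loss(logits, labels, num_classes, weights=None, eps=1e-6):
    """Weighted sum of per-class soft Dice losses (illustrative sketch).

    logits: list of per-voxel lists, each of length num_classes
    labels: list of integer class labels, one per voxel
    eps:    assumed smoothing term that avoids division by zero
    """
    if weights is None:
        # The review states that all class weights are set to 1.
        weights = [1.0] * num_classes
    probs = [softmax(v) for v in logits]
    loss = 0.0
    for k in range(num_classes):
        p = [voxel[k] for voxel in probs]             # predicted probability of class k
        t = [1.0 if y == k else 0.0 for y in labels]  # one-hot ground truth for class k
        intersection = sum(pi * ti for pi, ti in zip(p, t))
        dice = (2.0 * intersection + eps) / (sum(p) + sum(t) + eps)
        loss += weights[k] * (1.0 - dice)
    return loss


# Two voxels, two classes: confident predictions that match the labels.
good = multiclass_dice_loss([[10.0, -10.0], [-10.0, 10.0]], [0, 1], num_classes=2)
# The same predictions scored against swapped labels.
bad = multiclass_dice_loss([[10.0, -10.0], [-10.0, 10.0]], [1, 0], num_classes=2)
```

With predictions that agree with the labels the loss is close to 0, and with fully wrong predictions it approaches the number of classes (here 2), since each class contributes a loss between 0 and its weight.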

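Experimental results

Research questions

RQ1 Can a 3D fully convolutional network trained end-to-end achieve accurate multi-organ segmentation in abdominal CT?
RQ2 What Dice scores does the proposed 3D FCN obtain for the artery, vein, liver, spleen, stomach, gallbladder, and pancreas?
RQ3 How do data augmentation and the network architecture (including skip connections) affect segmentation performance and robustness?
RQ4 Is the 3D U-Net-like FCN competitive with other state-of-the-art methods for CT organ segmentation?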

Key findings

Dataset	Organ	Avg. Dice (%)	Std.	Min.	Max.
Training	Artery	84.1	5.0	66.9	91.7
Training	Vein	77.5	8.9	29.2	89.2
Training	Liver	96.6	1.1	91.4	98.5
Training	Spleen	96.3	2.0	79.8	98.9
Training	Stomach	95.6	7.7	0.0	99.7
Training	Gallbladder	90.1	10.9	0.0	97.8
Training	Pancreas	85.5	8.9	28.0	95.5
Testing	Artery	83.5	4.1	73.7	91.1
Testing	Vein	80.5	6.8	49.0	89.4
Testing	Liver	97.1	1.0	93.5	98.3
Testing	Spleen	97.7	0.8	95.2	98.9
Testing	Stomach	96.1	7.9	49.4	98.9
Testing	Gallbladder	85.1	15.7	28.6	97.4
Testing	Pancreas	84.9	9.1	52.5	95.1

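On the training data, the mean Dice score across all organs is 89.4%; per-organ means range from 77.5% (vein) to 96.6% (liver).
On the testing data, the mean Dice score across all organs is 89.3%; per-organ means range from 80.5% (vein) to 97.7% (spleen).
Per-organ Dice at test time: artery 83.5%, vein 80.5%, liver 97.1%, spleen 97.7%, stomach 96.1%, gallbladder 85.1%, pancreas 84.9%.
The model, with about 19 million parameters, scales to 3D volumetric segmentation and runs inference in under one minute per case.
Data augmentation and volume-based processing reduce overfitting and enable robust multi-organ segmentation on whole CT volumes.
The approach shows competitive performance compared with other state-of-the-art architectures for 3D abdominal CT organ segmentation.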
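
The overall averages quoted above can be reproduced from the per-organ means in the table. The short check below is only an arithmetic illustration: it assumes the overall figure is an unweighted mean over the seven organs, which matches the reported numbers, and the variable and function names are hypothetical.

```python
# Per-organ mean Dice (%) copied from the table above.
train = {
    "artery": 84.1, "vein": 77.5, "liver": 96.6, "spleen": 96.3,
    "stomach": 95.6, "gallbladder": 90.1, "pancreas": 85.5,
}
test = {
    "artery": 83.5, "vein": 80.5, "liver": 97.1, "spleen": 97.7,
    "stomach": 96.1, "gallbladder": 85.1, "pancreas": 84.9,
}


def mean_dice(scores):
    """Unweighted mean of the per-organ Dice scores."""
    return sum(scores.values()) / len(scores)


train_mean = round(mean_dice(train), 1)
test_mean = round(mean_dice(test), 1)
print(f"training: {train_mean}%  testing: {test_mean}%")  # training: 89.4%  testing: 89.3%
```

The near-identical training and testing averages are consistent with the claim that augmentation keeps overfitting in check, although the per-organ spread (for example the gallbladder) is larger on the test set.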

This review was generated by AI and checked by a human editor.