QUICK REVIEW

[Paper Review] SAR-U-Net: squeeze-and-excitation block and atrous spatial pyramid pooling based residual U-Net for automatic liver CT segmentation.

Jinke Wang, Peiqing Lv | arXiv (Cornell University) | 2021-03-11

Radiomics and Machine Learning in Medical Imaging | Citations: 3

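One-Line Summary

This paper proposes SAR-U-Net, a 2D U-Net variant that combines squeeze-and-excitation (SE) blocks for attention-based feature recalibration, atrous spatial pyramid pooling (ASPP) for multi-scale context aggregation, and residual learning to make a deeper network trainable. The model reports mean Dice scores of 95.71% on LiTS17 and 97.31% on SLiver07, the highest among the closely related models it was compared with, and remains robust on challenging liver CT cases.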

ABSTRACT

Background and objective: In this paper, a modified U-Net based framework is presented, which leverages techniques from Squeeze-and-Excitation (SE) block, Atrous Spatial Pyramid Pooling (ASPP) and residual learning for accurate and robust liver CT segmentation, and the effectiveness of the proposed method was tested on two public datasets LiTS17 and SLiver07. Methods: A new network architecture called SAR-U-Net was designed. Firstly, the SE block is introduced to adaptively extract image features after each convolution in the U-Net encoder, while suppressing irrelevant regions, and highlighting features of specific segmentation task; Secondly, ASPP was employed to replace the transition layer and the output layer, and acquire multi-scale image information via different receptive fields. Thirdly, to alleviate the degradation problem, the traditional convolution block was replaced with the residual block and thus prompt the network to gain accuracy from considerably increased depth. Results: In the LiTS17 experiment, the mean values of Dice, VOE, RVD, ASD and MSD were 95.71, 9.52, -0.84, 1.54 and 29.14, respectively. Compared with other closely related 2D-based models, the proposed method achieved the highest accuracy. In the experiment of the SLiver07, the mean values of Dice, VOE, RVD, ASD and MSD were 97.31, 5.37, -1.08, 1.85 and 27.45, respectively. Compared with other closely related models, the proposed method achieved the highest segmentation accuracy except for the RVD. Conclusion: The proposed model enables a great improvement on the accuracy compared to 2D-based models, and its robustness in circumvent challenging problems, such as small liver regions, discontinuous liver regions, and fuzzy liver boundaries, is also well demonstrated and validated.

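Motivation and Objectives

Improve the accuracy and robustness of automatic liver segmentation, even when liver regions are small, discontinuous, or poorly defined.
Address the standard U-Net's limited ability to capture multi-scale contextual features and its failure to suppress irrelevant features during segmentation.
Alleviate the degradation problem common in very deep architectures so that deeper networks train effectively.
Validate the proposed architecture on the public benchmark datasets LiTS17 and SLiver07.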

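Proposed Method

An SE block is added after each convolution in the U-Net encoder; it adaptively recalibrates the feature maps by emphasizing task-relevant channels.
The standard transition and output layers are replaced with ASPP, which captures multi-scale context through parallel atrous (dilated) convolutions at different rates.
Standard convolution blocks are replaced with residual blocks, enabling a deeper network while alleviating the degradation problem during training.
The three components (SE, ASPP, and residual learning) are combined into a single U-Net based architecture, SAR-U-Net, for end-to-end liver segmentation.
The network is trained with standard supervised learning on two public liver CT datasets, using cross-entropy loss and Dice loss.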
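
The three building blocks listed above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names `se_recalibrate`, `atrous_conv1d`, and `residual_block`, the weight layout, the omission of biases, and the 1-D simplification of atrous convolution are all assumptions made for readability, and any dilation rates passed in are placeholders rather than values from the paper.

```python
import math


def se_recalibrate(feature_maps, w1, w2):
    """Squeeze-and-excitation over C channels, each an H x W nested list.

    Assumed layout: w1 is (C // r) x C, w2 is C x (C // r); biases are omitted.
    Returns the rescaled feature maps and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return scaled, gates


def atrous_conv1d(signal, kernel, rate):
    """1-D dilated convolution with zero padding; output length equals input length."""
    half = (len(kernel) // 2) * rate
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i - half + k * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def residual_block(x, transform):
    """Identity shortcut: the block outputs x + F(x) instead of F(x) alone."""
    return [a + b for a, b in zip(x, transform(x))]
```

In an ASPP-style module, several such convolutions with different rates run in parallel on the same input and their outputs are concatenated, so each branch covers a different receptive field without adding kernel weights.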

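Experimental Results

Research Questions

RQ1: Can integrating SE blocks improve feature representation in liver CT segmentation by focusing on the relevant channels?
RQ2: Can ASPP improve segmentation performance by capturing multi-scale contextual features in liver CT images?
RQ3: Can residual learning yield a deeper, more accurate U-Net architecture without the degradation problem?
RQ4: How do the segmentation accuracy and robustness of SAR-U-Net compare with existing 2D-based models on difficult liver CT cases?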

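Key Results

On LiTS17, SAR-U-Net reached a mean Dice score of 95.71%, the highest segmentation accuracy among the 2D-based models compared.
Also on LiTS17, the model recorded a mean VOE of 9.52%, RVD of -0.84%, ASD of 1.54 mm, and MSD of 29.14 mm, indicating high overlap and a low average surface distance.
On SLiver07, SAR-U-Net reached a Dice score of 97.31%, the highest among closely related models, with a VOE of 5.37%, RVD of -1.08%, ASD of 1.85 mm, and MSD of 27.45 mm; per the abstract, it led on every metric except RVD.
The model stayed robust on difficult cases such as small liver regions, discontinuous liver structures, and blurred boundaries.
Combining SE, ASPP, and residual learning substantially improved segmentation performance over the standard U-Net and related architectures.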
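
To make the overlap metrics above concrete, here is a minimal sketch of how Dice, VOE, and RVD are commonly computed for one case from binary masks. The function name `overlap_metrics`, the flat-list mask representation, and the sign convention for RVD (prediction minus ground truth, so negative means under-segmentation) are assumptions for illustration; the paper's exact evaluation code is not shown in this review. ASD and MSD are left out because they require extracting surface voxels and computing distances between them.

```python
def overlap_metrics(pred, truth):
    """Return (Dice, VOE, RVD) in percent for two equally sized flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    pred_vol = sum(1 for p in pred if p)
    truth_vol = sum(1 for t in truth if t)
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = pred_vol + truth_vol - inter
    if truth_vol == 0:
        raise ValueError("ground-truth mask must contain at least one foreground voxel")
    dice = 200.0 * inter / (pred_vol + truth_vol)       # 2|A∩B| / (|A| + |B|)
    voe = 100.0 * (1.0 - inter / union)                 # 1 - |A∩B| / |A∪B|
    rvd = 100.0 * (pred_vol - truth_vol) / truth_vol    # (|A| - |B|) / |B|
    return dice, voe, rvd


# Toy case: the prediction misses one of four ground-truth voxels.
dice, voe, rvd = overlap_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0])
```

Higher Dice and lower VOE both mean better overlap, while RVD close to zero means the predicted and reference volumes are similar in size.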

This review was generated by AI and reviewed by a human editor.