QUICK REVIEW

[Paper Review] PraNet: Parallel Reverse Attention Network for Polyp Segmentation

Deng-Ping Fan, Ge-Peng Ji | arXiv (Cornell University) | 2020-06-13

Advanced Image and Video Retrieval Techniques | References: 38 | Citations: 109

One-Line Summary

PraNet introduces a Parallel Partial Decoder and Reverse Attention modules to achieve real-time, high-accuracy polyp segmentation in colonoscopy images and to improve generalization across datasets.

ABSTRACT

Colonoscopy is an effective technique for detecting colorectal polyps, which are highly related to colorectal cancer. In clinical practice, segmenting polyps from colonoscopy images is of great importance since it provides valuable information for diagnosis and surgery. However, accurate polyp segmentation is a challenging task, for two major reasons: (i) the same type of polyps has a diversity of size, color and texture; and (ii) the boundary between a polyp and its surrounding mucosa is not sharp. To address these challenges, we propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images. Specifically, we first aggregate the features in high-level layers using a parallel partial decoder (PPD). Based on the combined feature, we then generate a global map as the initial guidance area for the following components. In addition, we mine the boundary cues using a reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues. Thanks to the recurrent cooperation mechanism between areas and boundaries, our PraNet is capable of calibrating any misaligned predictions, improving the segmentation accuracy. Quantitative and qualitative evaluations on five challenging datasets across six metrics show that our PraNet improves the segmentation accuracy significantly, and presents a number of advantages in terms of generalizability, and real-time segmentation efficiency.

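Research Motivation and Goals

Promote automatic, accurate polyp segmentation to assist colorectal polyp screening.
Handle the wide variation in polyp appearance and the blurry boundaries between polyps and the surrounding mucosa.
Develop a fast, generalizable network suited to real-time colonoscopy video.
Improve precision by exploiting both area cues and boundary cues through new architectural components.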

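Proposed Method

Use a parallel partial decoder (PPD) to aggregate high-level features and generate a global polyp map.
Use multiple reverse attention (RA) modules that erase the already-predicted polyp regions and refine the prediction, progressively mining boundary cues.
Train with a loss that combines weighted IoU and weighted BCE, applied to the global map and the side-output maps.
Adopt deep supervision over the multiple outputs to stabilize training and enable end-to-end learning.
Maintain real-time performance (~50 fps) on 352x352 inputs.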
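
The reverse attention and combined-loss items above can be made concrete with a small sketch in plain Python on toy lists rather than tensors. Two assumptions are made here that this review does not state: the reverse attention weight is taken as 1 - sigmoid(coarse prediction), and the per-pixel loss weights are simply passed in by the caller. The function names `reverse_attention` and `weighted_bce_iou_loss` are hypothetical and are not the authors' code.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def reverse_attention(feature, coarse_logits):
    """Scale each feature value by 1 - sigmoid(coarse logit).

    Assumption: pixels the coarse map already calls polyp get a weight
    near 0 (they are "erased"), while background and uncertain boundary
    pixels keep most of their response.
    Both arguments are 2D lists of the same shape.
    """
    return [
        [f * (1.0 - sigmoid(z)) for f, z in zip(f_row, z_row)]
        for f_row, z_row in zip(feature, coarse_logits)
    ]


def weighted_bce_iou_loss(logits, targets, weights, eps=1e-7):
    """Weighted BCE plus weighted soft-IoU loss over flat pixel lists.

    Assumption: `weights` holds one non-negative weight per pixel; how
    the paper derives those weights is not described in this review.
    """
    probs = [sigmoid(z) for z in logits]
    total_weight = sum(weights)

    # Weighted binary cross-entropy, normalized by the total weight.
    bce = -sum(
        w * (t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
        for p, t, w in zip(probs, targets, weights)
    ) / total_weight

    # Weighted soft IoU: intersection and union use probabilities.
    inter = sum(w * p * t for p, t, w in zip(probs, targets, weights))
    union = sum(w * (p + t) for p, t, w in zip(probs, targets, weights)) - inter
    iou_loss = 1.0 - (inter + eps) / (union + eps)

    return bce + iou_loss
```

Under deep supervision, a loss of this kind would be evaluated once for the global map and once for each side output, and the results summed. A confident, correct prediction drives both terms toward zero, while a confident, wrong one is penalized by both.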

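Experimental Results

Research Questions

RQ1: Can a parallel decoding strategy improve high-level feature aggregation for polyp segmentation?
RQ2: Does the reverse attention mechanism effectively refine boundaries by erasing earlier predictions?
RQ3: Can PraNet achieve real-time inference and superior accuracy across diverse polyp datasets?
RQ4: How do PPD and RA interact to improve both training speed and segmentation quality?

Key Results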

Dataset	Method	mean Dice	mean IoU	Fβ^w	Sα	Eφ^max	MAE
Kvasir	U-Net (MICCAI’15)	0.818	0.746	0.794	0.858	0.893	0.055
Kvasir	U-Net++ (TMI’19)	0.821	0.743	0.808	0.862	0.910	0.048
Kvasir	ResUNet-mod †	0.791	n/a	n/a	n/a	n/a	n/a
Kvasir	ResUNet++ †	0.813	0.793	n/a	n/a	n/a	n/a
Kvasir	SFA (MICCAI’19) [10]	0.723	0.611	0.670	0.782	0.849	0.075
Kvasir	PraNet (Ours)	0.898	0.840	0.885	0.915	0.948	0.030
CVC-612	U-Net (MICCAI’15)	0.823	0.755	0.811	0.889	0.954	0.019
CVC-612	U-Net++ (TMI’19)	0.794	0.729	0.785	0.873	0.931	0.022
CVC-612	SFA (MICCAI’19) [10]	0.700	0.607	0.647	0.793	0.885	0.042
CVC-612	PraNet (Ours)	0.899	0.849	0.896	0.936	0.979	0.009

PraNet outperforms state-of-the-art methods across five datasets and multiple metrics.
On Kvasir, PraNet achieves mean Dice of 0.898 and mean IoU of 0.840.
On CVC-612, PraNet achieves mean Dice of 0.899 and mean IoU of 0.849.
PraNet demonstrates strong generalization to unseen datasets, with significant gains over baselines.
Inference speed is real-time at ~50 fps for 352x352 inputs, with training convergence in ~20 epochs (~0.5 hours).
Ablation shows PPD and RA contribute additively to performance, with the full combination (PPD+RA+Backbone) yielding the best results.
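
For readers less familiar with the columns in the results table, the sketch below shows how Dice, IoU, and MAE are commonly computed for a single predicted mask. It is an illustration only: the function name `dice_iou_mae` and the fixed 0.5 threshold are assumptions, and the paper's exact evaluation protocol (for example, how thresholds are chosen or averaged) is not described in this review.

```python
def dice_iou_mae(pred, truth, threshold=0.5):
    """Compute Dice, IoU, and MAE for one image.

    pred:  flat list of predicted probabilities in [0, 1]
    truth: flat list of ground-truth labels (0 or 1)
    The 0.5 threshold is an assumption made for illustration.
    """
    # Binarize the prediction for the overlap-based metrics.
    binary = [1 if p >= threshold else 0 for p in pred]

    inter = sum(b * t for b, t in zip(binary, truth))
    pred_area = sum(binary)
    true_area = sum(truth)
    union = pred_area + true_area - inter

    # Treat two empty masks as a perfect match.
    dice = 2 * inter / (pred_area + true_area) if pred_area + true_area else 1.0
    iou = inter / union if union else 1.0

    # MAE compares the raw probabilities with the labels, pixel by pixel.
    mae = sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)
    return dice, iou, mae


# Toy example: four pixels, one true polyp pixel, one false positive.
dice, iou, mae = dice_iou_mae([0.9, 0.8, 0.2, 0.1], [1, 0, 0, 0])
```

For a single image the two overlap scores are tied together by Dice = 2·IoU / (1 + IoU). That identity does not carry over to the mean Dice and mean IoU columns, because each is averaged over images separately.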


This review was generated by AI and checked by a human editor.