QUICK REVIEW

[Paper Review] GRAFNet: Multiscale Retinal Processing via Guided Cortical Attention Feedback for Enhancing Medical Image Polyp Segmentation

Abdul Joseph Fofanah, Lian Wen | arXiv (Cornell University) | 2026-02-15

Retinal Imaging and Analysis | Citations: 0

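One-line summary

GRAFNet introduces a biologically inspired architecture built from Guided Asymmetric Attention, Multiscale Retinal Modules, and Guided Cortical Attention Feedback to improve polyp segmentation, achieving state-of-the-art results and better generalisation on five benchmarks.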

ABSTRACT

Accurate polyp segmentation in colonoscopy is essential for cancer prevention but remains challenging due to: (1) high morphological variability (from flat to protruding lesions), (2) strong visual similarity to normal structures such as folds and vessels, and (3) the need for robust multi-scale detection. Existing deep learning approaches suffer from unidirectional processing, weak multi-scale fusion, and the absence of anatomical constraints, often leading to false positives (over-segmentation of normal structures) and false negatives (missed subtle flat lesions). We propose GRAFNet, a biologically inspired architecture that emulates the hierarchical organisation of the human visual system. GRAFNet integrates three key modules: (1) a Guided Asymmetric Attention Module (GAAM) that mimics orientation-tuned cortical neurones to emphasise polyp boundaries, (2) a MultiScale Retinal Module (MSRM) that replicates retinal ganglion cell pathways for parallel multi-feature analysis, and (3) a Guided Cortical Attention Feedback Module (GCAFM) that applies predictive coding for iterative refinement. These are unified in a Polyp Encoder-Decoder Module (PEDM) that enforces spatial-semantic consistency via resolution-adaptive feedback. Extensive experiments on five public benchmarks (Kvasir-SEG, CVC-300, CVC-ColonDB, CVC-Clinic, and PolypGen) demonstrate consistent state-of-the-art performance, with 3-8% Dice improvements and 10-20% higher generalisation over leading methods, while offering interpretable decision pathways. This work establishes a paradigm in which neural computation principles bridge the gap between AI accuracy and clinically trustworthy reasoning. Code is available at https://github.com/afofanah/GRAFNet.

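Motivation and objectives

Enable accurate polyp segmentation in colonoscopy across diverse morphologies and imaging conditions.
Develop a biologically plausible architecture that integrates retinal pathways and cortical feedback.
Enforce spatial-semantic consistency through an encoder-decoder with resolution-adaptive feedback.
Provide interpretable decision pathways through attention-based, feedback-driven processing.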

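Proposed method

Introduce GAAM to emulate orientation-tuned V1 neurones for boundary-enhancing attention.
Implement MSRM to replicate the parallel retinal pathways (parvocellular, magnocellular, koniocellular, ON–OFF) for multi-feature analysis.
Add GCAFM to apply predictive coding and refine features using high-level anatomical priors.
Incorporate a Polyp Encoder–Decoder Module (PEDM) for hierarchical, resolution-adaptive feedback adjustment.
Train with a bio-inspired loss, L_BIO, that combines the segmentation loss with feedback-consistency and attention-guidance terms.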
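
The composite loss in the last item can be illustrated with a minimal sketch. The review only says that L_BIO combines a segmentation loss with feedback-consistency and attention-guidance terms, so everything concrete here is an assumption: the Dice + BCE segmentation loss, the mean-squared-error form of the two auxiliary terms, the choice of what each term compares, the weights `lam_fb` and `lam_att`, and all function names. It is not the authors' implementation.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over flat lists of probabilities and 0/1 labels."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped away from 0 and 1."""
    acc = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        acc -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return acc / len(pred)

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def bio_loss(pred, target, refined, initial, attention, boundary,
             lam_fb=0.1, lam_att=0.1):
    """Hypothetical L_BIO = L_seg + lam_fb * L_fb + lam_att * L_att."""
    # Assumed segmentation term: Dice + BCE, a common pairing for polyp masks.
    l_seg = dice_loss(pred, target) + bce_loss(pred, target)
    # Assumed feedback-consistency term: refined vs. initial prediction.
    l_fb = mse(refined, initial)
    # Assumed attention-guidance term: attention map vs. a boundary prior.
    l_att = mse(attention, boundary)
    return l_seg + lam_fb * l_fb + lam_att * l_att
```

In this sketch, when the refined prediction equals the initial one and the attention map matches the boundary prior, both auxiliary terms vanish and the loss reduces to the segmentation term alone.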

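Experimental results

Research questions

RQ1: Does cortical feedback improve segmentation performance over standard attention and state-of-the-art methods?
RQ2: Do multiscale retinal pathways reduce false positives on normal anatomy?
RQ3: Does asymmetric (orientation-tuned) attention help detect subtle flat lesions?
RQ4: Does guided feedback prevent attention drift across scales?
RQ5: What does each biological module contribute to performance (ablation study)?
RQ6: Does the neurobiological design improve cross-dataset generalisation?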

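Key results

GRAFNet achieves state-of-the-art segmentation performance on five datasets, with Dice improvements of 3–8% and 10–20% higher generalisation than leading methods.
On CVC-ClinicDB and Kvasir-SEG, Dice reaches 0.9290 and 0.9146, respectively, with BF1 of about 0.9090 and 0.9163.
On CVC-ColonDB and CVC-300, it achieves the best or near-best scores on most metrics, including Dice and IoU (with Dice as high as 0.9461 in some comparisons).
The ablation study shows that MSRM provides the first substantial gain, followed by GAAM and GCAFM; the full GRAFNet reaches Dice 0.9425 on ClinicDB/Kvasir-SEG and Dice 0.9461/0.8896 on CVC-ColonDB/CVC-300, respectively.
GRAFNet reduces false positives on normal anatomy, reflected in a lower false positive rate (FPR) and a higher negative predictive value (NPV), and shows strong cross-scale attention stability (high AC/SC scores).
Subtle flat lesions and small polyps benefit from asymmetric attention, with consistent Dice gains in the flat (<3 mm) and subtle (3–5 mm) lesion categories.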
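
For readers unfamiliar with the metrics quoted above, the following sketch computes Dice, IoU, FPR, and NPV from a pair of binary masks using their standard confusion-matrix definitions. The function names and the toy masks are illustrative assumptions, and the paper's own evaluation code may differ (for example in thresholding or per-image averaging). BF1 and the AC/SC scores are left out because the review does not define them.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over flat 0/1 masks of equal length."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    return tp, fp, fn, tn

def safe_div(num, den):
    # Convention chosen for this sketch: an undefined ratio is reported as 0.0.
    return num / den if den else 0.0

def segmentation_metrics(pred, target):
    """Return Dice, IoU, false positive rate, and negative predictive value."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "fpr": safe_div(fp, fp + tn),
        "npv": safe_div(tn, tn + fn),
    }

# Toy masks (1 = polyp pixel): 2 true positives, 1 false positive,
# 1 false negative, 4 true negatives.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
target = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, target)
```

On these toy masks Dice is 4/6, IoU is 2/4, FPR is 1/5, and NPV is 4/5. Dice is never lower than IoU on the same prediction, which is worth keeping in mind when comparing the two columns of a results table.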

This review was generated by AI and reviewed by a human editor.