[Paper Review] HarDNet-MSEG: A Simple Encoder-Decoder Polyp Segmentation Neural Network that Achieves over 0.9 Mean Dice and 86 FPS
HarDNet-MSEG uses a HarDNet68 backbone with a cascaded partial decoder to deliver state-of-the-art polyp segmentation accuracy (mean Dice >0.9 on Kvasir-SEG) with high speed (86 FPS).
We propose a new convolutional neural network called HarDNet-MSEG for polyp segmentation. It achieves SOTA in both accuracy and inference speed on five popular datasets. For Kvasir-SEG, HarDNet-MSEG delivers 0.904 mean Dice running at 86.7 FPS on a GeForce RTX 2080 Ti GPU. It consists of a backbone and a decoder. The backbone is a low-memory-traffic CNN called HarDNet68, which has been successfully applied to various CV tasks including image classification, object detection, multi-object tracking and semantic segmentation. The decoder is inspired by the Cascaded Partial Decoder, known for fast and accurate salient object detection. We have evaluated HarDNet-MSEG using those five popular datasets. The code and all experiment details are available on GitHub: https://github.com/james128333/HarDNet-MSEG
Motivation & Objective
- Motivate fast, accurate polyp segmentation for colorectal cancer (CRC) prevention via colonoscopy imaging.
- Propose a simple encoder-decoder architecture with a memory-efficient backbone.
- Evaluate on five major polyp datasets to establish SOTA accuracy and speed.
- Compare against U-Net, PraNet, and other leading models to quantify gains in Dice, IoU, and FPS.
Proposed method
- Adopt HarDNet68 as backbone to reduce memory traffic and increase inference speed.
- Use a simple encoder-decoder architecture with a cascaded partial decoder inspired by fast salient object detection.
- Incorporate a Receptive Field Block (RFB) in skip connections to enlarge receptive fields.
- Apply dense aggregation via element-wise multiplication after up-sampling to fuse features.
- Train under two different settings, each taken from prior work, to ensure robust comparison across datasets.
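The fusion step in the list above (up-sample a deeper feature map, then combine it with a shallower one by element-wise multiplication) can be sketched in a few lines. This is only a minimal illustration on plain nested lists, not the authors' implementation: the function names `upsample_nearest` and `multiply_fuse` are hypothetical, nearest-neighbour up-sampling is an assumption made for simplicity, and the convolutions and RFB modules that surround this step in the real decoder are omitted.

```python
from typing import List

Grid = List[List[float]]  # a single-channel 2D feature map


def upsample_nearest(feature: Grid, scale: int = 2) -> Grid:
    """Nearest-neighbour up-sampling: repeat each value `scale` times along both axes.

    Assumption: the up-sampling mode is chosen here for simplicity only.
    """
    out: Grid = []
    for row in feature:
        wide = [v for v in row for _ in range(scale)]      # stretch columns
        out.extend([list(wide) for _ in range(scale)])     # stretch rows
    return out


def multiply_fuse(deep: Grid, shallow: Grid, scale: int = 2) -> Grid:
    """Up-sample the deeper (coarser) map and multiply it element-wise with the shallower one."""
    up = upsample_nearest(deep, scale)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shapes do not match after up-sampling")
    return [[a * b for a, b in zip(r_up, r_sh)] for r_up, r_sh in zip(up, shallow)]


# Toy inputs: a 1x2 coarse map acts as a gate on a 2x4 finer map.
deep = [[1.0, 0.0]]
shallow = [[0.5, 0.5, 0.9, 0.9],
           [0.2, 0.8, 0.9, 0.1]]
fused = multiply_fuse(deep, shallow)
print(fused)  # [[0.5, 0.5, 0.0, 0.0], [0.2, 0.8, 0.0, 0.0]]
```

In this toy case the coarse map behaves like a gate: wherever the deeper feature is zero, the shallower response is suppressed, which is one intuition for why multiplication is used for fusion.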
Experimental results
Research questions
- RQ1: Can HarDNet-MSEG surpass current SOTA polyp segmentation methods in mean Dice and IoU across standard datasets?
- RQ2: Does a simple encoder-decoder with HarDNet68 backbone achieve competitive accuracy while maintaining high inference speed?
- RQ3: What is the impact of a cascaded partial decoder and RFB-enabled skip connections on boundary accuracy and small polyp segmentation?
- RQ4: How does HarDNet-MSEG perform on Kvasir-SEG, CVC-ColonDB, EndoScene, ETIS-Larib Polyp DB, and CVC-Clinic DB relative to PraNet and U-Net variants?
Key findings
- HarDNet-MSEG achieves state-of-the-art mean Dice and mIoU across all five datasets tested.
- On Kvasir-SEG, it delivers 0.904 mean Dice at 86.7 FPS on an RTX 2080 Ti.
- Consistently outperforms U-Net[ResNet34] and PraNet in mean Dice and mIoU metrics.
- Demonstrates faster inference (FPS) than several competing models while maintaining or improving accuracy.
- Demonstrates strong boundary delineation and overall segmentation quality in qualitative results.
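For readers less familiar with the metrics quoted in the findings above, the sketch below computes Dice and IoU for binary masks, averages them over a set of images, and converts an FPS figure into per-frame latency. It assumes flattened 0/1 masks and a simple per-image average; the helper names (`dice_and_iou`, `mean_scores`, `fps_to_latency_ms`) and the convention of scoring two empty masks as 1.0 are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
from typing import Iterable, Sequence, Tuple


def dice_and_iou(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Dice = 2|P∩T| / (|P| + |T|) and IoU = |P∩T| / |P∪T| for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    if pred_area + target_area == 0:
        return 1.0, 1.0  # assumption: two empty masks count as a perfect match
    union = pred_area + target_area - inter
    return 2 * inter / (pred_area + target_area), inter / union


def mean_scores(pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]) -> Tuple[float, float]:
    """Average Dice and IoU over (prediction, ground truth) pairs, one pair per image."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


def fps_to_latency_ms(fps: float) -> float:
    """Convert frames per second into milliseconds per frame."""
    return 1000.0 / fps


# Two toy 4-pixel "images": one partial overlap, one perfect match.
pairs = [([1, 1, 0, 0], [1, 0, 0, 0]),
         ([1, 1, 1, 0], [1, 1, 1, 0])]
mean_dice, mean_iou = mean_scores(pairs)
print(round(mean_dice, 3), round(mean_iou, 3))  # 0.833 0.75
print(round(fps_to_latency_ms(86.7), 1))        # 11.5
```

By this conversion, the reported 86.7 FPS corresponds to roughly 11.5 ms per frame.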
This review was created by AI and reviewed by human editors.