QUICK REVIEW

[Paper Review] Redefining the Down-Sampling Scheme of U-Net for Precision Biomedical Image Segmentation

Mingjie Li, Yizheng Chen|arXiv (Cornell University)|Feb 23, 2026
Advanced Neural Network Applications | Cited by 0
One-Sentence Summary

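This paper proposes Stair Pooling, a down-sampling strategy for U-Net that preserves information by chaining a sequence of small, narrow pooling operations. It raises Dice scores by about 3.8% on average on BIS benchmarks and uses transfer-entropy-based path optimization to reduce the number of down-sampling paths.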

ABSTRACT

U-Net architectures have been instrumental in advancing biomedical image segmentation (BIS) but often struggle with capturing long-range information. One reason is the conventional down-sampling techniques that prioritize computational efficiency at the expense of information retention. This paper introduces a simple but effective strategy, we call it Stair Pooling, which moderates the pace of down-sampling and reduces information loss by leveraging a sequence of concatenated small and narrow pooling operations in varied orientations. Specifically, our method modifies the reduction in dimensionality within each 2D pooling step from $\frac{1}{4}$ to $\frac{1}{2}$. This approach can also be adapted for 3D pooling to preserve even more information. Such preservation aids the U-Net in more effectively reconstructing spatial details during the up-sampling phase, thereby enhancing its ability to capture long-range information and improving segmentation accuracy. Extensive experiments on three BIS benchmarks demonstrate that the proposed Stair Pooling can increase both 2D and 3D U-Net performance by an average of 3.8% in Dice scores. Moreover, we leverage the transfer entropy to select the optimal down-sampling paths and quantitatively show how the proposed Stair Pooling reduces the information loss.
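
As a quick check of the arithmetic in the abstract, consider a feature map of size $H \times W$ pooled with non-overlapping windows. The symbols $H$ and $W$ and the assumption of stride equal to kernel size are introduced here for illustration. A conventional $2 \times 2$ pooling keeps one quarter of the elements in a single step:

$$
H \times W \;\xrightarrow{\;2 \times 2\;}\; \frac{H}{2} \times \frac{W}{2} = \frac{1}{4} HW
$$

A stair of one $1 \times 2$ and one $2 \times 1$ pooling reaches the same final size, but each individual step keeps one half:

$$
H \times W \;\xrightarrow{\;1 \times 2\;}\; H \times \frac{W}{2} = \frac{1}{2} HW \;\xrightarrow{\;2 \times 1\;}\; \frac{H}{2} \times \frac{W}{2} = \frac{1}{4} HW
$$

By the same reasoning, and assuming the 3D baseline is a $2 \times 2 \times 2$ pooling, a single conventional 3D step keeps $\frac{1}{8}$ of the elements, whereas three narrow steps would each keep $\frac{1}{2}$.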

Motivation and Objectives

  • Improve U-Net's ability to capture long-range information without resorting to heavy attention-based models.
  • Propose Stair Pooling to slow down down-sampling and preserve critical features.
  • Extend Stair Pooling to 3D pooling for volumetric BIS tasks.
  • Introduce transfer entropy to select optimal down-sampling paths.
  • Demonstrate performance gains on 2D and 3D BIS benchmarks.

Proposed Method

  • Replace single 2x2 pooling with a sequence of concatenated 1x2 and 2x1 pooling operations.
  • Follow each pooling with a convolution and ReLU to break linear correlations between paths.
  • Fuse features from all pooling paths to form the final down-sampled representation.
  • Reduce 2D down-sampling dimensionality from 1/4 to 1/2 per step, extending to 3D pooling.
  • Compute the entropy of feature maps under a Gaussian approximation, then use transfer entropy (TE) to select the optimal down-sampling paths.
  • Run an exhaustive search for the best down-sampling paths in 2D, and discuss TE-guided path pruning.
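
The Gaussian entropy estimate mentioned in the list above can be sketched as follows. This is a minimal illustration, not the paper's procedure: it fits a single univariate Gaussian to all values of a feature map and uses the standard closed form $h = \frac{1}{2}\ln(2\pi e \sigma^2)$. The function name `gaussian_entropy`, the variance floor, and the univariate treatment are assumptions made here; how the paper aggregates entropy across channels and builds the transfer entropy from it is not specified in this summary.

```python
import math

def gaussian_entropy(values):
    """Differential entropy (in nats) of a univariate Gaussian fitted to `values`.

    Assumption: the feature map is flattened and treated as samples of one
    Gaussian variable; this is an illustrative simplification.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-12)  # hypothetical floor to avoid log(0) on constant maps
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Two toy "feature maps": the second has twice the spread of the first.
narrow = [0.0, 2.0]
wide = [0.0, 4.0]

h_narrow = gaussian_entropy(narrow)
h_wide = gaussian_entropy(wide)
```

Doubling the spread multiplies the variance by four, so the estimate grows by exactly $\ln 2$ nats. A comparison of this kind between feature maps is one way an entropy-based criterion can rank candidate paths.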
Figure 1 : The overview of our proposed Stair Pooling. It splits the original max pooling layer into a series of concatenated small and narrow pooling kernels. To break the linear relationship, each pooling operation is followed by a convolutional layer and a ReLU activation.
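
The structure described in the Figure 1 caption can be sketched on a toy single-channel map in plain Python. This is a sketch under stated assumptions, not the authors' implementation: `conv_relu` is a scalar stand-in for the learned convolution and ReLU, element-wise averaging is an assumed fusion rule (the summary does not say how paths are fused), and all function names are hypothetical.

```python
# Toy, single-channel sketch of Stair Pooling in plain Python (no framework).
# Assumptions (not from the paper): the scalar "conv" stand-in, averaging as
# the fusion rule, and every function name below.

def pool_1x2(x):
    """Max-pool along the width: (H, W) -> (H, W // 2)."""
    return [[max(row[j], row[j + 1]) for j in range(0, len(row) - 1, 2)]
            for row in x]

def pool_2x1(x):
    """Max-pool along the height: (H, W) -> (H // 2, W)."""
    return [[max(a, b) for a, b in zip(x[i], x[i + 1])]
            for i in range(0, len(x) - 1, 2)]

def conv_relu(x, weight=1.0, bias=0.0):
    """Stand-in for the conv + ReLU after each pooling step (scalar 1x1 conv)."""
    return [[max(0.0, weight * v + bias) for v in row] for row in x]

def stair_pool(x):
    """Run both stair paths and fuse them by element-wise averaging."""
    # Path A: pool along the width first, then along the height.
    a = conv_relu(pool_2x1(conv_relu(pool_1x2(x))))
    # Path B: pool along the height first, then along the width.
    b = conv_relu(pool_1x2(conv_relu(pool_2x1(x))))
    return [[(p + q) / 2 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
]
half = pool_1x2(feature_map)   # 4 x 2: half of the elements remain
out = stair_pool(feature_map)  # 2 x 2: same size as one 2x2 pooling
```

With the identity stand-in used here, both paths collapse to an ordinary 2x2 max pooling, because a maximum of maxima is the maximum over the whole window. The paths only differ once a learned convolution and ReLU sit between the two narrow pooling steps, which is the role the caption assigns to those layers.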

Experimental Results

Research Questions

  • RQ1: Can Stair Pooling improve U-Net performance on BIS by preserving long-range information during down-sampling?
  • RQ2: How does the TE-based path selection affect model efficiency and segmentation accuracy?
  • RQ3: Does Stair Pooling extend effectively to 3D BIS tasks and what are the trade-offs?
  • RQ4: Which down-sampling paths are preferred across 2D versus 3D BIS datasets?
  • RQ5: What is the overall Dice gain across standard BIS benchmarks when using Stair Pooling?

Key Findings

  • Stair Pooling raises the Dice scores of both 2D and 3D U-Net by an average of 3.8% across three BIS benchmarks.
  • The TE-selected variant achieves higher overall Dice and improves several organ-specific metrics.
  • Stair Pooling outperforms other pooling strategies (e.g., Haar wavelet, pyramid pooling) on 2D BIS tasks.
  • The TE-based path optimization can reduce model size while maintaining or increasing performance (example on Synapse).
  • 3D Stair Pooling extends to volumetric BIS with strong 3D segmentation results (KiTS23).
  • Optimal down-sampling paths differ by dimensionality: the 2D datasets prefer horizontal-first pooling, while the 3D KiTS23 dataset favors initial z-axis pooling.
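
For readers less familiar with the metric behind these numbers, the Dice score on binary masks is $2|A \cap B| / (|A| + |B|)$. The snippet below is a minimal sketch of that standard formula; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made here, and the paper's evaluation code may differ (for example, in how per-organ scores are averaged).

```python
def dice_score(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary 2D masks."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy masks: 3 predicted pixels, 3 true pixels, 2 of them overlapping.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```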
Figure 2 : Qualitative comparison of different approaches on the Synapse dataset. From left to right: Ground Truth, U-Net, SwinUnet, UNet with HWT pooling, our SP UNet and the TE selected variant.

This review was generated by AI and reviewed by a human editor.