QUICK REVIEW

[Paper Digest] U-Net v2: Rethinking the Skip Connections of U-Net for Medical Image Segmentation

Yaopeng Peng, Milan Sonka|arXiv (Cornell University)|Nov 29, 2023

Brain Tumor Detection and Classification | Cited by 46

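ONE-SENTENCE SUMMARY

U-Net v2 introduces a Semantics and Detail Infusion (SDI) module that refines multi-level encoder features by combining higher-level semantics with lower-level details through the Hadamard product, improving medical image segmentation accuracy while keeping computation efficient.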

ABSTRACT

In this paper, we introduce U-Net v2, a new robust and efficient U-Net variant for medical image segmentation. It aims to augment the infusion of semantic information into low-level features while simultaneously refining high-level features with finer details. For an input image, we begin by extracting multi-level features with a deep neural network encoder. Next, we enhance the feature map of each level by infusing semantic information from higher-level features and integrating finer details from lower-level features through Hadamard product. Our novel skip connections empower features of all the levels with enriched semantic characteristics and intricate details. The improved features are subsequently transmitted to the decoder for further processing and segmentation. Our method can be seamlessly integrated into any Encoder-Decoder network. We evaluate our method on several public medical image segmentation datasets for skin lesion segmentation and polyp segmentation, and the experimental results demonstrate the segmentation accuracy of our new method over state-of-the-art methods, while preserving memory and computational efficiency. Code is available at: https://github.com/yaoppeng/U-Net_v2

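MOTIVATION AND GOALS

Motivate and address the limitations of U-Net's conventional skip connections in medical image segmentation.
Propose a lightweight mechanism that, at every encoder level, fuses semantic information from higher-level features with fine details from lower-level features.
Show that SDI-enhanced features improve segmentation accuracy on skin lesion and polyp datasets without excessive memory use or FLOPs.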

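PROPOSED METHOD

Extract multi-level features with the encoder.
Apply spatial and channel attention to each level, then a 1x1 convolution that reduces the channels to a common latent dimension.
For each target level, resize and smooth the features of every level to that level's resolution, then combine them with the Hadamard product to produce the SDI-enhanced features passed to the decoder.
Integrate the SDI module into an arbitrary Encoder-Decoder network and evaluate it on skin lesion and polyp segmentation datasets.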
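
The resize-and-multiply step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (their code is at the repository linked in the abstract). Several things here are assumptions made only for illustration: the attention and 1x1 reduction steps are skipped by starting from features that already share a channel count, the maps are square with integer size ratios, average pooling and nearest-neighbour repetition stand in for the resizing operators, and a fixed mean filter with an arbitrarily chosen kernel size stands in for the learned smoothing convolution. The helper names `resize_to`, `smooth`, and `sdi_fuse` are hypothetical.

```python
import numpy as np

def resize_to(feat, size):
    """Resize a (C, H, W) map to (C, size, size); assumes integer ratios."""
    c, h, w = feat.shape
    if h == size:
        return feat
    if h > size:
        # Downsample with non-overlapping average pooling (assumed operator).
        k = h // size
        return feat.reshape(c, size, k, size, k).mean(axis=(2, 4))
    # Upsample by repeating each pixel k times per axis (assumed operator).
    k = size // h
    return feat.repeat(k, axis=1).repeat(k, axis=2)

def smooth(feat):
    """Fixed mean filter with edge padding; a stand-in for a learned conv."""
    c, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros(feat.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / 9.0

def sdi_fuse(features):
    """Return one fused map per level; features[i] has shape (C, H_i, W_i)."""
    fused = []
    for target in features:
        size = target.shape[1]
        out = np.ones(target.shape, dtype=float)
        for f in features:
            # Hadamard (element-wise) product across all resized levels.
            out = out * smooth(resize_to(f, size))
        fused.append(out)
    return fused

rng = np.random.default_rng(0)
feats = [rng.random((4, s, s)) for s in (16, 8, 4, 2)]  # toy 4-level pyramid
fused = sdi_fuse(feats)
print([f.shape for f in fused])
```

Each fused map keeps the resolution of its own level, so the decoder can consume the outputs in place of the original skip features.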

Fig. 1 : (a) The overall architecture of our U-Net v2 model, which consists of an Encoder, the SDI (semantics and detail infusion) module, and a Decoder. (b) The architecture of the SDI module. For simplicity, we only show the refinement of the third level features ( $l=3$ ). SmoothConv denotes a $3

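EXPERIMENTAL RESULTS

Research Questions

RQ1: Does the SDI module enrich the features at each level by infusing higher-level semantics and lower-level details?
RQ2: Can Hadamard-product fusion at each encoder level deliver better segmentation while remaining computationally efficient?
RQ3: How does U-Net v2 with SDI compare with state-of-the-art methods on the ISIC skin lesion datasets and public polyp segmentation datasets?

Main Findings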

Dataset	Method	DSC (%)	IoU (%)	MAE (polyp datasets only)
ISIC 2017	U-Net (baseline)	86.99	76.98
ISIC 2017	TransFuse	88.40	79.21
ISIC 2017	MALUNet	88.13	78.78
ISIC 2017	EGE-UNet	88.77	79.81
ISIC 2017	U-Net v2 (ours)	90.21	82.17
ISIC 2018	U-Net (baseline)	87.55	77.86
ISIC 2018	UNet++	87.83	78.31
ISIC 2018	TransFuse	89.27	80.63
ISIC 2018	SANet	88.59	79.52
ISIC 2018	EGE-UNet	89.04	80.25
ISIC 2018	U-Net v2 (ours)	91.52	84.15
Kvasir-SEG	U-Net v2 (ours, PVT encoder)	92.8	88.0	0.019
ClinicDB	U-Net v2 (ours, PVT encoder)	94.4	89.6	0.006
ColonDB	U-Net v2 (ours, PVT encoder)	81.2	73.1	0.030
ETIS	U-Net v2 (ours, PVT encoder)	79.0	70.5	0.013
Endoscene	U-Net v2 (ours, PVT encoder)	89.7	83.1	0.007
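
For readers unfamiliar with the three columns, the sketch below gives the standard textbook definitions of the Dice similarity coefficient (DSC), intersection over union (IoU), and mean absolute error (MAE) for a single binary mask. This digest does not specify how the paper thresholds predictions or averages over images, so the function names and the empty-mask convention here are assumptions for illustration only.

```python
import numpy as np

def dice(pred, target):
    """DSC = 2|P ∩ T| / (|P| + |T|) for boolean masks."""
    inter = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    return 2.0 * inter / total if total else 1.0

def iou(pred, target):
    """IoU = |P ∩ T| / |P ∪ T| for boolean masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0

def mae(prob, target):
    """Mean absolute difference between a probability map and the mask."""
    return np.abs(prob - target.astype(float)).mean()

pred = np.array([[True, True], [False, False]])
target = np.array([[True, False], [False, False]])
print(dice(pred, target), iou(pred, target), mae(pred.astype(float), target))
```

The table reports DSC and IoU as percentages, so values from functions like these would be multiplied by 100 before comparison.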

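On ISIC 2017 and ISIC 2018, U-Net v2 surpasses several state-of-the-art methods; relative to EGE-UNet in the table above, DSC improves by 1.44 and 2.48 percentage points and IoU by 2.36 and 3.90 percentage points, respectively.
On the polyp datasets, U-Net v2 outperforms methods such as Polyp-PVT, with consistent DSC and IoU gains on Kvasir-SEG, ClinicDB, ColonDB, and ETIS.
Ablation studies show that the SDI module contributes most to performance: removing SDI degrades the results, and removing only the skip connections also lowers performance, highlighting the advantage of SDI over concatenation-based fusion.
Compared with U-Net and UNet++ baselines that use a PVT backbone, U-Net v2 delivers competitive or better segmentation performance while keeping GPU memory usage low and FLOPs and FPS favorable.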
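
The ISIC margins quoted in the findings can be rechecked directly from the table. The short script below is only an arithmetic check on the listed numbers; the dictionary layout and the `gain` helper are hypothetical names introduced for this example.

```python
# (DSC %, IoU %) pairs copied from the table in this digest.
isic = {
    "ISIC 2017": {
        "TransFuse": (88.40, 79.21),
        "MALUNet": (88.13, 78.78),
        "EGE-UNet": (88.77, 79.81),
        "U-Net v2": (90.21, 82.17),
    },
    "ISIC 2018": {
        "TransFuse": (89.27, 80.63),
        "SANet": (88.59, 79.52),
        "EGE-UNet": (89.04, 80.25),
        "U-Net v2": (91.52, 84.15),
    },
}

def gain(dataset, reference):
    """Percentage-point gain of U-Net v2 over a reference method."""
    ours = isic[dataset]["U-Net v2"]
    ref = isic[dataset][reference]
    return tuple(round(o - r, 2) for o, r in zip(ours, ref))

for name in isic:
    print(name, "vs EGE-UNet:", gain(name, "EGE-UNet"))
print("ISIC 2018 vs TransFuse:", gain("ISIC 2018", "TransFuse"))
```

The gains over EGE-UNet come out as 1.44 and 2.36 points on ISIC 2017 and 2.48 and 3.90 points on ISIC 2018. Against TransFuse, which has the highest listed ISIC 2018 scores among the compared methods, the margin is 2.25 DSC points and 3.52 IoU points.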

Fig. 2 : Example segmentations from ISIC 2017 dataset. We use PVT as the encoder for U-Net and UNet++.

This digest was generated by AI and reviewed by human editors.