QUICK REVIEW

[Paper Review] UNeXt: MLP-based Rapid Medical Image Segmentation Network

Jeya Maria Jose Valanarasu, Vishal M. Patel|arXiv (Cornell University)|Mar 9, 2022

Advanced Neural Network Applications | Cited by 54

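One-Sentence Summary

UNeXt pairs an early convolutional stage with tokenized MLP blocks for medical image segmentation, reaching state-of-the-art performance with fewer parameters, less computation, and faster CPU inference than TransUNet and UNet variants, which makes it practical for point-of-care clinical use.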

ABSTRACT

UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively adopted for rapid image segmentation in point-of-care applications as they are parameter-heavy, computationally complex and slow to use. To this end, we propose UNeXt which is a Convolutional multilayer perceptron (MLP) based network for image segmentation. We design UNeXt in an effective way with an early convolutional stage and a MLP stage in the latent stage. We propose a tokenized MLP block where we efficiently tokenize and project the convolutional features and use MLPs to model the representation. To further boost the performance, we propose shifting the channels of the inputs while feeding in to MLPs so as to focus on learning local dependencies. Using tokenized MLPs in latent space reduces the number of parameters and computational complexity while being able to result in a better representation to help segmentation. The network also consists of skip connections between various levels of encoder and decoder. We test UNeXt on multiple medical image segmentation datasets and show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance over the state-of-the-art medical image segmentation architectures. Code is available at https://github.com/jeya-maria-jose/UNeXt-pytorch

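Motivation and Objectives

Enable rapid point-of-care image segmentation under limited compute resources.
Develop a lightweight encoder-decoder architecture that combines convolutional and tokenized MLP components.
Introduce a tokenized MLP block with axial shifts to model latent representations efficiently.
Maintain or improve segmentation accuracy while substantially reducing parameter count and FLOPs.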

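Proposed Method

A two-stage architecture: an initial convolutional stage followed by a tokenized MLP stage.
The tokenized MLP block projects convolutional features into tokens and applies shifted MLPs to model local dependencies.
Axial shifts along width (W) and height (H), applied before tokenization, introduce locality; the Tok-MLP block also uses a depthwise convolution and GELU activation.
Residual connections and layer normalization are used inside the tokenized MLP block.
Skip connections between encoder and decoder mirror UNet, and the decoder uses tokenized MLP blocks followed by convolutional blocks.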
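
The shifted tokenized MLP idea described above can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' implementation: the names `axial_shift` and `TokenizedMLPBlock`, the default `shift_size=5`, the hidden width equal to the input width, and the pre-normalization ordering are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def axial_shift(x, dim, shift_size=5):
    """Shift channel groups by different offsets along one spatial axis.

    x has shape (B, C, H, W); dim is 2 (height) or 3 (width).
    Zero padding means shifted-in positions are empty rather than wrapped.
    """
    pad = shift_size // 2
    _, _, h, w = x.shape
    padded = F.pad(x, (pad, pad, pad, pad))
    groups = torch.chunk(padded, shift_size, dim=1)
    offsets = range(-pad, pad + 1)
    shifted = [torch.roll(g, o, dims=dim) for g, o in zip(groups, offsets)]
    out = torch.cat(shifted, dim=1)
    # Crop back to the original spatial size.
    return out[:, :, pad:pad + h, pad:pad + w]


class TokenizedMLPBlock(nn.Module):
    """Hypothetical Tok-MLP block: shift along W, MLP, depthwise conv + GELU,
    shift along H, MLP, then a residual connection."""

    def __init__(self, channels, shift_size=5):
        super().__init__()
        self.shift_size = shift_size
        self.norm = nn.LayerNorm(channels)
        self.fc1 = nn.Linear(channels, channels)
        # groups=channels makes this a depthwise convolution.
        self.dwconv = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(channels, channels)

    def _shift_tokens(self, tokens, h, w, dim):
        # Tokens (B, N, C) -> feature map (B, C, H, W) -> shift -> tokens.
        b, _, c = tokens.shape
        fmap = tokens.transpose(1, 2).reshape(b, c, h, w)
        fmap = axial_shift(fmap, dim, self.shift_size)
        return fmap.flatten(2).transpose(1, 2)

    def forward(self, tokens, h, w):
        b, _, c = tokens.shape
        y = self.norm(tokens)
        y = self.fc1(self._shift_tokens(y, h, w, dim=3))  # shift across width
        fmap = y.transpose(1, 2).reshape(b, c, h, w)
        y = self.act(self.dwconv(fmap)).flatten(2).transpose(1, 2)
        y = self.fc2(self._shift_tokens(y, h, w, dim=2))  # shift across height
        return tokens + y  # residual connection


block = TokenizedMLPBlock(channels=32)
tokens = torch.randn(2, 16 * 16, 32)  # a 16x16 feature map flattened to tokens
out = block(tokens, h=16, w=16)
print(out.shape)  # torch.Size([2, 256, 32])
```

Shifting each channel group by a different offset lets a per-token linear layer mix information from neighboring positions along one axis, which is how a plain MLP can pick up local dependencies without attention.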

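Experimental Results

Research Questions

RQ1: Can a convolutional stem followed by tokenized MLPs in the latent space reduce parameters and computation while preserving segmentation accuracy?
RQ2: Do the axial shifts in the Tok-MLP block provide enough locality for competitive medical image segmentation?
RQ3: How does UNeXt compare with UNet, UNet++, ResUNet, MedT, and TransUNet in accuracy, parameters, FLOPs, and CPU inference time?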

Key Findings

Networks	Params (M)	CPU Inference Time (ms)	GFLOPs	ISIC F1	ISIC IoU	BUSI F1	BUSI IoU
UNet	31.13	223	55.84	84.03 ± 0.87	74.55 ± 0.96	76.35 ± 0.89	63.85 ± 1.12
UNet++	9.16	173	34.65	84.96 ± 0.71	75.12 ± 0.65	77.54 ± 0.74	64.33 ± 0.75
ResUNet	62.74	333	94.56	85.60 ± 0.68	75.62 ± 1.11	78.25 ± 0.74	64.89 ± 0.83
MedT	1.60	751	21.24	87.35 ± 0.18	79.54 ± 0.26	76.93 ± 0.11	63.89 ± 0.55
TransUNet	105.32	246	38.52	88.91 ± 0.63	80.51 ± 0.72	79.30 ± 0.37	66.92 ± 0.75
UNeXt	1.47	25	0.57	89.70 ± 0.96	81.70 ± 1.53	79.37 ± 0.57	66.95 ± 1.22
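
For readers less familiar with the F1 and IoU columns, the sketch below shows how the two scores are commonly computed for a binary segmentation mask from true positives, false positives, and false negatives. It assumes flattened 0/1 masks and a hypothetical helper name `f1_and_iou`; the exact evaluation protocol behind the table (thresholding, averaging over images or folds) is not described in this review.

```python
def f1_and_iou(pred, target):
    """Return (F1, IoU) for two equally long sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    f1_denom = 2 * tp + fp + fn
    iou_denom = tp + fp + fn
    # Convention chosen here: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_denom if f1_denom else 1.0
    iou = tp / iou_denom if iou_denom else 1.0
    return f1, iou


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
f1, iou = f1_and_iou(pred, target)
print(f"F1={f1:.3f} IoU={iou:.3f}")  # F1=0.667 IoU=0.500
```

For a single mask the two scores are linked by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU, which matches the pattern in every row of the table.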

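UNeXt achieves competitive or better F1 and IoU scores on the ISIC and BUSI datasets.
UNeXt uses 1.47M parameters and 0.57 GFLOPs, far fewer than TransUNet (105.32M parameters, 38.52 GFLOPs).
UNeXt reaches 89.70 F1 and 81.70 IoU on ISIC and 79.37 F1 and 66.95 IoU on BUSI, with a CPU inference time of 25 ms.
Among the baselines compared, UNeXt offers the best accuracy-efficiency trade-off and needs less computation and fewer parameters than the attention-based models.
Ablation studies indicate that shifted Tok-MLP blocks combined with the convolutional and MLP stages give the best performance with minimal added complexity.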
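
The reduction factors quoted in the abstract (72x parameters, 68x computation, 10x inference speed) can be checked against the TransUNet and UNeXt rows of the table. The short script below is only that arithmetic; the dictionary keys are names chosen for this example.

```python
# Values copied from the TransUNet and UNeXt rows of the results table.
transunet = {"params_m": 105.32, "gflops": 38.52, "cpu_ms": 246}
unext = {"params_m": 1.47, "gflops": 0.57, "cpu_ms": 25}

# Ratio of TransUNet cost to UNeXt cost for each measure.
ratios = {key: transunet[key] / unext[key] for key in unext}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.1f}x")
# params_m: 71.6x
# gflops: 67.6x
# cpu_ms: 9.8x
```

Rounded to whole numbers, these ratios agree with the 72x, 68x, and 10x figures in the abstract, which suggests those figures are measured against TransUNet.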


This review was generated by AI and checked by human editors.