[Paper Review] UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation
Introduces UltraLight VM-UNet, a highly parameter-efficient skin lesion segmentation model built on a Parallel Vision Mamba layer (PVM Layer), which uses only 0.049M parameters while remaining competitive on public datasets.
Traditionally, most approaches improve segmentation performance by adding more complex modules. This is ill-suited to the medical field, especially mobile medical devices, where computationally heavy models cannot run in real clinical environments because of resource constraints. Recently, state-space models (SSMs), represented by Mamba, have become strong competitors to traditional CNNs and Transformers. In this paper, we explore in depth the key factors that drive parameter count in Mamba and, based on this analysis, propose an UltraLight Vision Mamba UNet (UltraLight VM-UNet). Specifically, we propose a method for processing features with Vision Mamba in parallel, named the PVM Layer, which achieves excellent performance with the lowest computational load while keeping the overall number of processed channels constant. We ran comparison and ablation experiments against several state-of-the-art lightweight models on three public skin lesion datasets and show that UltraLight VM-UNet remains equally competitive with only 0.049M parameters and 0.060 GFLOPs. In addition, the study's analysis of the key factors behind Mamba's parameter count lays a theoretical foundation for Mamba to possibly become a new mainstream module for lightweight models. The code is available at https://github.com/wurenkai/UltraLight-VM-UNet .
Motivation & Objective
- Motivate lightweight medical image segmentation for mobile/clinical use where computational resources are limited.
- Investigate how parameter reduction in Mamba affects performance in vision tasks.
- Develop a parallel processing strategy (PVM Layer) to control parameter growth while preserving accuracy.
- Demonstrate the effectiveness of UltraLight VM-UNet on ISIC 2017, ISIC 2018, and PH2 datasets.
Proposed method
- Propose the Parallel Vision Mamba Layer (PVM Layer), which splits the input channels into four equal parts that are processed in parallel by VSS Blocks.
- Use a Vision Mamba-based core (SS2D, S4D, and related projections) with careful channel-count control to minimize parameters.
- Integrate a U-Net-style encoder-decoder whose skip connections use Channel and Spatial Attention Bridges for multiscale feature fusion.
- Conduct ablations to analyze how channel count and parallel VSS Blocks affect parameters and performance.
- Evaluate on three public skin lesion datasets with standard augmentation and BCE-Dice loss, reporting DSC, SE, SP, and ACC.
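The split-process-concatenate idea behind the PVM Layer can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_branch` is a hypothetical stand-in for a VSS Block, features are plain Python lists rather than tensors, and any normalization, residual scaling, or output projection the real layer may use is omitted. The only point shown is that each branch sees a quarter of the channels while the total channel count stays constant.

```python
from typing import Callable, List

Channel = List[float]      # one flattened feature channel
Features = List[Channel]   # a feature map stored as a list of channels


def split_channels(x: Features, parts: int = 4) -> List[Features]:
    """Split the channel list into `parts` equal, contiguous groups."""
    if parts <= 0 or len(x) % parts != 0:
        raise ValueError("channel count must be divisible by the number of parts")
    size = len(x) // parts
    return [x[i * size:(i + 1) * size] for i in range(parts)]


def toy_branch(group: Features) -> Features:
    """Hypothetical stand-in for a VSS Block (an assumption, not the real block).

    Adds the group's mean channel to every channel in the group, so
    information is mixed only among channels of the same group.
    """
    n = len(group)
    mean = [sum(vals) / n for vals in zip(*group)]
    return [[v + m for v, m in zip(channel, mean)] for channel in group]


def pvm_like_layer(
    x: Features,
    branch: Callable[[Features], Features] = toy_branch,
    parts: int = 4,
) -> Features:
    """Run `branch` on each channel group independently, then concatenate."""
    out: Features = []
    for group in split_channels(x, parts):
        out.extend(branch(group))
    return out


# Eight channels with two values each; the output keeps all eight channels.
x = [[float(c)] * 2 for c in range(8)]
y = pvm_like_layer(x)
print(len(y), y[0], y[7])
```

Because each branch only receives a quarter of the channels, any component whose parameter count grows faster than linearly with channel width becomes much cheaper per branch.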
Experimental results
Research questions
- RQ1: How does reducing the number of input channels in Mamba components affect parameter count and performance?
- RQ2: Can a parallel processing scheme (PVM Layer) maintain or improve segmentation performance while drastically reducing parameters?
- RQ3: What is the trade-off between parameter reduction and segmentation accuracy across the ISIC 2017, ISIC 2018, and PH2 datasets?
- RQ4: Do skip-connection fusion modules (CAB/SAB) contribute meaningfully to performance in the ultra-light setting?
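As rough intuition for RQ1, the arithmetic below uses a toy parameter model in which a block costs `quad * C**2 + lin * C` parameters for `C` channels. The coefficients `quad` and `lin` are hypothetical; the real count for a Mamba block depends on details such as state dimension and expansion factor that this summary does not give, so the sketch does not attempt to reproduce the paper's reported figures. It also shows both the shared-weight and independent-weight cases, since the summary does not state which applies.

```python
def toy_param_count(channels: int, quad: float = 1.0, lin: float = 0.0) -> float:
    """Toy model (an assumption): parameters = quad * C^2 + lin * C."""
    return quad * channels ** 2 + lin * channels


def reduction(before: float, after: float) -> float:
    """Fractional reduction in parameters, e.g. 0.75 means 75% fewer."""
    return 1.0 - after / before


full = toy_param_count(1024)    # one block over all 1024 channels
quarter = toy_param_count(256)  # one block over a quarter of the channels

# One quarter-width block, e.g. if its weights are reused by every branch.
shared = reduction(full, quarter)

# Four quarter-width blocks with independent weights.
independent = reduction(full, 4 * quarter)

print(f"shared: {shared:.4f}, independent: {independent:.4f}")
```

Under a purely quadratic model, quartering the channel width cuts a single block to one sixteenth of its original size, which is why channel count is such a strong lever on parameter count.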
Key findings
- UltraLight VM-UNet achieves 0.049M parameters and 0.060 GFLOPs while remaining competitive on three skin lesion datasets.
- The proposed PVM Layer reduces parameters by up to 93.1% in the VSS Block pathway by distributing processing across four parallel blocks with quarter-channel inputs.
- UltraLight VM-UNet attains reported DSCs of roughly 0.909–0.926 across ISIC 2017, ISIC 2018, and PH2, with high ACC and robust SE/SP.
- Ablation shows that replacing the PVM Layer with standard convolutions increases parameters and degrades performance, supporting the importance of the parallel Vision Mamba design.
- Compared to VM-UNet and LightM-UNet, UltraLight VM-UNet achieves 99.82% and 87.84% reductions in parameters, respectively, while preserving competitive performance.
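For readers unfamiliar with the reported metrics, the sketch below computes DSC, SE, SP, and ACC from flattened binary masks using their standard confusion-matrix definitions. The function name `segmentation_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration; the paper's evaluation code may handle thresholds and edge cases differently.

```python
from typing import Dict, Sequence


def segmentation_metrics(pred: Sequence[int], target: Sequence[int]) -> Dict[str, float]:
    """Compute DSC, SE, SP, and ACC for flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "DSC": ratio(2 * tp, 2 * tp + fp + fn),      # Dice similarity coefficient
        "SE": ratio(tp, tp + fn),                    # sensitivity (recall)
        "SP": ratio(tn, tn + fp),                    # specificity
        "ACC": ratio(tp + tn, tp + tn + fp + fn),    # pixel accuracy
    }


# Toy masks: 2 true positives, 2 true negatives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(segmentation_metrics(pred, target))
```

DSC ignores true negatives, which makes it more informative than ACC when lesions occupy only a small part of the image.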
This review was created by AI and reviewed by human editors.