QUICK REVIEW

[Paper Review] An automatic COVID-19 CT segmentation based on U-Net with attention mechanism

Tongxue Zhou, Stéphane Canu | arXiv (Cornell University) | Apr 14, 2020

COVID-19 diagnosis using AI | 54 citations

TL;DR

This paper proposes an attention-augmented U-Net trained with a focal Tversky loss for automatic segmentation of COVID-19 lung lesions in CT scans. Spatial and channel attention re-weight the feature representations, and the loss is chosen to handle small lesions. On a dataset of 473 CT slices, the method reports a Dice score of 83.1%, a sensitivity of 86.7%, and a specificity of 99.3%, at 0.29 s per slice.

ABSTRACT

The coronavirus disease (COVID-19) pandemic has led to a devastating effect on the global public health. Computed Tomography (CT) is an effective tool in the screening of COVID-19. It is of great importance to rapidly and accurately segment COVID-19 from CT to help diagnostic and patient monitoring. In this paper, we propose a U-Net based segmentation network using attention mechanism. As not all the features extracted from the encoders are useful for segmentation, we propose to incorporate an attention mechanism including a spatial and a channel attention, to a U-Net architecture to re-weight the feature representation spatially and channel-wise to capture rich contextual relationships for better feature representation. In addition, the focal tversky loss is introduced to deal with small lesion segmentation. The experiment results, evaluated on a COVID-19 CT segmentation dataset where 473 CT slices are available, demonstrate the proposed method can achieve an accurate and rapid segmentation on COVID-19 segmentation. The method takes only 0.29 second to segment a single CT slice. The obtained Dice Score, Sensitivity and Specificity are 83.1%, 86.7% and 99.3%, respectively.
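
The re-weighting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' module: the learned layers that a real attention block would contain are omitted, and the function names `channel_attention` and `spatial_attention`, the sigmoid gating, and the toy feature map are all assumptions made for illustration. The sketch only shows what scaling features channel-wise and then spatially means.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(features):
    """Scale each channel by a gate computed from its global average.

    `features` is a list of channels, each a 2-D list (H x W) of floats.
    Assumption: a real module would pass the pooled values through
    learned layers before the sigmoid; that step is omitted here.
    """
    out = []
    for channel in features:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))  # one weight per channel
        out.append([[v * gate for v in row] for row in channel])
    return out

def spatial_attention(features):
    """Scale each pixel by a gate computed from its mean across channels."""
    n_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    gates = [
        [sigmoid(sum(features[c][i][j] for c in range(n_channels)) / n_channels)
         for j in range(width)]
        for i in range(height)
    ]  # one weight per spatial location
    return [
        [[features[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_channels)
    ]

# Hypothetical toy feature map: 2 channels, 2 x 2 pixels.
feats = [
    [[2.0, 0.0], [0.0, 2.0]],
    [[-2.0, 0.0], [0.0, -2.0]],
]
reweighted = spatial_attention(channel_attention(feats))
```

Because every gate lies between 0 and 1, the output keeps the input's shape and only scales activations down; a trained network would learn which channels and locations to preserve.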

Motivation & Objective

Address the challenge of accurate and rapid segmentation of COVID-19 lung lesions in CT scans to support diagnosis and patient monitoring.
Improve feature representation in U-Net by selectively emphasizing relevant spatial and channel-wise features using attention mechanisms.
Overcome the class imbalance issue in small lesion segmentation by employing a focal Tversky loss function.
Achieve high segmentation performance with minimal inference time, suitable for clinical deployment.

Proposed method

Integrate a dual attention mechanism, combining spatial and channel attention, into the U-Net encoder-decoder architecture to re-weight feature maps based on contextual importance.
Apply spatial attention to emphasize informative spatial regions and channel attention to highlight discriminative feature channels.
Use the focal Tversky loss to focus training on hard-to-segment regions, especially small lesions, by down-weighting easy examples.
Train the network end-to-end on a dataset of 473 CT slices with ground-truth annotations for lung lesions.
Leverage skip connections from encoder to decoder to preserve spatial details during up-sampling.
Optimize the model using stochastic gradient descent with a learning rate schedule to improve convergence.
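
The focal Tversky loss mentioned above can be sketched as follows. This follows the commonly used formulation, (1 - TI)^(1/gamma) with TI = TP / (TP + alpha·FN + beta·FP); the values alpha = 0.7, beta = 0.3, and gamma = 4/3 are typical defaults and are assumptions here, since this review does not state the settings used in the paper. Inputs are flat lists of per-pixel lesion probabilities and binary labels.

```python
def tversky_index(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Soft Tversky index for one class.

    alpha weights false negatives and beta weights false positives;
    the default values are assumptions, not taken from the paper.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=4 / 3):
    """Focal Tversky loss in the (1 - TI) ** (1 / gamma) form."""
    ti = tversky_index(pred, target, alpha, beta)
    # Clamp at zero so rounding error cannot produce a negative base.
    return max(0.0, 1.0 - ti) ** (1.0 / gamma)

# Hypothetical 8-pixel slice with a 2-pixel lesion.
target = [1, 1, 0, 0, 0, 0, 0, 0]
perfect = [1, 1, 0, 0, 0, 0, 0, 0]
missed_pixel = [1, 0, 0, 0, 0, 0, 0, 0]   # one false negative
extra_pixel = [1, 1, 1, 0, 0, 0, 0, 0]    # one false positive

loss_perfect = focal_tversky_loss(perfect, target)
loss_missed = focal_tversky_loss(missed_pixel, target)
loss_extra = focal_tversky_loss(extra_pixel, target)
```

With alpha larger than beta, missing a lesion pixel costs more than adding a spurious one, which is the property that makes this family of losses attractive when lesions occupy only a small fraction of the image.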

Experimental results

Research questions

RQ1: Can attention mechanisms improve feature representation in U-Net for more accurate COVID-19 lesion segmentation in CT scans?
RQ2: Does the focal Tversky loss enhance segmentation performance on small and sparse lung lesions compared to standard loss functions?
RQ3: Can the proposed method achieve both high accuracy and low inference time for real-time clinical use?
RQ4: How does the combination of attention and focal Tversky loss compare to standard U-Net in terms of Dice score, sensitivity, and specificity?

Key findings

The proposed method achieved a Dice score of 83.1% on the test set, indicating strong overlap between predicted and ground-truth lesions.
Sensitivity reached 86.7%, showing the model effectively detects most actual lesions despite their small size.
Specificity was 99.3%, indicating very few false positives, which is crucial for clinical reliability.
The model segmented a single CT slice in just 0.29 seconds, demonstrating high inference speed suitable for clinical deployment.
The integration of spatial and channel attention improved feature representation by focusing on relevant regions and channels.
The focal Tversky loss significantly improved performance on small lesions by reducing the impact of easy examples during training.
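
For readers less familiar with the three reported metrics, the sketch below shows how Dice, sensitivity, and specificity are computed from a binary prediction and a binary ground-truth mask. The masks are hypothetical toy data, not results from the paper, and the helper names are assumptions. The functions assume each mask contains at least one lesion pixel and one background pixel, so no division by zero occurs.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn

def dice_score(tp, fp, fn):
    """Overlap between prediction and ground truth: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

def sensitivity(tp, fn):
    """Fraction of true lesion pixels that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of background pixels correctly left unmarked: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical 10-pixel example: 3 lesion pixels, 7 background pixels.
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(pred, truth)
dice = dice_score(tp, fp, fn)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

Because background pixels usually far outnumber lesion pixels in a CT slice, specificity tends to be high even for mediocre segmentations, so the Dice score and sensitivity are generally the more informative of the three for small lesions. As a simple arithmetic note on speed, at the reported 0.29 s per slice a hypothetical 100-slice volume would take about 29 s.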

This review was created by AI and reviewed by human editors.