QUICK REVIEW

[Paper Review] Vision Transformers in Medical Imaging: A Review

Emerald U. Henry, Onyeka Emebob | arXiv (Cornell University) | Nov 18, 2022
Brain Tumor Detection and Classification | 34 citations
TL;DR

A comprehensive review of how vision transformers are applied in medical imaging, comparing transformer-based methods with CNNs across classification, segmentation, registration, and reconstruction.

ABSTRACT

The Transformer, a model built on an attention-based encoder-decoder architecture, has gained prevalence in natural language processing (NLP) and has recently influenced computer vision (CV). The similarities between computer vision and medical imaging have raised a question among researchers: can the impact of transformers on computer vision be translated to medical imaging? In this paper, we attempt to provide a comprehensive and recent review of the application of transformers in medical imaging by: describing the transformer model and comparing it with a diversity of convolutional neural networks (CNNs); detailing the transformer-based approaches for medical image classification, segmentation, registration, and reconstruction, with a focus on the image modality; and comparing the performance of state-of-the-art transformer architectures with the best-performing CNNs on standard medical datasets.

Motivation & Objective

  • Assess how transformer models are adapted for medical image analysis.
  • Compare transformer-based approaches with CNN baselines on standard datasets.
  • Detail transformer applications across classification, segmentation, registration, and reconstruction.
  • Highlight image modalities and datasets used for evaluation.

Proposed method

  • Describe transformer architectures and contrast with CNNs in medical imaging contexts.
  • Summarize transformer-based methods for classification, segmentation, registration, and reconstruction.
  • Review image modalities (e.g., MRI, CT) and standard medical datasets used for benchmarking.
  • Compare the performance of state-of-the-art transformers with the best-performing CNNs on standard datasets.
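
To make the first point above concrete, here is a minimal sketch of how a vision transformer typically treats an image: it splits the image into patches, then lets every patch attend to every other patch through scaled dot-product self-attention. This is not code from the reviewed paper. The function names (`patchify`, `self_attention`), the 64x64 random "image", the 16-pixel patch size, the 32-dimensional projection, and the random weights are all illustrative assumptions, and positional embeddings, multiple heads, and training are omitted.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W) image into flattened, non-overlapping square patches."""
    h, w = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, p, cols, p) -> (rows, cols, p, p) -> (rows * cols, p * p)
    patches = image.reshape(rows, patch_size, cols, patch_size).swapaxes(1, 2)
    return patches.reshape(rows * cols, patch_size * patch_size)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (N, N) patch-to-patch similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
image = rng.random((64, 64))     # hypothetical stand-in for one grayscale slice
tokens = patchify(image, 16)     # 16 patches, 256 pixel values each
d_model = 32                     # assumed projection width
w_q, w_k, w_v = (
    rng.normal(scale=0.05, size=(tokens.shape[1], d_model)) for _ in range(3)
)
out, weights = self_attention(tokens, w_q, w_k, w_v)
print(tokens.shape, out.shape, weights.shape)  # (16, 256) (16, 32) (16, 16)
```

Each row of `weights` describes how strongly one patch attends to all 16 patches, so a single layer can mix information across the whole image. A convolution layer, by contrast, only combines pixels within its local kernel window, which is one way to frame the transformer-versus-CNN contrast that the review discusses.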

Experimental results

Research questions

  • RQ1: How do vision transformers perform relative to CNNs on standard medical imaging tasks?
  • RQ2: Which medical imaging modalities and datasets are most commonly used to evaluate vision transformers?
  • RQ3: What are the strengths and limitations of transformer-based approaches for classification, segmentation, registration, and reconstruction in medical imaging?

Key findings

  • Transformers are evaluated across classification, segmentation, registration, and reconstruction in medical imaging.
  • State-of-the-art transformer architectures are compared to CNNs on standard datasets.
  • The review highlights modalities and datasets used for benchmarking transformer performance in medical imaging.


This review was created by AI and reviewed by human editors.