[Paper Review] COVID-19: A Survey on Public Medical Imaging Data Resources
This survey compiles and categorizes publicly available medical imaging datasets for COVID-19, including CT, X-ray, and ultrasound scans with associated metadata. It supports AI and machine learning research by aggregating resources to accelerate diagnosis and model development, with a focus on improving early detection and classification accuracy using open data sources.
This regularly updated survey provides an overview of public resources that offer medical images and metadata of COVID-19 cases. Its purpose is to simplify access to open COVID-19 image data resources for all scientists currently working on the coronavirus crisis.
Motivation & Objective
- To consolidate and document publicly accessible medical imaging datasets for COVID-19 cases to support AI-driven research.
- To address the scarcity of large, open, and well-annotated medical image data for training and validating machine learning models.
- To facilitate rapid access to imaging resources for researchers working on early detection and classification of COVID-19.
- To encourage data sharing and transparency by promoting open access to imaging data under ethical and privacy-compliant conditions.
- To serve as a regularly updated reference for scientists seeking reliable, publicly available imaging data during the pandemic.
Proposed method
- Systematic collection and classification of publicly available medical imaging datasets related to COVID-19 from diverse institutions and platforms.
- Categorization of datasets based on imaging modality (CT, X-ray, MRI, ultrasound), presence of metadata, and case review availability.
- Use of standardized labels (Y = yes, N = no, U = unknown) to indicate data availability across 13 key datasets.
- Inclusion of resources from major institutions such as the Allen Institute for AI, University of California San Diego, and European radiology societies.
- Regular updates to the survey to reflect new data releases and community contributions.
- Encouragement of community submissions and error reporting to maintain data accuracy and completeness.
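The Y/N/U labelling scheme described above can be sketched as a small catalog structure. This is only an illustrative sketch, not the survey's own tooling: the `Availability` enum, the `Resource` fields, the `count_available` helper, and the three placeholder rows are all assumptions made for the example, and none of the rows correspond to actual entries in the survey's table.

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    """Label scheme from the survey: Y = yes, N = no, U = unknown."""
    YES = "Y"
    NO = "N"
    UNKNOWN = "U"


@dataclass(frozen=True)
class Resource:
    """One catalog row (field names are hypothetical)."""
    name: str
    ct: Availability
    xray: Availability
    metadata: Availability
    case_review: Availability


def count_available(resources, field):
    """Count resources whose given field is explicitly marked Y."""
    return sum(1 for r in resources if getattr(r, field) is Availability.YES)


# Hypothetical placeholder rows, NOT entries from the survey's table.
Y, N, U = Availability.YES, Availability.NO, Availability.UNKNOWN
catalog = [
    Resource("example-resource-a", ct=Y, xray=Y, metadata=Y, case_review=N),
    Resource("example-resource-b", ct=Y, xray=N, metadata=U, case_review=N),
    Resource("example-resource-c", ct=N, xray=Y, metadata=Y, case_review=Y),
]

for field in ("ct", "xray", "metadata", "case_review"):
    print(f"{field}: {count_available(catalog, field)} of {len(catalog)}")
```

Keeping U distinct from N means a resource whose status is unknown is not counted as lacking the data, which matters when summarizing availability across resources.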
Experimental results
Research questions
- RQ1: What publicly available medical imaging datasets for COVID-19 are currently accessible to researchers?
- RQ2: Which imaging modalities (e.g., CT, X-ray, ultrasound) are most represented in existing public datasets?
- RQ3: How comprehensive are the metadata and case-level annotations across different public data resources?
- RQ4: What is the current state of open data availability for AI and machine learning applications in COVID-19 diagnosis?
- RQ5: How can researchers efficiently discover and access reliable, open-source medical imaging data for pandemic-related research?
Key findings
- The survey identifies 13 public medical imaging resources for COVID-19, including datasets from institutions such as the Allen Institute for AI, University of California San Diego, and European radiology platforms.
- CT and X-ray imaging are the most commonly available modalities, with 10 out of 13 datasets providing CT or X-ray scans.
- Metadata is available in 11 out of 13 datasets, indicating strong support for data provenance and case-level information.
- Case reviews are present in 5 out of 13 datasets, suggesting limited but growing availability of clinical annotations.
- The SIRM COVID-19 Database and eurorad.org are among the largest repositories, offering CT, X-ray, and metadata with case-level reviews.
- The survey highlights the importance of open data access and encourages researchers to contribute and cite the resource to improve data availability and research reproducibility.
This review was created by AI and reviewed by human editors.