Summary
Background
The COVID-19 pandemic is a global healthcare emergency. Prediction models for COVID-19 imaging are being developed rapidly to support medical decision making. However, the limited availability of diverse annotated datasets has constrained the performance and generalizability of existing models.
Purpose
To create the first multi-institutional, multinational, expert-annotated COVID-19 imaging dataset made freely available to the machine learning community as a research and educational resource for COVID-19 chest imaging. The Radiological Society of North America (RSNA) assembled the RSNA International COVID-19 Open Radiology Database (RICORD), a collection of COVID-related imaging datasets and expert annotations, to support research and education. RICORD data will be incorporated into the Medical Imaging and Data Resource Center (MIDRC), a multi-institutional research data repository funded by the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health.
Materials and Methods
This dataset was a collaboration between the RSNA and Society of Thoracic Radiology (STR).
Results
The RSNA International COVID-19 Open Annotated Radiology Database (RICORD) release 1b consists of 120 thoracic computed tomography (CT) scans of COVID-negative patients from four international sites.
Patient Selection: Patients at least 18 years of age who received a negative diagnosis for COVID-19.
Data Abstract
120 de-identified thoracic CT scans from COVID-negative patients.
Supporting clinical variables: MRN*, Age, Exam Date/Time*, Exam Description, Sex, Study UID*, Image Count, Modality, Symptomatic, Testing Result, Specimen Source (* pseudonymous values).
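The variable list above can be turned into a quick header check for the clinical CSV. The following is a minimal sketch, assuming the CSV header uses exactly the names listed here (the real file may spell or capitalize them differently). The helper names `load_clinical_rows` and `count_by` are hypothetical, and the two sample rows are made-up placeholders, not values from the dataset.

```python
import csv
import io

# Column names as listed in the data abstract; the real CSV header may differ
# in spelling or case (assumption).
EXPECTED_COLUMNS = [
    "MRN", "Age", "Exam Date/Time", "Exam Description", "Sex", "Study UID",
    "Image Count", "Modality", "Symptomatic", "Testing Result", "Specimen Source",
]


def load_clinical_rows(handle):
    """Read clinical rows and report any expected columns that are absent."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    return list(reader), missing


def count_by(rows, column):
    """Tally how often each value of `column` occurs."""
    counts = {}
    for row in rows:
        value = (row.get(column) or "").strip()
        counts[value] = counts.get(value, 0) + 1
    return counts


# Two made-up placeholder rows, only to show the call pattern.
SAMPLE_CSV = (
    ",".join(EXPECTED_COLUMNS) + "\n"
    "pseudo-001,54,pseudo-date,CT CHEST,F,pseudo-uid-1,180,CT,Yes,Negative,Nasopharyngeal\n"
    "pseudo-002,61,pseudo-date,CT CHEST,M,pseudo-uid-2,210,CT,No,Negative,Nasopharyngeal\n"
)

rows, missing = load_clinical_rows(io.StringIO(SAMPLE_CSV))
print(missing)
print(count_by(rows, "Sex"))
```

For the downloaded file, the same function could be called with a handle from `open(path, newline="")`, where `path` is wherever the clinical CSV was saved. Because MRN, Exam Date/Time, and Study UID are pseudonymous, they are suitable for linking records within the collection but not for recovering real identifiers or dates.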
Research Benefits
As this is a public dataset, RICORD is available for non-commercial use (and further enrichment) by the research and education communities. Possible uses include the development of educational resources for COVID-19, the creation of AI systems for diagnosis and quantification, benchmarking the performance of existing solutions, exploration of distributed/federated learning, further annotation or data augmentation efforts, and evaluation of the examinations for disease entities beyond COVID-19 pneumonia. Deliberate consideration of the detailed annotation schema, demographics, and other included metadata will be critical when generating cohorts with RICORD, particularly as more public COVID-19 imaging datasets are made available via complementary and parallel efforts. It is important to emphasize that there are limitations to the clinical “ground truth”: SARS-CoV-2 RT-PCR tests have widely documented limitations and are subject to both false-negative and false-positive results. These errors affect the distribution of the included imaging data and may have led to an unknown epidemiologic distortion of patients based on the inclusion criteria. These limitations notwithstanding, RICORD has achieved the stated objectives for data complexity, heterogeneity, and high-quality expert annotations as a comprehensive COVID-19 thoracic imaging data resource.
Acknowledgements
We would like to acknowledge the individuals and institutions that have provided data for this collection. Data in RICORD will be made available through the Medical Imaging and Data Resource Center (MIDRC), funded through a contract with the National Institute of Biomedical Imaging and Bioengineering (NIBIB).
Data Access
Data Type | Download all or Query/Filter | License |
---|---|---|
Images (DICOM, 8 GB) | (Download requires the NBIA Data Retriever) | |
Clinical data (.csv, 25 kB) | | CC BY-NC 4.0 |
Click the Versions tab for more info about data releases.
Please contact help@cancerimagingarchive.net with any questions regarding usage.
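After a download completes, it can be useful to compare what landed on disk with the counts in the Image Statistics table (120 series, 21220 images). The following is a minimal sketch using only the Python standard library. It assumes the downloaded files carry a `.dcm` extension and that each series sits in its own directory; neither is confirmed by this page, and the function name `count_dicom_files` and the directory names in the demonstration are hypothetical.

```python
import os
import tempfile
from collections import Counter


def count_dicom_files(root):
    """Count files ending in .dcm under root, keyed by containing directory."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for name in filenames if name.lower().endswith(".dcm"))
        if n:
            counts[os.path.relpath(dirpath, root)] = n
    return counts


# Demonstration on a throwaway tree of empty placeholder files
# (hypothetical directory names, not the real download layout).
with tempfile.TemporaryDirectory() as root:
    layout = (("patient-a/study-1/series-1", 3), ("patient-b/study-1/series-1", 2))
    for series, n_files in layout:
        series_dir = os.path.join(root, *series.split("/"))
        os.makedirs(series_dir)
        for i in range(n_files):
            open(os.path.join(series_dir, f"{i:04d}.dcm"), "wb").close()
    demo_counts = count_dicom_files(root)

print(len(demo_counts), sum(demo_counts.values()))
```

Pointed at the real download directory, `len(counts)` would be compared against the number of series and `sum(counts.values())` against the number of images. A mismatch suggests an incomplete download or a layout that differs from the assumptions above.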
Additional Resources for this Dataset
The NCI Cancer Research Data Commons (CRDC) provides access to additional data and a cloud-based data science infrastructure that connects data sets with analytics tools to allow users to share, integrate, analyze, and visualize cancer research data.
- Imaging Data Commons (IDC) (Imaging Data)
Detailed Description
Image Statistics | |
---|---|
Modalities | CT |
Number of Patients | 117 |
Number of Studies | 120 |
Number of Series | 120 |
Number of Images | 21220 |
Image Size (GB) | 8 |
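A few derived figures follow directly from the table above. The short calculation below is a sketch that uses only the table values. The per-image size assumes 1 GB = 1000 MB and treats the rounded 8 GB figure as exact, so it is an approximation.

```python
# Figures copied from the Image Statistics table.
n_patients = 117
n_studies = 120
n_series = 120
n_images = 21220
size_gb = 8  # rounded value from the table

images_per_series = n_images / n_series
# Assumes 1 GB = 1000 MB; the table does not say which convention it uses.
mb_per_image = size_gb * 1000 / n_images
extra_studies = n_studies - n_patients

print(f"{images_per_series:.1f} images per series on average")
print(f"{mb_per_image:.2f} MB per image on average")
print(f"{extra_studies} studies beyond one per patient")
```

Because there are 120 studies but only 117 patients, at least one patient contributes more than one study. When building training and test cohorts, grouping by patient rather than by study keeps scans from the same person out of both sets.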
Citations & Data Usage Policy
Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution should include references to the following citations:
Data Citation
Tsai, E. B., Simpson, S., Lungren, M. P., Hershman, M., Roshkovan, L., Colak, E., Erickson, B. J., Shih, G., Stein, A., Kalpathy-Cramer, J., Shen, J., Hafez, M. A. F., John, S., Rajiah, P., Pogatchnik, B. P., Mongan, J. T., Altinmakas, E., Ranschaert, E., Kitamura, F. C., … Wu, C. (2021). Medical Imaging Data Resource Center (MIDRC) - RSNA International COVID-19 Open Radiology Database (RICORD) Release 1b - Chest CT Covid- (MIDRC-RICORD-1B) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/31V8-4A40
Publication Citation
Tsai, E. B., Simpson, S., Lungren, M., Hershman, M., Roshkovan, L., Colak, E., Erickson, B. J., Shih, G., Stein, A., Kalpathy-Cramer, J., Shen, J., Hafez, M., John, S., Rajiah, P., Pogatchnik, B. P., Mongan, J., Altinmakas, E., Ranschaert, E. R., Kitamura, F. C., … Wu, C. C. (2021). The RSNA International COVID-19 Open Annotated Radiology Database (RICORD). Radiology, 203957. https://doi.org/10.1148/radiol.2021203957
TCIA Citation
Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: 10.1007/s10278-013-9622-7
Other Publications Using This Data
TCIA maintains a list of publications which leverage TCIA data. If you have a manuscript you'd like to add please contact the TCIA Helpdesk.
Version 1 (Current): Updated 2021/02/05
Data Type | Download all or Query/Filter |
---|---|
Images (DICOM, 8 GB) | (Download requires the NBIA Data Retriever) |
Clinical Data (CSV) |