The Cancer Imaging Archive (TCIA) has procured substantial troves of image data, which could serve as valuable training sets for improving machine learning algorithms. However, these datasets lack consistent lesion annotations. To address this issue, the Cancer Imaging Informatics Lab at the Frederick National Laboratory for Cancer Research (FNLCR) formed a partnership with five groups funded by the National Cancer Institute's Informatics Technology for Cancer Research program to develop a web-based crowdsourcing application for gathering lesion annotations. The application was featured at the annual meeting of the Radiological Society of North America (RSNA).
Crowds Cure Cancer (https://www.crowds-cure.org) was first exhibited at RSNA 2017, using CT scans from 4 different TCIA collections. Participants were asked to make a uni-dimensional measurement of the largest lesion. There were no options to provide details regarding imaging quality (e.g., no IV contrast, motion artifact, etc.), lesion location (e.g., lung, liver, etc.) or lesion characteristics (e.g., ill-defined, ground glass, etc.), so additional post-collection image review was required. The 2017 dataset can be found at https://doi.org/10.7937/K9/TCIA.2018.OW73VLO2.
For RSNA 2018, the application was redesigned to promote more comprehensive data collection and increased community participation. Participants were instructed to identify all metastatic disease and provide details regarding image quality, lesion location and lesion characteristics. To provide additional incentives for participation, we added gamification features (e.g., reward badges) and created a leaderboard to display participant standings. The amount of data to be annotated was also significantly increased to include CT scans from 324 patients spanning 13 TCIA collections: Anti-PD-1_Lung, Anti-PD-1_MELANOMA, CPTAC-CCRCC, CPTAC-GBM, CPTAC-HNSCC, CPTAC-PDA, CPTAC-UCEC, NSCLC Radiogenomics, TCGA-BLCA, TCGA-COAD, TCGA-HNSC, TCGA-LUSC, TCGA-UCEC. During RSNA 2018, 4756 bi-directional measurements were obtained, compared to 2345 uni-dimensional measurements in 2017. Of the 4756 measurements, 65% included lesion location information. The data is being released in DICOM Structured Report and CSV formats for analysis by the community. The application is available on GitHub at https://github.com/crowds-cure/cancer.
Click the Download button to save a ".tcia" manifest file to your computer, which you must open with the NBIA Data Retriever.
|Data Type||Download all or Query/Filter|
|Images annotated by participants (DICOM)|
|Crowd measurements (CSV)|
|Structured reports (DICOM-SR)|
Note: Please contact email@example.com with any questions regarding usage.
Data resulting from this experiment is available in the following formats:
- Source DICOM scans annotated by participants
- CSV representation of crowd measurements
- DICOM-SR representation of crowd measurements
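As a rough illustration of how the CSV representation might be explored, the sketch below counts how many measurement rows carry a lesion location. The column names (`annotator`, `longest_diameter_mm`, `location`) and the sample rows are placeholders invented for this example, not the actual schema or contents of the released file; check the header row of the downloaded CSV and adjust accordingly.

```python
import csv
import io
from collections import Counter

# Hypothetical column name; replace with the one in the real CSV header.
LOCATION_COLUMN = "location"


def summarize_locations(csv_file, location_column=LOCATION_COLUMN):
    """Return (total_rows, rows_with_location, Counter of location values)."""
    reader = csv.DictReader(csv_file)
    total = 0
    counts = Counter()
    for row in reader:
        total += 1
        # Treat a missing or blank cell as "no location provided".
        location = (row.get(location_column) or "").strip()
        if location:
            counts[location] += 1
    return total, sum(counts.values()), counts


# Placeholder rows for illustration only; these are not real measurements.
SAMPLE = """annotator,longest_diameter_mm,location
a1,12.5,Lung
a2,8.0,
a3,20.1,Liver
a4,15.3,Lung
"""

total, with_location, counts = summarize_locations(io.StringIO(SAMPLE))
print(f"{with_location} of {total} rows include a location")
for name, n in counts.most_common():
    print(f"  {name}: {n}")
```

To run the same summary on the downloaded file, open it with `open(path, newline="")` and pass the file object to `summarize_locations` in place of the in-memory sample.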
Citations & Data Usage Policy
These collections are freely available to browse, download, and use for commercial, scientific and educational purposes as outlined in the Creative Commons Attribution 3.0 Unported License. Questions may be directed to firstname.lastname@example.org. Please be sure to acknowledge both this data set and TCIA in publications by including the following citations in your work:
Trinity Urban, Erik Ziegler, Steve Pieper, Justin Kirby, Daniel Rukas, Britney Beardmore, Bhanusupriya Somarouthu, Evren Ozkan, Gustavo Lelis, Brenda Fevrier-Sullivan, Samarth Nandekar, Andrew Beers, Carl Jaffe, John Freymann, David Clunie, Gordon J. Harris, Jayashree Kalpathy-Cramer. Crowds Cure Cancer: Data collected at the RSNA 2018 annual meeting. The Cancer Imaging Archive. doi: 10.7937/TCIA.2019.yk0gm1eb
Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, Volume 26, Number 6, December 2013, pp 1045-1057.