The Cancer Imaging Archive (TCIA) has accumulated large volumes of image data that could serve as valuable training sets for improving machine learning algorithms. However, these datasets lack consistent lesion annotations. To address this issue, the Cancer Imaging Informatics Lab at the Frederick National Laboratory for Cancer Research (FNLCR) partnered with five groups funded by the National Cancer Institute's Informatics Technology for Cancer Research program to develop a web-based crowdsourcing application for gathering lesion annotations, which was featured at the annual meeting of the Radiological Society of North America (RSNA).
Crowds Cure Cancer (https://www.crowds-cure.org) was first exhibited at RSNA 2017, using CT scans from four TCIA collections. Participants were asked to make a uni-dimensional measurement of the largest lesion. There were no options to provide details regarding imaging quality (e.g., no IV contrast, motion artifact), lesion location (e.g., lung, liver), or lesion characteristics (e.g., ill-defined, ground glass), so additional post-collection image review was required. The 2017 dataset can be found at https://doi.org/10.7937/K9/TCIA.2018.OW73VLO2.
For RSNA 2018, the application was redesigned to promote more comprehensive data collection and increased community participation. Participants were instructed to identify all metastatic disease and provide details regarding image quality, lesion location, and lesion characteristics. To provide additional incentives for participation, we added gamification features (e.g., reward badges) and a leaderboard displaying participant standings. The amount of data to be annotated was also increased substantially, to CT scans from 324 patients spanning 13 TCIA collections:
- Anti-PD-1_Lung
- Anti-PD-1_MELANOMA
- The Clinical Proteomic Tumor Analysis Consortium Clear Cell Renal Cell Carcinoma Collection (CPTAC-CCRCC)
- The Clinical Proteomic Tumor Analysis Consortium Glioblastoma Multiforme Collection (CPTAC-GBM)
- The Clinical Proteomic Tumor Analysis Consortium Head and Neck Squamous Cell Carcinoma Collection (CPTAC-HNSCC)
- The Clinical Proteomic Tumor Analysis Consortium Pancreatic Ductal Adenocarcinoma Collection (CPTAC-PDA)
- The Clinical Proteomic Tumor Analysis Consortium Uterine Corpus Endometrial Carcinoma Collection (CPTAC-UCEC)
- NSCLC Radiogenomics
- The Cancer Genome Atlas Urothelial Bladder Carcinoma Collection (TCGA-BLCA)
- The Cancer Genome Atlas Colon Adenocarcinoma Collection (TCGA-COAD)
- The Cancer Genome Atlas Head-Neck Squamous Cell Carcinoma Collection (TCGA-HNSC)
- The Cancer Genome Atlas Lung Squamous Cell Carcinoma Collection (TCGA-LUSC)
- The Cancer Genome Atlas Uterine Corpus Endometrial Carcinoma Collection (TCGA-UCEC)
During RSNA 2018, 4756 bi-directional measurements were obtained, compared to 2345 uni-dimensional measurements in 2017. Of the 4756 measurements, 65% included lesion location information. The data is being released in DICOM Structured Report and CSV formats for analysis by the community. The application is available on GitHub at https://github.com/crowds-cure/cancer.
Some data in this collection contains images that could be used to reconstruct a human face. To safeguard the privacy of participants, users must sign and submit a TCIA Restricted License Agreement to firstname.lastname@example.org before accessing the data.
|Data Type||Download all or Query/Filter||License|
|Crowd measurements (CSV)|
|Structured reports (DICOM-SR)|
Collections Used in this Third Party Analysis
Below is a list of the Collections used in these analyses. Download all manifests for the full dataset:
|Source Data Type||Download||License|
|Limited-access images used for annotation (DICOM)|
|Publicly accessible images used for annotation (DICOM)|
Data resulting from this experiment is available in the following formats:
- Source DICOM scans annotated by participants
- CSV representation of crowd measurements
- DICOM-SR representation of crowd measurements
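As an illustration of working with the CSV representation, the sketch below computes the share of measurements that carry a location label, the same kind of statistic as the 65% figure reported for RSNA 2018. The column names (`annotator`, `series_uid`, `long_axis_mm`, `short_axis_mm`, `location`) and the sample rows are hypothetical and are not taken from the released file; check the actual CSV header before adapting this.

```python
import csv
import io

# Hypothetical sample rows. The released CSV's real column names and
# values may differ; these exist only to make the sketch runnable.
SAMPLE_CSV = """annotator,series_uid,long_axis_mm,short_axis_mm,location
a1,1.2.3,24.0,11.5,lung
a2,1.2.3,22.5,10.0,
a3,4.5.6,31.0,18.2,liver
a4,4.5.6,29.4,17.0,liver
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def location_coverage(rows, field="location"):
    """Return the fraction of rows whose `field` is non-empty."""
    rows = list(rows)
    if not rows:
        return 0.0
    annotated = sum(1 for r in rows if (r.get(field) or "").strip())
    return annotated / len(rows)


rows = load_rows(SAMPLE_CSV)
coverage = location_coverage(rows)
print(f"{coverage:.0%} of measurements carry a location")
```

To run this against the downloaded file, replace `load_rows(SAMPLE_CSV)` with a `csv.DictReader` over the opened file and pass the real location column name as `field`.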
Citations & Data Usage Policy
Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution should include references to the following citations:
Urban, T., Ziegler, E., Pieper, S., Kirby, J., Rukas, D., Beardmore, B., Somarouthu, B., Ozkan, E., Lelis, G., Fevrier-Sullivan, B., Nandekar, S., Beers, A., Jaffe, C., Freymann, J., Clunie, D., Harris, G. J., & Kalpathy-Cramer, J. (2019). Crowds Cure Cancer: Crowdsourced data collected at the RSNA 2018 annual meeting [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2019.yk0gm1eb
Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, 26(6), 1045–1057. https://doi.org/10.1007/s10278-013-9622-7