Jayashree Kalpathy-Cramer, Andrew Beers, Artem Mamonov, Erik Ziegler, Rob Lewis, Andre Botelho Almeida, Gordon Harris, Steve Pieper, David Clunie, Ashish Sharma, Lawrence Tarbox, Jeff Tobler, Fred Prior, Adam Flanders, Jamie Dulkowski, Brenda Fevrier-Sullivan, Carl Jaffe, John Freymann, Justin Kirby. Crowds Cure Cancer: Data collected at the RSNA 2017 annual meeting. The Cancer Imaging Archive. doi: 10.7937/K9/TCIA.2018.OW73VLO2
Many cancers routinely identified by imaging haven’t yet benefited from recent advances in computer science. Approaches such as machine learning and deep learning can generate quantitative 3D tumor volumes, complex features, and temporal dynamics for tracking therapy. However, cross-disciplinary researchers striving to develop new approaches often lack disease understanding or sufficient contacts within the medical community. Their research can benefit greatly from labels and annotations of basic information in the images, such as tumor locations, which are obvious to radiologists.
Crowd-sourcing the creation of publicly accessible reference data sets could address this challenge. In 2011, the National Cancer Institute funded development of The Cancer Imaging Archive (TCIA), a free and open-access database of medical images. However, most TCIA collections lack the labels and annotations that image processing researchers need to make progress in deep learning and radiomics. To close this gap, TCIA partnered with the Radiological Society of North America (RSNA) and numerous academic centers to harness the knowledge of RSNA meeting attendees to generate these tumor markups. The annotated data sets included CT scans from TCGA-LUAD, TCGA-KIRC, TCGA-LIHC, and TCGA-OV.
A full explanation of the project can be seen in the booth posters.
Data resulting from this experiment is available in the following formats:
- Source DICOM scans annotated by participants
- RSNA2017CCC-doiJNLP-e8nBWDCC.jnlp (Download Manager software requires Java to launch)
- DICOM metadata and X/Y/Z measurement coordinates
- DICOM-SR representation of crowd measurements
- TCGA Clinical Data
- Note: Because all subjects were drawn from The Cancer Genome Atlas cohorts, clinical data was available through the NCI Genomic Data Commons. A CSV export of that data is provided here for convenience.
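Because many participants measured the same scans, a common first step with the X/Y/Z measurement coordinates is to summarize the crowd's measurements per scan. The following is a minimal Python sketch under stated assumptions, not a description of the actual file layout: the column names (`series_uid`, `annotator`, `x1` ... `z2`) and the sample rows are hypothetical, and it assumes each row holds the two endpoints of one length measurement in a consistent physical unit. Check the real header of the downloaded file before adapting it.

```python
import csv
import io
import math
from collections import defaultdict
from statistics import median

# Hypothetical layout and made-up rows, for illustration only.
# The real measurement file may use different column names and units.
SAMPLE_CSV = """series_uid,annotator,x1,y1,z1,x2,y2,z2
1.2.3,a,0,0,0,3,4,0
1.2.3,b,0,0,0,6,8,0
1.2.4,a,1,1,1,1,1,6
"""


def measurement_lengths(rows):
    """Group the Euclidean length of each two-point measurement by series."""
    lengths = defaultdict(list)
    for row in rows:
        start = tuple(float(row[k]) for k in ("x1", "y1", "z1"))
        end = tuple(float(row[k]) for k in ("x2", "y2", "z2"))
        lengths[row["series_uid"]].append(math.dist(start, end))
    return lengths


def median_length_per_series(rows):
    """Reduce each series' crowd measurements to a single median length."""
    return {uid: median(vals) for uid, vals in measurement_lengths(rows).items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = median_length_per_series(rows)
for uid, length in summary.items():
    print(uid, length)
```

The median is used here because it is less sensitive than the mean to the occasional stray measurement; other consensus rules may suit a given analysis better. To run the sketch on a downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle and adjust the column names to match.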