Summary

Medical image biomarkers of cancer promise improvements in patient care through advances in precision medicine. Compared to genomic biomarkers, image biomarkers have the advantages of being non-invasive and of characterizing a heterogeneous tumor in its entirety, rather than only the limited tissue available from a biopsy. We developed a unique radiogenomic dataset from a Non-Small Cell Lung Cancer (NSCLC) cohort of 211 subjects. The dataset comprises Computed Tomography (CT) and Positron Emission Tomography (PET)/CT images, semantic annotations of the tumors as observed on the medical images using a controlled vocabulary, segmentation maps of tumors in the CT scans, and quantitative values obtained from the PET/CT scans. Imaging data are also paired with gene mutation and RNA sequencing data from samples of surgically excised tumor tissue, and with clinical data, including survival outcomes. This dataset was created to facilitate the discovery of the underlying relationship between genomic and medical image features, as well as the development and evaluation of prognostic medical image biomarkers.

Further details regarding this dataset may be found in Bakr et al., Sci Data. 2018 Oct 16;5:180202. doi: 10.1038/sdata.2018.202, https://www.ncbi.nlm.nih.gov/pubmed/30325352.

For additional inquiries contact the TCIA Helpdesk.


Data Access

Click the Download button to save a ".tcia" manifest file to your computer, which you must open with the NBIA Data Retriever. Click the Search button to open our Data Portal, where you can browse the data collection and/or download a subset of its contents.
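As a rough illustration of what such a manifest holds, the sketch below parses manifest text under the assumption that it is a plain-text file of key=value settings followed by a "ListOfSeriesToDownload=" line and one Series Instance UID per line. That layout, the sample keys, the made-up UIDs, and the function name parse_tcia_manifest are all assumptions for illustration; the NBIA Data Retriever remains the tool this page names for opening the file.

```python
def parse_tcia_manifest(text):
    """Split manifest text into (settings, series_uids).

    Assumed layout (not taken from a specification): key=value lines,
    then 'ListOfSeriesToDownload=' followed by one series UID per line.
    """
    settings = {}
    series_uids = []
    in_series_list = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if in_series_list:
            series_uids.append(line)
        elif line.startswith("ListOfSeriesToDownload"):
            in_series_list = True  # every later line is a UID
        elif "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings, series_uids


# Hypothetical manifest content; the keys and UIDs below are made up.
SAMPLE_MANIFEST = """\
manifestVersion=3.0
includeAnnotation=true
ListOfSeriesToDownload=
1.2.3.4.5.1
1.2.3.4.5.2
"""

settings, series_uids = parse_tcia_manifest(SAMPLE_MANIFEST)
print(settings["manifestVersion"], len(series_uids))
```

Counting the UIDs this way gives a quick check of how many series a manifest will request before starting a large download.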

Data Type (Download all or Query/Filter)

  • Images and Segmentations (DICOM, 97.6 GB)
  • AIM Annotations (XML, zip)
  • Clinical Data (csv)
  • RNA sequence data (web). Note: 130 subject subset

Click the Versions tab for more info about data releases.

Third Party Analyses of this Dataset

TCIA encourages the community to publish their analyses of our datasets. Below is a list of such third party analyses published using this Collection:

Detailed Description

Collection Statistics


  • Modalities: CT, PT, SEG
  • Number of Participants: 211
  • Number of Studies: 303
  • Number of Series: 1355
  • Number of Images: 285,411
  • Image Size (GB): 97.6

This collection was originally submitted to TCIA as a 26 subject pilot data set. You can learn more about that subset of the collection in the following Analysis Results publication:

Data Citation

Napel, Sandy, & Plevritis, Sylvia K. (2014). NSCLC Radiogenomics: Initial Stanford Study of 26 Cases. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2014.X7ONY6B1

Citations & Data Usage Policy 

Users of this data must abide by the TCIA Data Usage Policy and the Creative Commons Attribution 3.0 Unported License under which it has been published. Attribution should include references to the following citations:

Data Citation

Bakr, Shaimaa; Gevaert, Olivier; Echegaray, Sebastian; Ayers, Kelsey; Zhou, Mu; Shafiq, Majid; Zheng, Hong; Zhang, Weiruo; Leung, Ann; Kadoch, Michael; Shrager, Joseph; Quon, Andrew; Rubin, Daniel; Plevritis, Sylvia; Napel, Sandy. (2017). Data for NSCLC Radiogenomics Collection. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2017.7hs46erv

Publication Citation

Bakr, S., Gevaert, O., Echegaray, S., Ayers, K., Zhou, M., Shafiq, M., Zheng, H., Benson, J. A., Zhang, W., Leung, A., Kadoch, M., Hoang, C. D., Shrager, J., Quon, A., Rubin, D. L., Plevritis, S. K., & Napel, S. (2018). A radiogenomic dataset of non-small cell lung cancer. Scientific data, 5, 180202. https://doi.org/10.1038/sdata.2018.202

Publication Citation

Gevaert, O., Xu, J., Hoang, C. D., Leung, A. N., Xu, Y., Quon, A., … Plevritis, S. K. (2012, August). Non–Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers by Leveraging Public Gene Expression Microarray Data—Methods and Preliminary Results. Radiology. Radiological Society of North America (RSNA). http://doi.org/10.1148/radiol.12111607

TCIA Citation

Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. (paper)

Other Publications Using This Data

If you have a publication you'd like to add, please contact the TCIA Helpdesk.

  1. Aonpong, P., Iwamoto, Y., Wang, W., Lin, L., & Chen, Y.-W. (2020). Hand-Crafted and Deep Learning-Based Radiomics Models for Recurrence Prediction of Non-Small Cells Lung Cancers. Innovation in Medicine and Healthcare, 192, 135-144. doi:https://doi.org/10.1007/978-981-15-5852-8_13
  2. Aonpong, P., Iwamoto, Y., Wang, W., Lin, L., & Chen, Y.-W. (2021). Genomics-Based Models for Recurrence Prediction of Non-small Cells Lung Cancers. Paper presented at the KES International Conferences on Innovation in Medicine and Healthcare (KES-InMed-21), Online only. 
  3. Choi, J., Cho, H. H., Kwon, J., Lee, H. Y., & Park, H. (2021). A Cascaded Neural Network for Staging in Non-Small Cell Lung Cancer Using Pre-Treatment CT. Diagnostics (Basel), 11(6). doi:10.3390/diagnostics11061047 
  4. Großmann, P. B. H. J. (2018). Defining the biological and clinical basis of radiomics: towards clinical imaging biomarkers. (Ph.D. Thesis). Universitaire Pers Maastricht, Maastricht. Retrieved from https://cris.maastrichtuniversity.nl/portal/files/24774850/c5959.pdf
  5. Kadoya, N., Tanaka, S., Kajikawa, T., Tanabe, S., Abe, K., Nakajima, Y., . . . Jingu, K. (2020). Homology-based radiomic features for prediction of the prognosis of lung cancer based on CT-based radiomics. Med Phys. doi:10.1002/mp.14104
  6. Kalpathy-Cramer, J., Mamomov, A., Zhao, B., Lu, L., Cherezov, D., Napel, S., . . . Goldgof, D. (2016). Radiomics of Lung Nodules: A Multi-Institutional Study of Robustness and Agreement of Quantitative Imaging Features. Tomography: a journal for imaging research, 2(4), 430-437. doi:10.18383/j.tom.2016.00235
  7. Koyasu, S., Nishio, M., Isoda, H., Nakamoto, Y., & Togashi, K. (2020). Usefulness of gradient tree boosting for predicting histological subtype and EGFR mutation status of non-small cell lung cancer on (18)F FDG-PET/CT. Ann Nucl Med, 34(1), 49-57. doi:https://doi.org/10.1007/s12149-019-01414-0
  8. Leitner, B. P., & Perry, R. J. (2020). The Impact of Obesity on Tumor Glucose Uptake in Breast and Lung Cancer. JNCI Cancer Spectrum. doi:10.1093/jncics/pkaa007
  9. Mattonen, S. A., Davidzon, G. A., Benson, J., Leung, A. N. C., Vasanawala, M., Horng, G., . . . Nair, V. S. (2019). Bone Marrow and Tumor Radiomics at (18)F-FDG PET/CT: Impact on Outcome Prediction in Non-Small Cell Lung Cancer. Radiology, 190357. doi:10.1148/radiol.2019190357
  10. Mienye, I. D. (2021). Improved Machine Learning Algorithms with Application to Medical Diagnosis. (Ph.D. Dissertation). University of Johannesburg. Retrieved from https://www.researchgate.net/profile/Domor-Mienye/publication/350788174_Improved_Machine_Learning_Algorithms_with_Application_to_Medical_Diagnosis/links/6071d56092851c8a7bba864f/Improved-Machine-Learning-Algorithms-with-Application-to-Medical-Diagnosis.pdf Available from TCIA 10.7937/K9/TCIA.2017.7hs46erv database.
  11. Moitra, D., & Kr. Mandal, R. (2020). Classification of non-small cell lung cancer using one-dimensional convolutional neural network. Expert Systems with Applications, 159, 113564. doi:https://doi.org/10.1016/j.eswa.2020.113564
  12. Moitra, D., & Mandal, R. K. (2019). Automated AJCC (7th edition) staging of non-small cell lung cancer (NSCLC) using deep convolutional neural network (CNN) and recurrent neural network (RNN). Health Inf Sci Syst, 7(1), 14. doi:10.1007/s13755-019-0077-1
  13. Moitra, D., & Mandal, R. K. (2019). Automated grading of non-small cell lung cancer by fuzzy rough nearest neighbour method. Network Modeling Analysis in Health Informatics and Bioinformatics, 8(1). doi:10.1007/s13721-019-0204-6
  14. Moitra, D., & Mandal, R. K. (2020). Prediction of Non-small Cell Lung Cancer Histology by a Deep Ensemble of Convolutional and Bidirectional Recurrent Neural Network. Journal of Digital Imaging, 1-8. doi:10.1007/s10278-020-00337-x
  15. Morgado, J., Pereira, T., Silva, F., Freitas, C., Negrão, E., de Lima, B. F., . . . Oliveira, H. P. (2021). Machine Learning and Feature Selection Methods for EGFR Mutation Status Prediction in Lung Cancer. Applied Sciences, 11(7), 3273. doi:10.3390/app11073273
  16. Mukherjee, P., Zhou, M., Lee, E., Schicht, A., Balagurunathan, Y., Napel, S., . . . Gevaert, O. (2020). A shallow convolutional neural network predicts prognosis of lung cancer patients in multi-institutional computed tomography image datasets. Nature Machine Intelligence, 2, 274-282. doi:https://doi.org/10.1038/s42256-020-0173-6
  17. Nishio, M., Nishizawa, M., Sugiyama, O., Kojima, R., Yakami, M., Kuroda, T., & Togashi, K. (2018). Computer-aided diagnosis of lung nodule using gradient tree boosting and Bayesian optimization. PLoS One, 13(4), e0195875. doi:10.1371/journal.pone.0195875
  18. Saad, M., & Choi, T.-S. (2018). Computer-assisted subtyping and prognosis for non-small cell lung cancer patients with unresectable tumor. Computerized Medical Imaging and Graphics, 67, 1-8. doi:10.1016/j.compmedimag.2018.04.003
  19. Thomas, R., Schalck, E., Fourure, D., Bonnefoy, A., & Cervera-Marzal, I. (2021). 2Be3-Net: Combining 2D and 3D Convolutional Neural Networks for 3D PET Scans Predictions. Paper presented at the International Conference on Medical Imaging and Computer-Aided Diagnosis (MICAD 2021). 
  20. Torres, F. S., Akbar, S., Raman, S., Yasufuku, K., Schmidt, C., Hosny, A., . . . Leighl, N. B. (2021). End-to-End Non-Small-Cell Lung Cancer Prognostication Using Deep Learning Applied to Pretreatment Computed Tomography. JCO Clin Cancer Inform, 5, 1141-1150. doi:10.1200/cci.21.00096
  21. Trebeschi, S., Bodalal, Z., van Dijk, N., Boellaard, T. N., Apfaltrer, P., Tareco Bucho, T. M., . . . Beets-Tan, R. G. H. (2021). Development of a Prognostic AI-Monitor for Metastatic Urothelial Cancer Patients Receiving Immunotherapy. Front Oncol, 11, 637804. doi:10.3389/fonc.2021.637804
  22. Yousefi, B., Jahani, N., LaRiviere, M. J., Cohen, E., Hsieh, M.-K., Luna, J. M., . . . Kontos, D. (2019). Correlative hierarchical clustering-based low-rank dimensionality reduction of radiomics-driven phenotype in non-small cell lung cancer. Paper presented at the SPIE Medical Imaging, San Diego, California, United States. 

Version 4: Updated 2021/06/01

Data Type

Download all or Query/Filter

Images (DICOM, 97.6 GB)

(Requires the NBIA Data Retriever.)

AIM Annotations (XML, zip)

Clinical Data (csv)

  • Added missing image studies for the following cases: R01-009 (CT), R01-100 (PET/CT), and R01-111 (PET/CT).
  • SUV conversion factor DICOM tag (7053,1000) was added for the following Philips PET images: R01-074, R01-077, R01-079, R01-089, R01-098 and R01-137.
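To illustrate how a conversion factor of this kind might be applied, the sketch below turns a stored PET pixel value into an SUV. It assumes the usual DICOM rescale (stored value times Rescale Slope plus Rescale Intercept) followed by multiplication with the factor read from tag (7053,1000); that formula, the function name stored_value_to_suv, and the numbers in the example are assumptions for illustration, not a description of how this collection was processed. Check the scanner vendor's documentation before relying on it.

```python
def stored_value_to_suv(stored_value, rescale_slope, rescale_intercept,
                        suv_scale_factor):
    """Convert one stored PET pixel value to an SUV.

    Assumption: the rescaled value (stored * slope + intercept) is
    multiplied by the factor held in private tag (7053,1000).
    """
    rescaled = stored_value * rescale_slope + rescale_intercept
    return rescaled * suv_scale_factor


# Made-up values, chosen only to exercise the function.
suv = stored_value_to_suv(
    stored_value=1200,
    rescale_slope=2.0,
    rescale_intercept=0.0,
    suv_scale_factor=0.0005,
)
print(round(suv, 3))
```

Keeping the conversion in a small pure function makes it easy to apply per pixel or per region once the tag values have been read with a DICOM library of your choice.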

Version 3: Updated 2020/11/10

Data Type

Download all or Query/Filter

Images (DICOM, 97.6 GB)

(Requires the NBIA Data Retriever.)

AIM Annotations (XML, zip)

Clinical Data (csv)

  • A new version of R01-023 was created to correct a cranial-caudal flip of the segmentation of the CT volume (483 images) and the associated Segmentation object. The UIDs of the other scans were updated to preserve Study level consistency but were otherwise unmodified. The referenced UIDs within the AIM object for R01-023 were updated, and the object was renamed to R01-023v1.
  • R01-038 was updated to remove a coronal slice at the start of the CT volume. The extra slice made it difficult for some software to determine slice spacing.
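The slice-spacing problem described above can be sketched as follows. The example assumes each slice is described by its Image Orientation (Patient) direction cosines and Image Position (Patient) coordinates, flags slices whose normal differs from the most common one, and derives the spacing from the remaining slices. The function names, tolerance, and sample geometry are hypothetical, and this is only one plausible way software might perform the check.

```python
from collections import Counter


def slice_normal(orientation):
    """Cross product of the row and column direction cosines."""
    rx, ry, rz, cx, cy, cz = orientation
    return (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)


def check_slice_geometry(slices, tol=1e-3):
    """Return (spacing, off_axis) for (orientation, position) pairs.

    off_axis lists the indices of slices whose normal differs from the
    most common normal. spacing is the gap between consecutive remaining
    slices, or None when fewer than two remain or the gaps are not
    uniform within tol.
    """
    normals = [tuple(round(c, 3) for c in slice_normal(o)) for o, _ in slices]
    reference = Counter(normals).most_common(1)[0][0]
    off_axis = [i for i, n in enumerate(normals) if n != reference]
    # Project the positions of the consistent slices onto the shared normal.
    locations = sorted(
        sum(p * n for p, n in zip(position, reference))
        for (_, position), normal in zip(slices, normals)
        if normal == reference
    )
    gaps = [b - a for a, b in zip(locations, locations[1:])]
    uniform = bool(gaps) and max(gaps) - min(gaps) <= tol
    return (gaps[0] if uniform else None), off_axis


# Hypothetical volume: one coronal slice followed by four axial slices.
AXIAL = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
CORONAL = (1.0, 0.0, 0.0, 0.0, 0.0, -1.0)
volume = [(CORONAL, (-125.0, 0.0, 100.0))] + [
    (AXIAL, (-125.0, -125.0, z)) for z in (0.0, 2.5, 5.0, 7.5)
]

spacing, off_axis = check_slice_geometry(volume)
print(spacing, off_axis)
```

Software that instead takes the difference between the first two slice positions would be misled by the leading coronal slice, which is consistent with the difficulty this release note describes.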

Version 2: Updated 2017/02/28

Data Type

Download all or Query/Filter

Images (DICOM, 97.6 GB)

(Requires the NBIA Data Retriever.)

AIM Annotations (XML, zip)

Clinical Data (csv)

Version 1: Updated 2015/12/22

This collection was originally submitted to TCIA as a 26 subject pilot data set. You can learn more about that subset of the collection in the following Analysis Results publication:

NSCLC Radiogenomics: Initial Stanford Study of 26 Cases

