Summary

This collection contains subjects from the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium Lung Squamous Cell Carcinoma (CPTAC-LSCC) cohort. CPTAC is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics. Radiology and pathology images from CPTAC patients are being collected and made publicly available by The Cancer Imaging Archive to enable researchers to investigate cancer phenotypes which may correlate to corresponding proteomic, genomic and clinical data.

Imaging from each cancer type will be contained in its own TCIA Collection, with the collection name "CPTAC-cancertype".  Radiology imaging is collected from standard of care imaging performed on patients immediately before the pathological diagnosis, and from follow-up scans where available.  For this reason the radiology image data sets are heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. Pathology imaging is collected as part of the CPTAC qualification workflow.  

All CPTAC cohorts are released as either a single combined cohort, or split into Discovery and Confirmatory where applicable. There are two main types of proteomic studies: discovery proteomics and targeted proteomics. "Discovery proteomics" refers to the "untargeted" identification and quantification of a maximal number of proteins in a biological or clinical sample. "Targeted proteomics" refers to quantitative measurements on a defined subset of the total proteins in a biological or clinical sample, often following discovery proteomics studies to confirm the targets of interest they identified. Commonly used proteomic technologies and platforms include various types of mass spectrometry and protein microarrays, chosen according to the needs, throughput and sample input requirements of an analysis. Nanotechnologies and automation are under further development to improve the detection of low-abundance proteins, increase throughput, and selectively reach a target protein in vivo. Once the protein targets of interest are identified, high-throughput targeted assays are developed for confirmatory studies, which test whether the initial findings were accurate. A summary of CPTAC imaging efforts can be found on the CPTAC Imaging Proteomics page.

CPTAC Imaging Special Interest Group

You can join the CPTAC Imaging Special Interest Group to be notified of webinars & data releases, collaborate on common data wrangling tasks and seek out partners to explore research hypotheses!  Artifacts from previous webinars such as slide decks and video recordings can be found on the CPTAC SIG Webinars page.


Acknowledgements

We would like to acknowledge the individuals and institutions that have provided data for this collection:




Data Access

Data Type | Download all or Query/Filter | License
Images (DICOM, 30.6 GB) | (Download requires the NBIA Data Retriever) |
Tissue Slide Images (SVS, 414 GB) | (Download and apply the IBM-Aspera-Connect plugin to your browser to retrieve this faspex package) |

Click the Versions tab for more info about data releases.




Detailed Description



 | Radiology Image Statistics | Pathology Image Statistics
Modalities | CT, PT | Pathology
Number of Participants | 36 | 212
Number of Studies | 43 | N/A
Number of Series | 238 | N/A
Number of Images | 52,019 | 1,081
Image Size (GB) | 30.6 | 414

Accessing CPTAC publication cohorts

All CPTAC cohorts are released as either a single combined cohort, or split into Discovery and Confirmatory where applicable. In the case of CPTAC-LSCC there was a "Discovery Cohort" release. Images associated with these cases can be downloaded using the following links:

Accessing the Proteomic & Genomic Clinical Data

To access or download the clinical data on the Proteomic Data Commons (PDC) and Genomic Data Commons (GDC), first identify the data of interest, then move to the 'Clinical' tab on the browse page. Use the checkboxes to select a specific row, all rows on the page, or all pages, then click the export clinical manifest button. The GDC exports the manifest in CSV or TSV format; the PDC exports it in TSV or JSON format.

A Note about TCIA and CPTAC Subject Identifiers and Dates

Subject Identifiers: 

A subject with radiology and pathology images stored in TCIA is identified with a de-identified project Patient ID that is identical to the Patient ID of the same subject with clinical, proteomic, and/or genomic data stored in other CPTAC databases and web sites.
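Because the Patient ID is shared, imaging subjects can be matched to rows of an exported clinical manifest with a simple key lookup. The sketch below is only an illustration: it assumes a TSV export whose Patient ID column is named `case_submitter_id`, and the column names, helper functions, and sample rows are all hypothetical placeholders rather than values taken from an actual export.

```python
import csv
import io

# Hypothetical clinical manifest content. The column names and rows are
# placeholders for illustration, not real CPTAC data.
MANIFEST_TSV = (
    "case_submitter_id\tprimary_diagnosis\n"
    "SUBJ-001\tplaceholder diagnosis A\n"
    "SUBJ-002\tplaceholder diagnosis B\n"
    "SUBJ-003\tplaceholder diagnosis C\n"
)


def index_manifest(tsv_text, id_column="case_submitter_id"):
    """Return a dict mapping each Patient ID to its manifest row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_column]: row for row in reader}


def match_subjects(imaging_ids, manifest_index):
    """Split imaging Patient IDs into those with and without a clinical row."""
    matched = {pid: manifest_index[pid] for pid in imaging_ids if pid in manifest_index}
    missing = [pid for pid in imaging_ids if pid not in manifest_index]
    return matched, missing


manifest_index = index_manifest(MANIFEST_TSV)

# Patient IDs as they might be collected from the imaging data (assumed list).
matched, missing = match_subjects(["SUBJ-001", "SUBJ-003", "SUBJ-999"], manifest_index)
```

Keeping the unmatched IDs in a separate list makes it easy to spot subjects whose clinical data has not been exported yet.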

Dates: 

The radiology imaging data is in DICOM format. To provide temporal context aligned with events in the clinical data set for each patient, TCIA has inserted information in DICOM tag (0012,0050) Clinical Trial Time Point ID. This tag contains the number of days from the date the patient was initially diagnosed pathologically with the disease to the date of the scan. For example, a scan acquired 3 days before the diagnosis would contain the value -3, and a follow-up scan acquired 90 days after diagnosis would contain the value 90.
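As a minimal sketch of how this value might be interpreted, the snippet below parses the text stored in tag (0012,0050) into a signed day count and describes it relative to diagnosis. The function names are hypothetical, and the value is passed in as a plain string; with a DICOM library such as pydicom it could typically be read through the `ClinicalTrialTimePointID` keyword, which is an assumption about your tooling rather than part of this collection's documentation.

```python
def days_from_diagnosis(time_point_id):
    """Parse the Clinical Trial Time Point ID value into a signed day count."""
    return int(str(time_point_id).strip())


def describe_scan(time_point_id):
    """Describe a scan's timing relative to the pathological diagnosis."""
    days = days_from_diagnosis(time_point_id)
    if days < 0:
        return f"{-days} day(s) before diagnosis"
    if days == 0:
        return "on the day of diagnosis"
    return f"{days} day(s) after diagnosis"


# The two examples from the text: a pre-diagnosis scan and a follow-up scan.
pre_diagnosis = describe_scan("-3")
follow_up = describe_scan("90")
```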

The DICOM date tags (i.e. birth dates, imaging study dates, etc.) are modified per TCIA's standard process, which offsets them by a random number of days. The offset shifts each date to between 3 and 10 years before the real date. It is consistent within each TCIA image-submitting site and collection, but varies among sites and among collections from the same site. Thus, when more than one study has been archived, the number of days between a subject’s longitudinal imaging studies is accurately preserved while still meeting HIPAA requirements.
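The reason intervals survive this process is that the same offset is subtracted from every date within a site and collection. The sketch below illustrates the arithmetic with made-up dates and a hypothetical offset of 2000 days; it is not TCIA's actual de-identification code, and the real offsets are not published.

```python
from datetime import date, timedelta


def shift_dates(dates, offset_days):
    """Move every date earlier by the same number of days."""
    return [d - timedelta(days=offset_days) for d in dates]


# Made-up study dates for one subject (illustration only).
real_dates = [date(2017, 3, 1), date(2017, 6, 1)]

# Hypothetical offset; 2000 days falls between 3 and 10 years.
offset_days = 2000

shifted_dates = shift_dates(real_dates, offset_days)

# The gap between the two studies is the same before and after shifting.
real_gap = (real_dates[1] - real_dates[0]).days
shifted_gap = (shifted_dates[1] - shifted_dates[0]).days
```

Because the offset differs among sites and collections, intervals should only be compared between studies from the same site and collection.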




Citations & Data Usage Policy 


National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Lung Squamous Cell Carcinoma Collection (CPTAC-LSCC) (Version 14) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.6EMUB5L2


The CPTAC program requests that publications using data from this program include the following statement: “Data used in this publication were generated by the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC).”


Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. In Journal of Digital Imaging (Vol. 26, Issue 6, pp. 1045–1057). Springer Science and Business Media LLC. https://doi.org/10.1007/s10278-013-9622-7

Other Publications Using This Data

TCIA maintains a list of publications which leverage TCIA data. If you have a manuscript you'd like to add, please contact TCIA's Helpdesk.




Version 14 (Current): Updated 2023/02/24

Data Type | Download all or Query/Filter
Images (DICOM, 30.6 GB) | (Download requires the NBIA Data Retriever)
Tissue Slide Images (SVS, 414 GB) | (Download and apply the IBM-Aspera-Connect plugin to your browser to retrieve this faspex package)
Discovery Study (CPTAC Data Portal)

Radiology modality data cleanup to remove extraneous scans.

Version 13: Updated 2021/01/27

Data Type | Download all or Query/Filter
Images (DICOM, 31 GB) | (Download requires the NBIA Data Retriever)
Tissue Slide Images (SVS, 414 GB)
Discovery Study (CPTAC Data Portal)

New Genomic and Proteomic data added

Version 12: Updated 2020/06/25

Data Type | Download all or Query/Filter
Images (DICOM, 28.8 GB)
Tissue Slide Images (SVS, 414 GB)
Proteomics (web)




Added 12 participants' radiology imaging from two sites.

Version 11: Updated 2020/03/31


Data Type | Download all or Query/Filter
Images (DICOM, 18.7 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (SVS, 414 GB)
Proteomics (web)





Added 1 radiology subject

Version 10: Updated 2020/02/14

Data Type | Download all or Query/Filter
Images (DICOM, 18 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (SVS, 414 GB)
Proteomics (web)





Added 6 new pathology subjects and 74 new pathology images.

Version 9: Updated 2020/01/24


Data Type | Download all or Query/Filter
Images (DICOM, 18 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (SVS, 341 GB)
Proteomics (web)





Added 51 new pathology patients and 367 new WSIs.

Version 8: Updated 2019/12/19


Data Type | Download all or Query/Filter
Images (DICOM, 18 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (SVS, 148 GB)
Proteomics (web)





Added 3 new radiological subjects.

Transferred two subjects' radiology imaging (C3N-00738, C3N-02290) to CPTAC-LUAD as a correction.

Version 7: Updated 2019/09/30


Data Type | Download all or Query/Filter
Images (DICOM, 19.5 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (SVS, 147.6 GB)
Proteomics (web)





Added new subjects.

Version 6: Updated 2019/06/30

Data Type | Download all or Query/Filter
Images (DICOM, 13.9 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (SVS, 147.6 GB)
Proteomics (web)





Added Subjects.

Version 5: Updated 2019/03/31


Data Type | Download all or Query/Filter
Images (DICOM, 12.6 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (web)
Proteomics (web)





Added new subjects.

Version 4: Updated 2018/12/31

Data Type | Download all or Query/Filter
Images (DICOM, 2.5 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (web)
Proteomics (web)





Added new subjects.

Version 3: Updated 2018/06/30


Data Type | Download all or Query/Filter
Images (DICOM, 0.6 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (web)
Proteomics (web)





Added new subjects.

Version 2: Updated 2018/01/10


Data Type | Download all or Query/Filter
Images (DICOM, 0.4 GB) | (Requires NBIA Data Retriever)
Tissue Slide Images (web)
Proteomics (web)





Added new subjects.

Version 1: Updated 2018/01/10


Data Type | Download all or Query/Filter
Images (DICOM, 0.132 GB)
Tissue Slide Images (web)
Proteomics (web)