SDTM datasets of clinical data and measurements for selected cancer collections to TCIA

The Data Integration & Imaging Informatics (DI-Cubed) project explored the lack of standardized data capture at the point of data creation, as reflected in the non-image data accompanying four TCIA breast cancer collections (ISPY1, BREAST-DIAGNOSIS, Breast-MRI-NACT-Pilot, TCGA-BRCA) and the Ivy Glioblastoma Atlas Project (Ivy GAP) brain cancer collection. The work addressed the desire for semantic interoperability between NCI initiatives by aligning on common clinical metadata elements and supporting use cases that connect clinical, imaging, and genomics data. Accordingly, clinical and measurement data imported into I2B2 were cross-mapped to industry-standard concepts for names and values, including those derived from the BRIDG, CDISC SDTM, and DICOM Structured Reporting models, using NCI Thesaurus, SNOMED CT, and LOINC controlled terminology.

A subset of the standardized data was then exported from I2B2 as SDTM-compliant SAS transport files. The SDTM data were derived from the curated TCIA spreadsheets and from tumor measurements and dates obtained through the TCIA RESTful API. Because of the nature of the available data, not all SDTM conformance rules were applicable or adhered to.

These Study Data Tabulation Model (SDTM) datasets were validated using Pinnacle 21 CDISC validation software, which checks datasets for conformance to rules developed for FDA submissions of electronic data. The datasets were refined iteratively based on group discussions and feedback from the validation tool.

Export datasets for the following SDTM domains were generated:

  • DM (Demographics)
  • DS (Disposition)
  • MI (Microscopic Findings)
  • PR (Procedures)
  • SS (Subject Status)
  • TU (Tumor/Lesion Identification)
  • TR (Tumor/Lesion Results)
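
For readers scripting against the export, the domain list above can be captured as a small lookup table. The following is a minimal Python sketch, not part of the DI-Cubed deliverables. It assumes that each transport file is named after its two-letter domain code (for example `dm.xpt`), which is a common SDTM convention but is not confirmed by this page; the helper name `domain_for_file` is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Domain codes and names exactly as listed on this page.
SDTM_DOMAINS = {
    "DM": "Demographics",
    "DS": "Disposition",
    "MI": "Microscopic Findings",
    "PR": "Procedures",
    "SS": "Subject Status",
    "TU": "Tumor/Lesion Identification",
    "TR": "Tumor/Lesion Results",
}


def domain_for_file(filename: str) -> Optional[str]:
    """Return the SDTM domain name for a transport file, or None.

    Assumption (not confirmed by this page): the file stem is the
    two-letter domain code, e.g. "dm.xpt" or "TR.XPT".
    """
    path = Path(filename)
    if path.suffix.lower() != ".xpt":
        return None  # not a SAS transport file by extension
    return SDTM_DOMAINS.get(path.stem.upper())
```

A file name that does not match one of the seven listed domains returns `None`, so unexpected files in a download can be flagged rather than silently processed.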

Data Access

Data Type    Download all or Query/Filter
SAS Transport Files (XPT)

Image Analysis (CSV)

Note: Please contact the TCIA Helpdesk with any questions regarding usage.
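
Before loading a downloaded transport file, it can be useful to confirm that it really is in SAS XPORT format. The Python sketch below checks for the library header record that opens a version 5 transport file. It assumes the files on this page are version 5 transport files, which the page does not state, and the function name `looks_like_xport` is hypothetical.

```python
# Leading bytes of the first 80-byte record in a SAS version 5 transport file.
XPORT_MAGIC = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"


def looks_like_xport(path) -> bool:
    """Return True if the file starts with the XPORT library header record.

    This is only a format sanity check; it does not validate SDTM content.
    """
    with open(path, "rb") as handle:
        return handle.read(len(XPORT_MAGIC)) == XPORT_MAGIC
```

A file that passes this check can then be opened with any XPORT-aware reader, for example `pandas.read_sas(path, format="xport")` if pandas is installed.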

Detailed Description

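TCIA breast cancer collections used:

  • ISPY1
  • BREAST-DIAGNOSIS
  • Breast-MRI-NACT-Pilot
  • TCGA-BRCA

TCIA brain cancer collection used:

  • Ivy Glioblastoma Atlas Project (Ivy GAP)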

Citations & Data Usage Policy 

These collections are freely available to browse, download, and use for commercial, scientific, and educational purposes as outlined in the Creative Commons Attribution 3.0 Unported License. Questions may be directed to the TCIA Helpdesk. Please be sure to acknowledge both this dataset and TCIA in publications by including the following citations in your work:

Data Citation

Hickman H., Ver Hoef W., Hastak S., Neville J., Clunie D., Wagner U., Helton E. (2019). SDTM datasets of clinical data and measurements for selected cancer collections to TCIA [Dataset]. The Cancer Imaging Archive. doi: 10.7937/TCIA.2019.zfv154m9

TCIA Citation

Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December 2013, pp. 1045-1057.

Other Publications Using This Data

TCIA maintains a list of publications that leverage TCIA data. If you have a manuscript you'd like to add please contact the TCIA Helpdesk.

Version 1 (Current): 2019/06/21

Data Type    Download all or Query/Filter
SAS Transport Files (XPT)

Image Analysis (CSV)
