Overview
The Cancer Imaging Archive (TCIA) is a publisher of cancer-related data. Each Collection is issued a Digital Object Identifier (DOI) through DataCite so that its contents become discoverable and its associated metadata is available to the community. DataCite provides a REST API that can be used to search this metadata according to DataCite's published schema. Please note that this API was not developed by TCIA and is not supported through the TCIA help desk.
Please refer to the documentation below for how to use the DataCite REST API to query TCIA metadata:
- DataCite REST API - https://support.datacite.org/reference/introduction
- DataCite REST API Guide - https://support.datacite.org/docs/api
- DataCite Schema - http://schema.datacite.org
The Cancer Imaging Archive is identified within DataCite as:
- prefix-id = 10.7937
- client-id = sml.tcia
- provider-id = tciar
Properties
TCIA utilizes the following Properties of the DataCite schema.
Table 1: DataCite Mandatory Properties
ID | Property |
---|---|
1 | Identifier (DOI of the Dataset) |
2 | Creator (Authors of the Dataset, preferably with an ORCID iD) |
3 | Title (Published Title of the Dataset) |
4 | Publisher (The Cancer Imaging Archive) |
5 | PublicationYear (The Year the Dataset was published in TCIA) |
10 | ResourceType (Dataset; Equivalent to a TCIA Collection) |
Table 2: DataCite Recommended and Optional Properties
ID | Property |
---|---|
11 | AlternateIdentifier (TCIA "Short Name" for the Dataset. These short names appear in various places such as https://www.cancerimagingarchive.net/collections/ and https://www.cancerimagingarchive.net/tcia-analysis-results/) |
15 | Version (The Current Version of the Dataset) |
16 | Rights (Licensing Information) |
17 | Description (Dataset Abstract) |
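The properties in Tables 1 and 2 can be read out of a record returned by the REST API. The following is a minimal sketch, not an official client: it assumes the JSON:API layout in which each record carries an `attributes` object with keys such as `doi`, `creators`, `titles`, `publicationYear`, `types`, `alternateIdentifiers`, `version`, `rightsList`, and `descriptions`. The function name `summarize_record` and the sample record are hypothetical and exist only for illustration; verify the key names against the DataCite documentation before relying on them.

```python
from typing import Any


def summarize_record(record: dict[str, Any]) -> dict[str, Any]:
    """Collect the Table 1 and Table 2 properties from one DOI record.

    Assumes the record is a dict shaped like one item of the API's "data"
    member; missing keys yield None or an empty list instead of raising.
    """
    attrs = record.get("attributes") or {}
    titles = attrs.get("titles") or []
    return {
        "identifier": attrs.get("doi"),
        "creators": [c.get("name") for c in attrs.get("creators") or []],
        "title": titles[0].get("title") if titles else None,
        "publisher": attrs.get("publisher"),
        "publication_year": attrs.get("publicationYear"),
        "resource_type": (attrs.get("types") or {}).get("resourceTypeGeneral"),
        "alternate_identifiers": [
            a.get("alternateIdentifier")
            for a in attrs.get("alternateIdentifiers") or []
        ],
        "version": attrs.get("version"),
        "rights": [r.get("rights") for r in attrs.get("rightsList") or []],
        "descriptions": [
            d.get("description") for d in attrs.get("descriptions") or []
        ],
    }


# Hand-written sample for illustration only; this is NOT real TCIA metadata.
sample_record = {
    "id": "10.7937/example",
    "attributes": {
        "doi": "10.7937/EXAMPLE",
        "creators": [{"name": "Doe, Jane"}],
        "titles": [{"title": "Example Dataset"}],
        "publisher": "The Cancer Imaging Archive",
        "publicationYear": 2016,
        "types": {"resourceTypeGeneral": "Dataset"},
        "alternateIdentifiers": [{"alternateIdentifier": "Example-Short-Name"}],
        "version": "1",
        "rightsList": [{"rights": "Example license"}],
        "descriptions": [{"description": "Example abstract."}],
    },
}

summary = summarize_record(sample_record)
```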
Examples
Retrieve a single DataCite record in JSON format.
For this example we are using a Published Collection called "Pseudo-PHI-DICOM-Data":
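The DOI of this Collection is not reproduced here, so the sketch below uses a placeholder DOI (`10.7937/EXAMPLE`, hypothetical) to show the general shape of a single-record request: the DOI is appended to the `/dois/` path of the API. The helper names `doi_record_url` and `fetch_json` are illustrative only, and the `Accept` header value is an assumption based on the API's JSON:API format.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.datacite.org"


def doi_record_url(doi: str) -> str:
    """Build the URL of a single DOI record.

    The "/" between DOI prefix and suffix is kept as-is; any other
    reserved characters in the DOI are percent-encoded.
    """
    return f"{API_ROOT}/dois/{urllib.parse.quote(doi, safe='/')}"


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Send a GET request and decode the JSON response body."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.api+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder DOI; substitute the DOI of the Collection you want to retrieve.
example_url = doi_record_url("10.7937/EXAMPLE")

# To actually perform the request (requires network access):
# record = fetch_json(example_url)
# print(record["data"]["attributes"]["titles"])
```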
Return a list of DOIs using the TCIA provider id (tciar):
https://api.datacite.org/dois?provider-id=tciar
By default, only 25 records are returned. You can control the number of records returned using pagination options. For example, to return only 5 records:
https://api.datacite.org/dois?provider-id=tciar&page[size]=5
or
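The same paginated request can be sketched in Python. This is an illustrative helper, not part of any TCIA or DataCite tooling: the function name `list_dois_url` is hypothetical, and the `page[number]` parameter is an assumption about the API's pagination options beyond the `page[size]` parameter shown above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_ROOT = "https://api.datacite.org"


def list_dois_url(
    provider_id: str = "tciar", page_size: int = 25, page_number: int = 1
) -> str:
    """Build a /dois listing URL for one provider with pagination options.

    urlencode percent-encodes the square brackets in "page[size]";
    the decoded parameter names are unchanged.
    """
    params = {
        "provider-id": provider_id,
        "page[size]": page_size,
        "page[number]": page_number,  # assumed parameter, see lead-in
    }
    return f"{API_ROOT}/dois?{urlencode(params)}"


# Request the first 5 records registered under the TCIA provider id.
url = list_dois_url(page_size=5)
decoded = parse_qs(urlsplit(url).query)
```

Decoding the query string with `parse_qs`, as in the last line, is a quick way to confirm which parameters the URL actually carries before sending it.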
Query on specific information populated in the DataCite schema
For instance, return the records published by The Cancer Imaging Archive that were published in 2016:
or
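One way to express a query like this in Python is sketched below. It assumes that the `/dois` endpoint accepts a `query` parameter in `field:value` form (here `publicationYear:2016`) and that it can be combined with the `provider-id` filter; both should be checked against the DataCite REST API Guide. The function name `dois_by_year_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_ROOT = "https://api.datacite.org"


def dois_by_year_url(year: int, provider_id: str = "tciar") -> str:
    """Build a /dois URL restricted to one provider and publication year.

    Assumption: the API's "query" parameter accepts "field:value" syntax
    on schema fields such as publicationYear.
    """
    params = {
        "provider-id": provider_id,
        "query": f"publicationYear:{year}",
    }
    return f"{API_ROOT}/dois?{urlencode(params)}"


# Records registered by the TCIA provider with PublicationYear 2016.
url_2016 = dois_by_year_url(2016)
decoded_2016 = parse_qs(urlsplit(url_2016).query)
```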
Use the "activities" endpoint to see metadata updates in JSON format for a specified DataCite record.
For this example we are using a Published Collection called "Pseudo-PHI-DICOM-Data":
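The DOI of this Collection is not reproduced here, so the following sketch uses a placeholder DOI (`10.7937/EXAMPLE`, hypothetical). It assumes the activities for a record are exposed at `/dois/{doi}/activities`; confirm the exact path in the DataCite REST API reference. The function name `doi_activities_url` is illustrative only.

```python
import urllib.parse

API_ROOT = "https://api.datacite.org"


def doi_activities_url(doi: str) -> str:
    """Build the URL listing metadata updates (activities) for one DOI.

    Assumption: activities are nested under the DOI record path.
    """
    quoted = urllib.parse.quote(doi, safe="/")
    return f"{API_ROOT}/dois/{quoted}/activities"


# Placeholder DOI; substitute the DOI of the Collection of interest.
activities_url = doi_activities_url("10.7937/EXAMPLE")
```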