Overview
The Cancer Imaging Archive (TCIA) is a publisher of cancer-related data. Each Collection TCIA publishes is issued a Digital Object Identifier (DOI) through DataCite so that its contents become discoverable and the associated metadata is available to the community. DataCite provides a REST API that can be used to search metadata according to its published schema. DataCite Commons is a web search interface for the PID Graph, the graph formed by scholarly resources such as publications, datasets, people, and research organizations, and the connections between them.
The DataCite REST API can be used to programmatically access Collection metadata such as DOIs, titles, and abstracts. Please note that this API was not developed by TCIA and is not supported through the TCIA help desk. Refer to the documentation below for how to use the DataCite REST API to query TCIA metadata, and see https://support.datacite.org/ for any technical questions. The TCIA Helpdesk may be able to assist if your inquiry is related to the content of the data itself.
Official DataCite Documentation
The Cancer Imaging Archive is identified within DataCite as:
- prefix-id = 10.7937
- client-id = sml.tcia
- provider-id = tciar
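These identifiers can serve as filters when listing DOIs through the REST API. The sketch below is a minimal illustration, assuming the `prefix`, `client-id`, `provider-id`, and `page[size]` query parameters of the DataCite `/dois` endpoint; the helper name `build_tcia_query_url` and the `TCIA_FILTERS` mapping are hypothetical. It only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

DATACITE_DOIS_URL = "https://api.datacite.org/dois"

# TCIA identifiers listed above, keyed by the query parameter
# assumed to accept each of them on the /dois endpoint.
TCIA_FILTERS = {
    "prefix": "10.7937",
    "client-id": "sml.tcia",
    "provider-id": "tciar",
}


def build_tcia_query_url(filter_name="client-id", page_size=25):
    """Hypothetical helper: build a /dois URL restricted to TCIA records."""
    params = {
        filter_name: TCIA_FILTERS[filter_name],
        "page[size]": page_size,  # number of records per page
    }
    # urlencode percent-encodes the square brackets in "page[size]".
    return f"{DATACITE_DOIS_URL}?{urlencode(params)}"


print(build_tcia_query_url("client-id", page_size=10))
```

Filtering by `client-id` is probably the narrowest choice here, since a prefix or provider could in principle be shared by more than one repository.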
Properties
TCIA Metadata in DataCite
TCIA utilizes the following Properties of the DataCite schema.
Table 1: DataCite Mandatory Properties

| ID | Property | Obligation | Description |
|---|---|---|---|
| 1 | Identifier | M | |
| 2 | Creator | M | Authors of the Dataset, preferably with ORCID iD |
| 3 | Title | M | Published Title of the Dataset |
| 4 | Publisher | M | The Cancer Imaging Archive |
| 5 | PublicationYear | M | The Year the Dataset was published in TCIA |
| 10 | ResourceType | M | Dataset; Equivalent to a TCIA Collection |

Table 2: DataCite Recommended and Optional Properties

| ID | Property | Obligation | Description |
|---|---|---|---|
| 11 | AlternateIdentifier | O | |
| 15 | Version | O | The Current Version of the Dataset |
| 16 | Rights | O | |
| 17 | Description | R | Dataset Abstract |

...
* indicates properties that are "Recommended and Optional" per the DataCite Schema, whereas the others are required to create a DOI.
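In the JSON returned by the DataCite REST API, these schema properties appear under `data.attributes`. The sketch below shows one way to pull them out of a record, assuming the attribute keys `doi`, `creators`, `titles`, `publisher`, `publicationYear`, `types`, `version`, and `descriptions`. The function `summarize_record` is hypothetical, and the sample record contains placeholder values, not real TCIA metadata.

```python
def summarize_record(record):
    """Hypothetical helper: map DataCite JSON attributes to the properties above."""
    attrs = record.get("data", {}).get("attributes", {})
    titles = attrs.get("titles") or []
    descriptions = attrs.get("descriptions") or []
    # The dataset abstract is assumed to be the description typed "Abstract".
    abstract = next(
        (d.get("description") for d in descriptions
         if d.get("descriptionType") == "Abstract"),
        None,
    )
    return {
        "identifier": attrs.get("doi"),
        "creators": [c.get("name") for c in attrs.get("creators") or []],
        "title": titles[0].get("title") if titles else None,
        # Depending on request options, publisher may be a string or an object.
        "publisher": attrs.get("publisher"),
        "publication_year": attrs.get("publicationYear"),
        "resource_type": (attrs.get("types") or {}).get("resourceTypeGeneral"),
        "version": attrs.get("version"),
        "abstract": abstract,
    }


# Placeholder record for illustration only; every value is made up.
sample_record = {
    "data": {
        "attributes": {
            "doi": "10.1234/example",
            "creators": [{"name": "Doe, Jane"}],
            "titles": [{"title": "Example Collection"}],
            "publisher": "Example Publisher",
            "publicationYear": 2020,
            "types": {"resourceTypeGeneral": "Dataset"},
            "version": "1",
            "descriptions": [
                {"description": "Example abstract.", "descriptionType": "Abstract"}
            ],
        }
    }
}

summary = summarize_record(sample_record)
print(summary["title"], summary["publication_year"])
```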
TCIA-Utils
The tcia_utils package contains functions that simplify common tasks when interacting with The Cancer Imaging Archive (TCIA) via Python. Issues with this package should be submitted at https://github.com/kirbyju/tcia_utils/issues. The package can be installed with pip:
pip install tcia_utils
To import functions related to DataCite:
from tcia_utils import datacite
An example notebook demonstrating tcia_utils functionality with DataCite's API can be found at https://github.com/kirbyju/TCIA_Notebooks/blob/main/TCIA_DataCite_Queries.ipynb.
Example Queries
Retrieve a single DataCite record in JSON format
For this example we are using a published Collection called "Pseudo-PHI-DICOM-Data": https://api.datacite.org/dois/10.7937/s17z-r072
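The same request can be made from Python using only the standard library. The sketch below is one possible approach, not part of tcia_utils: the function names `doi_record_url` and `fetch_doi_record` are hypothetical, and the `Accept: application/vnd.api+json` header is an assumption based on the JSON:API media type that DataCite uses.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

DATACITE_DOIS_URL = "https://api.datacite.org/dois"


def doi_record_url(doi):
    """Hypothetical helper: URL of the DataCite record for a single DOI."""
    # Keep the "/" between prefix and suffix; escape anything unusual.
    return f"{DATACITE_DOIS_URL}/{quote(doi, safe='/')}"


def fetch_doi_record(doi, timeout=30):
    """Hypothetical helper: download and parse one DataCite record."""
    request = Request(
        doi_record_url(doi),
        headers={"Accept": "application/vnd.api+json"},  # assumed media type
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


print(doi_record_url("10.7937/s17z-r072"))
```

With network access, `fetch_doi_record("10.7937/s17z-r072")` should return the parsed record as a dictionary, with the metadata fields expected under `record["data"]["attributes"]`.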
...