Overview

The Cancer Imaging Archive (TCIA) is a publisher of cancer-related data. Each Collection TCIA publishes is issued a Digital Object Identifier (DOI) through DataCite. DataCite provides a REST API for accessing DOI metadata, and DataCite Commons is a web search interface for the PID Graph, the graph formed by the collection of scholarly resources such as publications, datasets, people and research organizations, and their connections.

The DataCite REST API can be used to programmatically access Collection metadata, such as DOIs, titles and abstracts, according to DataCite's published schema. Please note that this API was not developed by TCIA and is not supported through the TCIA help desk. Please refer to the documentation below for how to use the DataCite REST API to query TCIA metadata. See https://support.datacite.org/ for any technical questions. The TCIA Helpdesk may be able to assist if your inquiry is related to the content of the data itself.

Official DataCite Documentation

The Cancer Imaging Archive is identified within DataCite as

  • prefix-id = 10.7937
  • client-id = sml.tcia
  • provider-id = tciar

Properties

TCIA Metadata in DataCite

TCIA utilizes the following Properties of the DataCite schema.

Table 1: DataCite Mandatory Properties

Property ID | Property        | Property Description
1           | Identifier      | DOI of the Dataset
2           | Creator         | Authors of the Dataset, preferably with ORCID iDs
3           | Title           | Published Title of the Dataset
4           | Publisher       | The Cancer Imaging Archive
5           | PublicationYear | The Year the Dataset was published in TCIA
10          | ResourceType    | Dataset; equivalent to a TCIA Collection

Table 2: DataCite Recommended and Optional Properties

Property ID | Property            | Property Description
11 *        | AlternateIdentifier | TCIA "Short Name" for the Dataset. These short names appear in various places such as https://www.cancerimagingarchive.net/collections/ and https://www.cancerimagingarchive.net/tcia-analysis-results/
15 *        | Version             | The Current Version of the Dataset
16 *        | Rights              | Licensing Information
17 *        | Description         | Dataset Abstract

* indicates properties that are "Recommended and Optional" per the DataCite Schema, whereas the others are required to create a DOI.
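As a rough sketch of how the properties in Tables 1 and 2 surface in a DataCite JSON record, the helper below pulls them out of a record's attributes. The JSON field names (creators, titles, types, alternateIdentifiers, rightsList, descriptions) reflect DataCite's JSON representation as understood here and should be checked against a live record. The function name summarize_record and the sample record are hypothetical, with placeholder values rather than real TCIA metadata.

```python
def summarize_record(record):
    """Map a DataCite JSON record to the properties listed in Tables 1 and 2."""
    attrs = record.get("data", {}).get("attributes", {})
    publisher = attrs.get("publisher")
    # Assumption: publisher may arrive as a plain string or as an object with a "name" key.
    if isinstance(publisher, dict):
        publisher = publisher.get("name")
    return {
        "Identifier": attrs.get("doi"),
        "Creator": [c.get("name") for c in attrs.get("creators") or []],
        "Title": [t.get("title") for t in attrs.get("titles") or []],
        "Publisher": publisher,
        "PublicationYear": attrs.get("publicationYear"),
        "ResourceType": (attrs.get("types") or {}).get("resourceTypeGeneral"),
        "AlternateIdentifier": [
            a.get("alternateIdentifier") for a in attrs.get("alternateIdentifiers") or []
        ],
        "Version": attrs.get("version"),
        "Rights": [r.get("rights") for r in attrs.get("rightsList") or []],
        "Description": [d.get("description") for d in attrs.get("descriptions") or []],
    }


# Hypothetical record with placeholder values, used only to exercise the helper.
sample_record = {
    "data": {
        "attributes": {
            "doi": "10.1234/example",
            "creators": [{"name": "Doe, Jane"}],
            "titles": [{"title": "Example Collection"}],
            "publisher": "Example Publisher",
            "publicationYear": 2020,
            "types": {"resourceTypeGeneral": "Dataset"},
            "alternateIdentifiers": [{"alternateIdentifier": "Example-Short-Name"}],
            "version": "1",
            "rightsList": [{"rights": "Example License"}],
            "descriptions": [{"description": "Example abstract."}],
        }
    }
}

summary = summarize_record(sample_record)
```

Missing optional properties come back as None or an empty list, so the helper can be applied to records that only carry the mandatory fields.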

TCIA-Utils

The tcia_utils package contains functions to simplify common tasks one might perform when interacting with The Cancer Imaging Archive (TCIA) via Python. Issues with this package should be submitted at https://github.com/kirbyju/tcia_utils/issues. Installation can be achieved with this pip command:

pip install tcia_utils

To import functions related to DataCite:

from tcia_utils import datacite

An example notebook demonstrating tcia_utils functionality with DataCite's API can be found at https://github.com/kirbyju/TCIA_Notebooks/blob/main/TCIA_DataCite_Queries.ipynb. 

Example Queries


Retrieve a single DataCite record in JSON format

For this example we are using a Published Collection called "Pseudo-PHI-DICOM-Data":

https://api.datacite.org/dois/10.7937/s17z-r072
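The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names doi_url and fetch_record are hypothetical, and the example assumes the endpoint returns a JSON body as the URL above does in a browser.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.datacite.org"


def doi_url(doi):
    """Build the REST URL for one DOI, percent-encoding any unusual characters."""
    return f"{API_ROOT}/dois/{urllib.parse.quote(doi, safe='/')}"


def fetch_record(doi, timeout=30):
    """Download and decode the JSON record for a DOI (performs a network request)."""
    with urllib.request.urlopen(doi_url(doi), timeout=timeout) as response:
        return json.load(response)


# Example usage (requires network access):
# record = fetch_record("10.7937/s17z-r072")
# print(json.dumps(record, indent=2)[:500])
```

Keeping URL construction separate from the download makes it easy to check the URL before any request is sent.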


Return a list of DOIs using the TCIA provider id (tciar)

https://api.datacite.org/dois?provider-id=tciar

By default, only 25 records are returned. You can control the number of records returned using pagination options. For example, to return only 5 records:

https://api.datacite.org/dois?provider-id=tciar&page[size]=5 

or

https://api.datacite.org/providers/tciar/dois?page[size]=5
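To walk through every page of results rather than a single page, a script can follow the "next" link that the API is assumed to include in each response. The sketch below relies on that assumption, on the page[number] parameter, and on each entry in "data" carrying the DOI in its "id" field; verify these against an actual response. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.datacite.org"


def list_url(provider_id="tciar", page_size=25, page_number=1):
    """Build a /dois listing URL with explicit pagination parameters."""
    params = {
        "provider-id": provider_id,
        "page[size]": page_size,
        "page[number]": page_number,
    }
    # safe="[]" keeps the brackets literal, matching the URLs shown above.
    return f"{API_ROOT}/dois?{urllib.parse.urlencode(params, safe='[]')}"


def fetch_json(url, timeout=30):
    """Download and decode one page of results (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def iter_dois(first_url, fetch=fetch_json):
    """Yield DOI strings page by page, following links.next until it is absent."""
    url = first_url
    while url:
        page = fetch(url)
        for item in page.get("data", []):
            yield item.get("id")
        # Assumption: the last page omits "next" (or sets it to null).
        url = (page.get("links") or {}).get("next")


# Example usage (requires network access):
# for doi in iter_dois(list_url(page_size=100)):
#     print(doi)
```

The fetch argument exists so the paging logic can be exercised with canned pages instead of live requests.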


Query on specific information populated in the DataCite schema

For instance, return the records published by The Cancer Imaging Archive in 2016:

https://api.datacite.org/dois?query=publisher:%22The%20Cancer%20Imaging%20Archive%22+publicationYear:2016

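or, filtering on the year the DOI record was created in DataCite rather than on the publication year (the two can differ):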

https://api.datacite.org/providers/tciar/dois?created=2016
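Field queries like the first one above are easier to build in code than to percent-encode by hand. The sketch below is one way to do so. The function name query_url is hypothetical, and the encoded output differs cosmetically from the hand-written URL (for example, colons are percent-encoded) while decoding to the same query string.

```python
import urllib.parse

API_ROOT = "https://api.datacite.org"


def query_url(publisher, year):
    """Build a /dois search URL matching a publisher name and a publication year."""
    # The publisher is wrapped in double quotes so it is searched as a phrase.
    # This sketch does not escape double quotes inside the publisher name.
    query = f'publisher:"{publisher}" publicationYear:{year}'
    return f"{API_ROOT}/dois?{urllib.parse.urlencode({'query': query})}"


url = query_url("The Cancer Imaging Archive", 2016)
```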


Use the "activities" endpoint to see metadata updates in JSON format for a specified DataCite record

For this example we are using a Published Collection called "Pseudo-PHI-DICOM-Data":

https://api.datacite.org/dois/10.7937/s17z-r072/activities
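A small sketch for exploring the activities response from Python follows. It assumes the response has a top-level "data" list whose entries each have an "attributes" object. Because the exact attribute names are not documented here, the helper simply reports which names are present rather than relying on any particular one. All function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.datacite.org"


def activities_url(doi):
    """Build the activities URL for one DOI."""
    return f"{API_ROOT}/dois/{urllib.parse.quote(doi, safe='/')}/activities"


def fetch_activities(doi, timeout=30):
    """Download and decode the activity log for a DOI (performs a network request)."""
    with urllib.request.urlopen(activities_url(doi), timeout=timeout) as response:
        return json.load(response)


def activity_fields(payload):
    """Return the sorted attribute names that appear across all activity entries."""
    fields = set()
    for entry in payload.get("data", []):
        fields.update((entry.get("attributes") or {}).keys())
    return sorted(fields)


# Example usage (requires network access):
# payload = fetch_activities("10.7937/s17z-r072")
# print(len(payload.get("data", [])), "activities with fields", activity_fields(payload))
```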