TCIA provides standards-based curation support to ensure safe and thorough de-identification of all images in the archive, per federal HIPAA and HITECH regulations. To achieve this compliance without stripping the data of its scientific utility, TCIA staff perform a redundant, thorough de-identification and analysis procedure based on guidance provided by the industry experts in the DICOM Standards Committee's Working Group 18. After initial testing, TCIA image curators individually inspect every image, in both the DICOM tags and the image pixels, to ensure it contains no protected health information (PHI). Changes to the de-identification procedure are made as appropriate to correct any potential issues found by our curation team. After the image submissions are complete, the curation team again inspects every image in the full data set to ensure regulatory compliance.

Each collection submitted for publication is analyzed and de-identified as a whole using the steps listed below. All steps are completed before the collection is released for publication.
1. Each image in the collection is visually inspected to confirm that no PHI is burned into the pixel data.
2. TagSniffer is used to review the collection and produce an Element Inventory, annotated with data from the DICOM Basic Application Confidentiality Profile and our set of Modality Software Profiles. The result is a list of the DICOM elements found in the collection, each annotated in one of the following ways:
   - One of the Basic Application Confidentiality Profile codes, indicating the DICOM scheme for de-identification (if the element is listed by DICOM)
   - A simple code from our Modality Software Profile (No PHI: Retain, PHI: Delete, Not Sure: Review)
   - No code, indicating the element is not registered
3. The Pre-Identification output of TagSniffer is also generated. It contains the set of elements in the collection and all values that need to be reviewed for PHI. If the Basic Application Confidentiality Profile or the applicable Modality Software Profile indicates that an attribute is to be cleaned, or that it is a physical parameter that does not contain PHI, the element does not need to be reviewed at this step, because our de-identification script will process it properly.
4. We combine the information from steps 2 and 3 to create a CTP de-identification script for the collection. If the collection includes scanners from different manufacturers, we may create and apply different scripts based on manufacturer.
5. The CTP de-identification script (or scripts) is applied to the image collection, producing a separate copy of the images. The original set is retained in case a step needs to be repeated.
6. TagSniffer is used to review the de-identified images and create the Final Review Output. This more complete output is reviewed by analysts to confirm that no PHI is carried forward after de-identification. Both public and private elements are included in the output for review.
7. If any de-identification errors are detected in step 6, the CTP script is adjusted and the image set is processed again starting at step 5.
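The triage implied by the Element Inventory annotations and the Pre-Identification review can be illustrated with a small sketch. This is not TagSniffer or CTP code. The `ElementAnnotation` class, the `needs_manual_review` and `triage` functions, and the sample inventory are all hypothetical names and data introduced here for illustration. The sketch also assumes that a "Review" code from a Modality Software Profile takes precedence over any DICOM profile code, which the procedure above does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ElementAnnotation:
    """One row of a hypothetical Element Inventory (illustrative only)."""
    tag: str                              # DICOM tag, e.g. "(0010,0010)"
    name: str                             # element keyword or description
    dicom_code: Optional[str] = None      # Basic Application Confidentiality Profile code, if listed
    modality_code: Optional[str] = None   # "Retain", "Delete", or "Review", if profiled


def needs_manual_review(element: ElementAnnotation) -> bool:
    """Return True when an analyst should inspect the element's values for PHI."""
    # Assumption: an explicit "Review" always wins over a DICOM profile code.
    if element.modality_code == "Review":
        return True
    # No code from either source means the element is not registered.
    return element.dicom_code is None and element.modality_code is None


def triage(
    inventory: List[ElementAnnotation],
) -> Tuple[List[ElementAnnotation], List[ElementAnnotation]]:
    """Split the inventory into (handled by the script, needs manual review)."""
    handled = [e for e in inventory if not needs_manual_review(e)]
    review = [e for e in inventory if needs_manual_review(e)]
    return handled, review


# Hypothetical sample inventory; the codes are chosen for illustration.
inventory = [
    ElementAnnotation("(0010,0010)", "PatientName", dicom_code="Z"),
    ElementAnnotation("(0018,0050)", "SliceThickness", modality_code="Retain"),
    ElementAnnotation("(0008,103E)", "SeriesDescription",
                      dicom_code="X", modality_code="Review"),
    ElementAnnotation("(0009,1001)", "UnregisteredPrivateElement"),
]

handled, review = triage(inventory)
for element in review:
    print(element.tag, element.name)
```

With this sample data, the series description (explicitly marked "Review") and the unregistered private element are routed to manual review, while the other two elements are left to the de-identification script.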
Only after this inspection is complete are the images made available to the general public. For general information on what to expect as an image provider, please see our web site at http://www.cancerimagingarchive.net/provider.html.