The Cancer Imaging Archive (TCIA) staff has accumulated a wealth of knowledge on best practices and procedures for DICOM image de-identification in the process of maintaining our archive. In order to share this information with the wider research community we are maintaining the following knowledge base. This is a living document and will continue to be updated as we learn from our experiences. If you have feedback or questions please contact us at firstname.lastname@example.org.
DICOM Private Data Elements
It is desirable to retain DICOM private data elements that contain parameters describing the acquisition while removing elements containing PHI. Performing this task requires understanding the mechanism defined by DICOM to support private elements. DICOM PS 3.5, section 7.8.1 states:
It is possible that multiple implementors may define Private Elements with the same (odd) group number. To avoid conflicts, Private Elements shall be assigned Private Data Element Tags according to the following rules.
a) Private Creator Data Elements numbered (gggg,0010-00FF) (gggg is odd) shall be used to reserve a block of Elements with Group Number gggg for use by an individual implementor. The implementor shall insert an identification code in the first unused (unassigned) Element in this series to reserve a block of Private Elements. The VR of the private identification code shall be LO (Long String) and the VM shall be equal to 1.
b) Private Creator Data Element (gggg,0010), is a Type 1 Data Element that identifies the implementor reserving element (gggg,1000-10FF), Private Creator Data Element (gggg,0011) identifies the implementor reserving elements (gggg,1100-11FF), and so on, until Private Creator Data Element (gggg,00FF) identifies the implementor reserving elements (gggg,FF00-FFFF).
c) Encoders of Private Data Elements shall be able to dynamically assign private data to any available (unreserved) block(s) within the Private group, and specify this assignment through the blocks corresponding Private Creator Data Element(s). Decoders of Private Data shall be able to accept reserved blocks with a given Private Creator identification code at any position within the Private group specified by the blocks corresponding Private Creator Data Element.
We will use data in group 0009 as a practical example. The table below shows an example of data that could be included in group 0009.
Private Creator Element
Density Standard Deviation
In the example, the element with tag (0009, 0010) is a private creator element with value "ACME". That reserves a block of elements for this manufacturer. The element (0009, 1001) is part of that block; the 10 in the element tag (1001) corresponds to the 10 that is in the tag of the Private Creator Element (0009, 0010).
This only becomes complex when different manufacturers want to use the same reserved block to store information. When this occurs in a single image, the creator of the image reserves a block (for example, 0010). When a second application wants to add data to that same group, it detects the block written by the creator and creates a separate block (for example, 0011). The creator is not required to start at block 0010, but that appears to be common practice. The second or third application is not required to use 0011 or 0012. Based on this encoding scheme, some observations are:
- If a collection of images are produced by equipment from different manufacturers, you may have collisions in the sets of private elements you want to retain and discard. For example, element (0009, 1001) from manufacturer A may contain an important physical parameter while that same element from manufacturer B may contain PHI.
- If the collection has images that are created by an acquisition modality and are then modified by another application (PACS, workstation), a private group may have multiple reserved blocks. Also, one cannot assume that the original creator will have always chosen reserved block 0010.
TCIA De-identification Work Flow