
Introduction

This page describes details of the DICOM Tag Sniffer, both theory and operation.

The Tag Sniffer extracts and stores DICOM attributes in a local database and generates several reports for review by the imaging data owner and the sponsoring organization for the purpose of determining appropriate de-identification of DICOM images. We will describe first what is recorded locally and then describe what is exported for further analysis.

  1. DICOM Study Level information describes the imaging study for a participant on a particular date. This is intended to be used only by the owner of the imaging data and never exported. It is used by the owner to confirm that the proper imaging studies have been collected. That is, all studies for the selected participants are present and no studies from other participants are inadvertently included.
  2. DICOM Series Level information contains information about the equipment (modality, manufacturer, model name, software version) as well as some parameters that describe the acquisition.
  3. Private tags are the list of private attributes that are found in the collection of images. The Tag Sniffer records the attributes and their value representations and correlates these with the imaging equipment.  The values of the attributes are not recorded.
  4. Text tags are a set of public attributes that are extracted from the imaging set. The list of attributes corresponds to the Clean Descriptors Option in DICOM Supplement 142. Both the attribute tags and their values are recorded.
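
As a rough illustration of item 3, the sketch below records private attributes without their values. It is not the Tag Sniffer's actual code: it assumes each scanned element is already available as a plain tuple, and the names `Element`, `Equipment`, and `record_private_tags` are hypothetical. It relies on the DICOM convention that private attributes have an odd group number, and it ignores the few reserved odd groups for simplicity.

```python
from typing import NamedTuple


class Element(NamedTuple):
    group: int
    element: int
    vr: str
    value: object  # present on input, deliberately never stored


class Equipment(NamedTuple):
    modality: str
    manufacturer: str
    model_name: str
    software_version: str


def is_private(group: int) -> bool:
    # Odd group numbers mark private attributes (simplified check).
    return group % 2 == 1


def record_private_tags(elements, equipment, store):
    """Add (tag, VR) pairs for private elements to store[equipment]."""
    seen = store.setdefault(equipment, set())
    for e in elements:
        if is_private(e.group):
            # Only the tag and VR are kept; the value is dropped.
            seen.add((f"{e.group:04X} {e.element:04X}", e.vr))
    return store


# Made-up equipment and elements, for illustration only.
scanner = Equipment("CT", "ExampleVendor", "ExampleModel", "1.0")
elements = [
    Element(0x0008, 0x103E, "LO", "CHEST"),
    Element(0x0029, 0x1010, "OB", b"\x00\x01"),
]
store = record_private_tags(elements, scanner, {})
```

Keying the store by equipment is one simple way to correlate private attributes with the scanner that produced them.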

Exclusions

A product of the scanning process is the file exclusions.txt. The first line of this file is the root folder that was used to start the scan process. Each remaining line names a file that the Tag Sniffer software could not open or that would not be included for submission to the Cancer Imaging Archive. Examples include:

  • JPEG thumbnails that might have been created as part of some other process.
  • DICOM Ultrasound images (likely to have PHI burned into pixel data).
  • DICOM Secondary Capture images (likely to have PHI burned into pixel data).

The documented procedure is that the owner of the data should review this file. Any entries listing files should be resolved before application of the CTP process to export data. That is, we do not wish to have CTP process the Ultrasound and Secondary Capture images, nor do we want CTP to have to open and ultimately reject JPEG or other files.
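
The layout described above (root folder on the first line, excluded files on the remaining lines) can be checked with a few lines of Python. This is a minimal sketch, assuming one path per line and that blank lines can be ignored; the function name `parse_exclusions` and the sample paths are hypothetical.

```python
def parse_exclusions(text: str):
    """Split exclusions.txt content into (root_folder, excluded_files)."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln]  # drop blank lines
    if not lines:
        raise ValueError("exclusions file is empty; expected a root folder line")
    return lines[0], lines[1:]


# Made-up content standing in for a real exclusions.txt.
sample = (
    "/data/scan_root\n"
    "/data/scan_root/thumb.jpg\n"
    "/data/scan_root/us/img001.dcm\n"
)
root, excluded = parse_exclusions(sample)

# The data set is ready for CTP only once no excluded files remain listed.
ready_for_ctp = not excluded
```

A file containing only the root folder line means nothing needs to be resolved before running CTP.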

Use and Export of Data

  1. The file study_dump.txt contains imaging Study Level information for the images scanned by the Tag Sniffer. It likely contains PHI and is intended to be viewed only by the owner of the imaging data. Do NOT export this file.
  2. Two files are generated with Series Level information: 1) full_series_dump.txt, and 2) full_series_with_desc_dump.txt. The file full_series_dump.txt contains modality information and acquisition parameters, but no text fields that might contain PHI. The file full_series_with_desc_dump.txt contains the same information as full_series_dump.txt but adds (0008 103E) Series Description. You can export full_series_dump.txt as it will be free of PHI. The Cancer Imaging Archive would prefer the data in full_series_with_desc_dump.txt because of the extra information, but the owner needs to check it for PHI first.
  3. text_tags_dump.txt is an extract of the attributes listed below and unique values for each attribute. If a phrase or term is repeated in a DICOM attribute, that phrase or term is recorded only once. This file is NOT intended for export. The owner of the imaging data should review the file and determine which attributes contain PHI and which do not. That information can then be forwarded to the staff who will determine how to properly de-identify the data. That is, the owner of the data would create a report that indicates which attributes contain PHI and which do not (Yes, No). No actual values are to be exported.
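
The handling of text tags in item 3 can be sketched in two steps: collect each distinct value once per attribute for local review, then produce a Yes/No report that carries no values. This is an illustrative sketch only; the function names, the in-memory data layout, and the sample values are assumptions, not the Tag Sniffer's actual implementation or file format.

```python
from collections import defaultdict


def collect_unique_values(observations):
    """Map each attribute tag to its distinct values, in first-seen order."""
    unique = defaultdict(list)
    for tag, value in observations:
        if value not in unique[tag]:
            unique[tag].append(value)
    return dict(unique)


def phi_report(unique_values, tags_with_phi):
    """Build the exportable report: tag -> 'Yes' or 'No', with no values."""
    return {
        tag: ("Yes" if tag in tags_with_phi else "No")
        for tag in sorted(unique_values)
    }


# Made-up observations: (tag, value) pairs as they might be seen during a scan.
observations = [
    ("0008 103E", "AXIAL T1"),
    ("0008 103E", "AXIAL T1"),  # repeated phrase, recorded only once
    ("0010 4000", "example comment"),
]
unique_values = collect_unique_values(observations)

# The owner reviews unique_values locally and decides which tags hold PHI.
report = phi_report(unique_values, tags_with_phi={"0010 4000"})
```

Only `report` would be forwarded; `unique_values` stays with the owner of the data.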

Date / Time Elements

The Tag Sniffer records the Study Date for consistency checks by the owner of the imaging data. This date is stored with the study information and is not intended to be exported.

Supplement 142 lists many elements of type Date, Time, or DateTime under the Clean Descriptors Option. The Tag Sniffer does not extract those values, on the assumption that neither the owner of the data nor the sponsor of the project needs to see them to decide whether to include or exclude those elements.
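
One way to express this rule is to filter on the DICOM value representation: DA (Date), TM (Time), and DT (DateTime). The sketch below assumes elements arrive as (tag, VR, value) tuples; the function name and sample data are hypothetical, and the real software may select elements differently.

```python
# DICOM value representations for Date, Time, and DateTime.
DATE_TIME_VRS = frozenset({"DA", "TM", "DT"})


def values_to_extract(elements):
    """Keep (tag, value) pairs whose VR is not a date/time type."""
    return [(tag, value) for tag, vr, value in elements if vr not in DATE_TIME_VRS]


# Made-up elements for illustration.
elements = [
    ("0008 103E", "LO", "AXIAL T1"),
    ("0008 0021", "DA", "20200101"),  # Series Date, skipped
    ("0008 0031", "TM", "101500"),    # Series Time, skipped
]
kept = values_to_extract(elements)
```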

Attributes Recorded

This section contains the list of DICOM attributes that are recorded by the DICOM Tag Sniffer. 

Text Attributes

This table is a list of the public attributes that are extracted by the Tag Sniffer for examination by the owner of the data. All of the attributes below are text based. Sequence items are discussed in the next section.

Attribute Name                              Tag
Acquisition Comments                        0018 4000
Acquisition Device Processing Description   0018 1400
Acquisition Protocol Description            0018 9424
Additional Patient's History                0010 21B0
Admitting Diagnoses Description             0008 1080
Allergies                                   0010 2110
Comments on Performed Procedure Step        0040 0280
Contrast Bolus Agent                        0018 0010
Contribution Description                    0018 A003
Derivation Description                      0008 2111
Discharge Diagnosis Description             0038 0040
Frame Comments                              0020 9158
Identifying Comments                        0008 4000
Imaging Service Request Comments            0040 2400
Impressions                                 4008 0300
Interpretation Diagnosis Description        4008 0115
Interpretation Text                         4008 010B
Medical Alerts                              0010 2000
Occupation                                  0010 2180
Patient Comments                            0010 4000
Patient State                               0038 0500
Performed Procedure Step Description        0040 0254
Pre-Medication                              0040 0012
Protocol Name                               0018 1030
Reason for Imaging Service Request          0040 2001
Reason for Study                            0032 1030
Requested Contrast Agent                    0032 1070
Requested Procedure Comments                0040 1400
Requested Procedure Description             0032 1060
Results Comments                            4008 4000
Scheduled Procedure Step Description        0040 0007
Series Description                          0008 103E
Service Episode Description                 0038 0062
Special Needs                               0038 0050
Study Comments                              0032 4000
Study Description                           0008 1030
Timezone Offset From UTC                    0008 0201
Visit Comments                              0038 4000
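
The tags in this table are written as two four-digit hexadecimal numbers, group then element. A small helper for converting between that notation and integer pairs might look like the following sketch; `parse_tag` and `format_tag` are hypothetical names, not part of the Tag Sniffer.

```python
def parse_tag(text: str):
    """Convert 'gggg eeee' (hexadecimal) into a (group, element) integer pair."""
    parts = text.split()
    if len(parts) != 2 or any(len(p) != 4 for p in parts):
        raise ValueError(f"expected 'gggg eeee', got {text!r}")
    group, element = (int(p, 16) for p in parts)
    return group, element


def format_tag(group: int, element: int) -> str:
    """Convert a (group, element) pair back into 'gggg eeee' notation."""
    return f"{group:04X} {element:04X}"


series_description = parse_tag("0008 103E")
round_trip = format_tag(*parse_tag("0010 21B0"))
```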

Sequence Attributes

Supplement 142 lists a set of attributes that are sequences. These are NOT yet extracted by the Tag Sniffer software.

Attribute Name                              Tag
Acquisition Context Sequence                0040 0555
Admitting Diagnoses Code Sequence           0008 1084
Content Sequence                            0040 A730
Graphic Annotation Sequence                 0070 0001
Request Attributes Sequence                 0040 0275

