Data from The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans (LIDC-IDRI)

Additional information about using this data, as well as some collection metadata, can be obtained in the Supporting Documentation below or on the Lung Image Database Consortium research page.

Data Access

Collection Statistics

Modalities: CT (computed tomography), DX (digital radiography), CR (computed radiography)
Number of Patients: 1,010
Number of Studies: 1,308
Number of Series: 1,018 CT; 290 CR/DX
Number of Images: 244,527

You can view and download these images on the Cancer Imaging Archive by selecting the LIDC-IDRI collection. If you are unsure how to download this collection, see our quick guide on Searching by Collection, or refer to The Cancer Imaging Archive User's Guide for more detailed instructions on using the site.

Supporting Documentation

More information about the Cancer Imaging Program's Program Announcement for LIDC can be found at: http://imaging.cancer.gov/programsandresources/InformationSystems/LIDC

Reader Annotation and Markup

These links describe how to use the XML annotation files that are packaged with the images in the Cancer Imaging Archive. The option to include annotation files in the download is enabled by default, so the XML described here will be included when downloading the LIDC-IDRI images unless you specifically uncheck this option.

Annotation and Markup Issues/Comments

  1. For a subset of approximately 100 cases from among the initial 399 cases released, the 5 sites used inconsistent rating systems for the spiculation and lobulation characteristics of lesions identified as nodules > 3 mm. The XML nodule characteristics data for some cases is affected by this error. We apologize for any inconvenience.
  2. Also note that the XML files do not store radiologist annotations in a manner that allows for a comparison of individual radiologist reads across cases (i.e., the first reader recorded in the XML file of one CT scan will not necessarily be the same radiologist as the first reader recorded in the XML file of another CT scan).
  3. March 2010: Contrary to previous documentation, the correct ordering for the subjective nodule lobulation and nodule spiculation rating scales stored in the XML files is 1=none to 5=marked. The consistency problem noted in item 1 remains to be corrected.
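
As a minimal sketch of how the notes above affect code that reads these files, the Python below collects lobulation and spiculation ratings per reading session from a single XML document. The element names (LidcReadMessage, readingSession, unblindedReadNodule, noduleID, characteristics, lobulation, spiculation), the namespace, and the sample document are assumptions made for illustration and should be checked against the actual annotation files; the function read_ratings is hypothetical. Following note 2, the reader index is only meaningful within one file, and following note 3, ratings are checked against the 1=none to 5=marked scale.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Real LIDC-IDRI files may use different
# element names, nesting, or namespaces; verify before relying on this.
SAMPLE_XML = """<LidcReadMessage xmlns="http://example.org/hypothetical">
  <readingSession>
    <unblindedReadNodule>
      <noduleID>Nodule 001</noduleID>
      <characteristics>
        <lobulation>1</lobulation>
        <spiculation>5</spiculation>
      </characteristics>
    </unblindedReadNodule>
  </readingSession>
  <readingSession>
    <unblindedReadNodule>
      <noduleID>7</noduleID>
      <characteristics>
        <lobulation>2</lobulation>
        <spiculation>4</spiculation>
      </characteristics>
    </unblindedReadNodule>
  </readingSession>
</LidcReadMessage>"""


def local_name(tag):
    """Drop a namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def find_children(element, name):
    """Return direct children whose namespace-free tag equals name."""
    return [child for child in element if local_name(child.tag) == name]


def read_ratings(xml_text):
    """Return one dict per rated nodule mark in a single XML document.

    reader_index is the position of the reading session within THIS file
    only; it must not be used to link readers across different scans.
    """
    root = ET.fromstring(xml_text)
    rows = []
    sessions = find_children(root, "readingSession")
    for reader_index, session in enumerate(sessions):
        for nodule in find_children(session, "unblindedReadNodule"):
            ids = find_children(nodule, "noduleID")
            chars = find_children(nodule, "characteristics")
            if not ids or not chars:
                continue  # skip marks that carry no characteristics
            row = {"reader_index": reader_index, "nodule_id": ids[0].text}
            for name in ("lobulation", "spiculation"):
                values = find_children(chars[0], name)
                rating = int(values[0].text) if values else None
                # Scale per note 3: 1 = none ... 5 = marked.
                if rating is not None and not 1 <= rating <= 5:
                    raise ValueError(f"{name} rating out of range: {rating}")
                row[name] = rating
            rows.append(row)
    return rows


rows = read_ratings(SAMPLE_XML)
```

Because of note 1, ratings from the affected subset of cases may still be inconsistent even when they pass the range check, so a check like this only guards against malformed values.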

Nodule Size List

This link provides a list of available cases and the associated size of each identified nodule.

Diagnosis Data

For a limited set of cases, LIDC sites were able to identify diagnostic data associated with the case. Data was collected for as many cases as possible and is associated at two levels:

  1. Diagnosis at the patient level (diagnosis is associated with the patient)
  2. Diagnosis at the nodule level (where possible)

At each level, data was provided as to whether the nodule was:

  1. Unknown (no data is available)
  2. Benign or non-malignant disease
  3. A malignancy that is a primary lung cancer
  4. A metastatic lesion that is associated with an extra-thoracic primary malignancy

For each lesion, there is also information provided as to how the diagnosis was established including options such as:

  1. unknown - not clear how diagnosis was established
  2. review of radiological images to show 2 years of stable nodule
  3. biopsy
  4. surgical resection
  5. progression or response
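
The categories above could be modeled as follows. This is only a sketch: the class and field names are hypothetical, the enum values are descriptive strings rather than the codes used in the diagnosis spreadsheet (which are not given here), and attaching a method to patient-level records is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Diagnosis(Enum):
    # String values are labels for illustration, not spreadsheet codes.
    UNKNOWN = "unknown"
    BENIGN = "benign or non-malignant disease"
    PRIMARY_LUNG_CANCER = "malignancy, primary lung cancer"
    METASTATIC = "metastatic lesion, extra-thoracic primary malignancy"


class DiagnosisMethod(Enum):
    UNKNOWN = "unknown"
    STABLE_TWO_YEARS = "radiological review showing 2 years of stable nodule"
    BIOPSY = "biopsy"
    SURGICAL_RESECTION = "surgical resection"
    PROGRESSION_OR_RESPONSE = "progression or response"


@dataclass(frozen=True)
class DiagnosisRecord:
    patient_id: str
    diagnosis: Diagnosis
    method: DiagnosisMethod
    # None means the record applies to the patient as a whole.
    nodule_number: Optional[int] = None

    @property
    def level(self) -> str:
        return "patient" if self.nodule_number is None else "nodule"

    @property
    def is_malignant(self) -> bool:
        return self.diagnosis in (
            Diagnosis.PRIMARY_LUNG_CANCER,
            Diagnosis.METASTATIC,
        )


# Hypothetical records, not taken from the collection.
patient_record = DiagnosisRecord(
    "hypothetical-0001", Diagnosis.UNKNOWN, DiagnosisMethod.UNKNOWN
)
nodule_record = DiagnosisRecord(
    "hypothetical-0001",
    Diagnosis.METASTATIC,
    DiagnosisMethod.BIOPSY,
    nodule_number=1,
)
```

Keeping "unknown" as an explicit member, rather than a missing value, mirrors the lists above, where the absence of data is itself a recorded category.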

Note: This data has not yet been updated to match the new patient ID structure for the LIDC-IDRI data set.  We hope to have an updated spreadsheet available soon.

AIM Annotation Conversion Project

As part of an effort to move towards standard formats for annotation and markup, a project was undertaken to convert XML data from the LIDC Pilot project into Annotation and Image Markup (AIM) format. AIM is a standard that was developed out of the caBIG program. More information about this effort can be found on the NCI CBIIT wiki: LIDC Conversion to AIM.

We hope to be able to provide the entire LIDC-IDRI set of markup in AIM format at some point in the future, along with a release of the ClearCanvas open source workstation, which can view these markups. However, at this time the project has been placed on hold. We will update this page if/when the status of this project changes.

Software

MAX

MAX ("multi-purpose application for XML") performs nodule matching and pmap generation based on the XML files provided with the LIDC/IDRI Database. It also performs certain QA and QC tasks and other XML-related tasks.

MAX is written in Perl and was developed under RedHat Linux. It has been run under Windows.

Downloading MAX and its associated files implies acceptance of the following notice (also available here and in the distro as a text file):


DISCLAIMER: MAX is not guaranteed to process all input correctly. Possible errors include (but are not limited to) the inability to process correctly some types of nodule ambiguity (where nodule ambiguity refers to overlap between nodule markings having complicated shapes or to overlap between a nodule marking and a non-nodule mark).

Download the distro (max-V107.tgz); view/download ReadMe.txt (a text file that is also included in the distro).

Lung Image Database Consortium Research

This collection contains a great deal of supporting documentation that was generated by members of the LIDC. It can be found on the Lung Image Database Consortium research page.