The Lung Image Database Consortium (LIDC-IDRI) research project generates marked-up, annotated lesions on the diagnostic and lung cancer screening thoracic CT scans in the LIDC-IDRI image collection. The goal is a web-accessible international resource for the development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis.
The following paper published in Medical Physics is effectively the authoritative user's manual for the database and should be cited in all manuscripts that make use of the database:
Armato SG III, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, Zhao B, Aberle DR, Henschke CI, Hoffman EA, Kazerooni EA, MacMahon H, van Beek EJR, Yankelevitz D, et al.: The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38: 915–931, 2011.
Additionally, please include the following attribution:
The authors acknowledge the National Cancer Institute and the Foundation for the National Institutes of Health and their critical role in the creation of the free publicly available LIDC/IDRI Database used in this study.
These links describe how to use the XML annotation files that are packaged with the images in The Cancer Imaging Archive. Annotation files are included in downloads by default, so the XML described here will accompany the LIDC-IDRI images unless you specifically uncheck this option. If you are interested only in the XML files, or you have already downloaded the images, you can obtain the XML files here:
The following documentation explains the format and other relevant information about the XML annotation and markup files:
Annotation and Markup Issues/Comments
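As a rough illustration of working with the annotation files, the sketch below reads nodule outlines from a small inline XML sample using only the Python standard library. The element names (`readingSession`, `unblindedReadNodule`, `noduleID`, `roi`, `imageZposition`, `edgeMap`, `xCoord`, `yCoord`) and the `http://www.nih.gov` namespace are assumptions about the schema, and the sample values are invented for illustration; check them against the format documentation listed above before relying on them. The function name `read_nodule_outlines` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the annotation files; verify against the real XML.
NS = {"lidc": "http://www.nih.gov"}

# Simplified, made-up sample: two reading sessions, one outlined nodule each.
SAMPLE_XML = """<LidcReadMessage xmlns="http://www.nih.gov">
  <readingSession>
    <unblindedReadNodule>
      <noduleID>Nodule 001</noduleID>
      <roi>
        <imageZposition>-125.0</imageZposition>
        <edgeMap><xCoord>312</xCoord><yCoord>355</yCoord></edgeMap>
        <edgeMap><xCoord>313</xCoord><yCoord>355</yCoord></edgeMap>
        <edgeMap><xCoord>313</xCoord><yCoord>356</yCoord></edgeMap>
      </roi>
    </unblindedReadNodule>
  </readingSession>
  <readingSession>
    <unblindedReadNodule>
      <noduleID>IL057_127364</noduleID>
      <roi>
        <imageZposition>-125.0</imageZposition>
        <edgeMap><xCoord>311</xCoord><yCoord>355</yCoord></edgeMap>
        <edgeMap><xCoord>312</xCoord><yCoord>356</yCoord></edgeMap>
      </roi>
    </unblindedReadNodule>
  </readingSession>
</LidcReadMessage>"""


def read_nodule_outlines(xml_text):
    """Return one dict per ROI: reader index, nodule ID, z position, edge points."""
    root = ET.fromstring(xml_text)
    outlines = []
    sessions = root.findall("lidc:readingSession", NS)
    for reader_index, session in enumerate(sessions):
        for nodule in session.findall("lidc:unblindedReadNodule", NS):
            nodule_id = nodule.findtext("lidc:noduleID", default="", namespaces=NS)
            for roi in nodule.findall("lidc:roi", NS):
                z = float(roi.findtext("lidc:imageZposition", namespaces=NS))
                # Each edgeMap element is assumed to hold one (x, y) pixel
                # on the outline drawn by the reader.
                points = [
                    (
                        int(edge.findtext("lidc:xCoord", namespaces=NS)),
                        int(edge.findtext("lidc:yCoord", namespaces=NS)),
                    )
                    for edge in roi.findall("lidc:edgeMap", NS)
                ]
                outlines.append(
                    {"reader": reader_index, "nodule_id": nodule_id,
                     "z": z, "points": points}
                )
    return outlines


outlines = read_nodule_outlines(SAMPLE_XML)
```

Each reading session is treated here as one reader's pass over the scan, so the `reader` index is simply the session's position in the file rather than an identifier taken from the data.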
This link provides a list of available cases and the associated size of each identified nodule.
For a limited set of cases, LIDC sites were able to identify diagnostic data associated with the case.
Data was collected for as many cases as possible and is associated at two levels:
At each level, data was provided as to whether the nodule was:
For each lesion, information is also provided on how the diagnosis was established, including options such as:
MAX ("multi-purpose application for XML") performs nodule matching and pmap generation based on the XML files provided with the LIDC/IDRI Database. It also performs certain QA and QC tasks and other XML-related tasks.
MAX is written in Perl and was developed under RedHat Linux. It has also been run under Windows.
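To give a feel for what pmap generation involves, the Python sketch below assumes that "pmap" refers to a probability map: for each pixel, the fraction of readers whose marking includes that pixel. This is an illustrative assumption, not MAX's actual implementation (MAX is a Perl tool that works from the XML files), and the function name `probability_map` and the mask values are hypothetical.

```python
def probability_map(reader_masks):
    """Per-pixel fraction of readers that marked the pixel as nodule.

    reader_masks: list of 2D lists (one per reader) holding 0 or 1,
    all with the same number of rows and columns.
    """
    if not reader_masks:
        raise ValueError("at least one reader mask is required")

    n_readers = len(reader_masks)
    n_rows = len(reader_masks[0])
    n_cols = len(reader_masks[0][0])

    # Every reader's mask must cover the same pixel grid.
    for mask in reader_masks:
        if len(mask) != n_rows or any(len(row) != n_cols for row in mask):
            raise ValueError("all reader masks must have the same shape")

    return [
        [
            sum(mask[r][c] for mask in reader_masks) / n_readers
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Hypothetical 2x3 region marked by four readers.
masks = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
    [[0, 1, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0]],
]
pmap = probability_map(masks)
```

A pixel marked by every reader gets a value of 1.0 and a pixel marked by no reader gets 0.0, so thresholding the map at a chosen agreement level gives a consensus region.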
Downloading MAX and its associated files implies acceptance of the following notice (also available here and in the distro as a text file):
DISCLAIMER: MAX is not guaranteed to process all input correctly. Possible errors include (but are not limited to) the inability to process correctly some types of nodule ambiguity (where nodule ambiguity refers to overlap between nodule markings having complicated shapes or to overlap between a nodule marking and a non-nodule mark).
Download the distro (max-V107.tgz); view/download ReadMe.txt (a text file that is also included in the distro).
This tool is a community contribution developed by Thomas Lampert. It extracts individual annotations from the LIDC-IDRI XML files and converts them, along with the DICOM images, into TIF format for easier processing in MATLAB. It is available for download from: https://sites.google.com/site/tomalampert/code.