Detailed Description | |
---|
Modalities | CT (computed tomography), DX (digital radiography), CR (computed radiography), SEG (DICOM segmentations)* |
Number of Patients | 1010* |
Number of Studies | 1308* |
Number of Series | 1398* |
Number of Images | 244,617* |
Image Size (GB) | 125* |
*2/24/2020 Maintenance notes: Corrected table entries only (no additional data). Added SEG to Modalities (previously present but not listed). Corrected number of subjects from 1006 to 1010, number of studies from 1296 to 1308, number of series from 1296 to 1398, number of images from 243,185 to 244,617, and image size from 124 GB to 125 GB. The corrections are to the table information only.

Reader Annotation and Markup

These links help describe how to use the .XML annotation files which are packaged along with the images in The Cancer Imaging Archive. The option to include annotation files in the download is enabled by default, so the XML described here will be included when downloading the LIDC-IDRI images unless you specifically uncheck this option. If you are only interested in the XML files or you have already downloaded the images you can obtain them here:

The following documentation explains the format and other relevant information about the XML annotation and markup files:

Annotation and Markup Issues/Comments

- For a subset of approximately 100 cases from among the initial 399 cases released, inconsistent rating systems were used among the 5 sites with regard to the spiculation and lobulation characteristics of lesions identified as nodules > 3 mm. The XML nodule characteristics data for some cases is affected by this error. We apologize for any inconvenience.
- Also note that the XML files do not store radiologist annotations in a manner that allows for a comparison of individual radiologist reads across cases (i.e., the first reader recorded in the XML file of one CT scan will not necessarily be the same radiologist as the first reader recorded in the XML file of another CT scan).
- March 2010: Contrary to previous documentation, the correct ordering for the subjective nodule lobulation and nodule spiculation rating scales stored in the XML files is 1=none to 5=marked. The issue of consistency noted above still remains to be corrected.
- On 2012-03-21 the XML associated with patient LIDC-IDRI-0101 was updated with a corrected version of the file.
- As of May 2018, errors exist in two XML files, 044.xml and 191.xml, where one reader recorded a nodule as a "nodule >= 3 mm" but neglected to assign ratings for the nodule characteristics. On June 28, 2018 the files were updated with an explanation at the point of the error in the XML files.
- Subject LIDC-IDRI-0396 (139.xml) had an incorrect SOP Instance UID for position 1420. This was fixed on June 28, 2018.
- Subject LIDC-IDRI-0510 has an assigned value of 5 for the internalStructure attribute in 187/255.xml. There is no 5th category for internalStructure so this should be considered invalid.
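
The issues listed above (missing characteristic ratings, an out-of-range internalStructure value) are the kind of thing a small validation pass can flag before analysis. The following is a minimal sketch, not the archive's own tooling. It assumes a simplified, hypothetical XML layout (`<nodule>` elements with a `<characteristics>` child); the real LIDC files use their own element names and should be checked against the format documentation. The only ranges taken from the notes above are 1 to 5 for lobulation and spiculation, and "no 5th category" for internalStructure, which is read here as 1 to 4.

```python
import xml.etree.ElementTree as ET

# Assumed valid ranges. Grounded in the notes above: lobulation and
# spiculation run from 1 (none) to 5 (marked); internalStructure has
# no 5th category, so 1-4 is assumed here.
VALID_RANGES = {
    "lobulation": (1, 5),
    "spiculation": (1, 5),
    "internalStructure": (1, 4),
}


def find_invalid_ratings(xml_text):
    """Return (nodule_id, characteristic, problem) tuples for bad ratings."""
    root = ET.fromstring(xml_text)
    problems = []
    # Hypothetical layout: <nodule id="..."><characteristics>...</characteristics></nodule>
    for nodule in root.iter("nodule"):
        nodule_id = nodule.get("id")
        chars = nodule.find("characteristics")
        for name, (low, high) in VALID_RANGES.items():
            elem = chars.find(name) if chars is not None else None
            if elem is None or not (elem.text or "").strip():
                problems.append((nodule_id, name, "missing"))
                continue
            value = int(elem.text)
            if not low <= value <= high:
                problems.append((nodule_id, name, f"out of range: {value}"))
    return problems


# Illustrative sample only; not taken from any LIDC-IDRI file.
SAMPLE = """
<reads>
  <nodule id="A">
    <characteristics>
      <internalStructure>5</internalStructure>
      <lobulation>1</lobulation>
      <spiculation>3</spiculation>
    </characteristics>
  </nodule>
  <nodule id="B"/>
</reads>
"""

for problem in find_invalid_ratings(SAMPLE):
    print(problem)
```

In this sample, nodule A is flagged for its internalStructure value and nodule B for having no ratings at all, mirroring the two kinds of error described in the list above.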

Nodule-Specific Details

Diagnosis Data

For a limited set of cases, LIDC sites were able to identify diagnostic data associated with the case.

- tcia-diagnosis-data-2012-04-20.xls
- Note: This project has concluded and we are not able to obtain any additional diagnosis data beyond what is available in the above link.

Data was collected for as many cases as possible and is associated at two levels:
- Diagnosis at the patient level (diagnosis is associated with the patient)
- Diagnosis at the nodule level (where possible)

At each level, data was provided as to whether the nodule was:
- Unknown (no data is available)
- Benign or non-malignant disease
- A malignancy that is a primary lung cancer
- A metastatic lesion that is associated with an extra-thoracic primary malignancy

For each lesion, there is also information provided as to how the diagnosis was established, including options such as:
- unknown (not clear how diagnosis was established)
- review of radiological images to show 2 years of stable nodule
- biopsy
- surgical resection
- progression or response
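
The two-level diagnosis scheme described above can be modelled with a pair of enumerations and a small record type. This is a sketch under stated assumptions: the class and field names are hypothetical, and no numeric codes or column names from the spreadsheet are assumed, since they are not given here. Only the category labels come from the lists above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Diagnosis(Enum):
    """Diagnosis categories, as listed in the text above."""
    UNKNOWN = "unknown (no data is available)"
    BENIGN = "benign or non-malignant disease"
    PRIMARY_LUNG_CANCER = "malignancy that is a primary lung cancer"
    METASTATIC = "metastatic lesion associated with an extra-thoracic primary malignancy"


class DiagnosisMethod(Enum):
    """How the diagnosis was established, as listed in the text above."""
    UNKNOWN = "unknown"
    STABLE_ON_IMAGING = "review of radiological images to show 2 years of stable nodule"
    BIOPSY = "biopsy"
    SURGICAL_RESECTION = "surgical resection"
    PROGRESSION_OR_RESPONSE = "progression or response"


@dataclass(frozen=True)
class NoduleDiagnosis:
    """Nodule-level diagnosis (hypothetical record layout)."""
    diagnosis: Diagnosis
    method: DiagnosisMethod


@dataclass
class PatientDiagnosis:
    """Patient-level diagnosis plus any nodule-level entries."""
    patient_id: str
    diagnosis: Diagnosis
    method: DiagnosisMethod
    nodules: List[NoduleDiagnosis] = field(default_factory=list)


def is_malignant(diagnosis):
    """True for either malignant category, primary or metastatic."""
    return diagnosis in (Diagnosis.PRIMARY_LUNG_CANCER, Diagnosis.METASTATIC)


# Made-up example record; the patient ID is a placeholder, not a real case.
example = PatientDiagnosis(
    patient_id="EXAMPLE-0001",
    diagnosis=Diagnosis.PRIMARY_LUNG_CANCER,
    method=DiagnosisMethod.BIOPSY,
    nodules=[NoduleDiagnosis(Diagnosis.UNKNOWN, DiagnosisMethod.UNKNOWN)],
)
print(is_malignant(example.diagnosis))
```

Keeping the patient-level and nodule-level entries separate reflects the note that nodule-level diagnosis is only available "where possible".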

Software

pylidc

pylidc is an object-relational mapping (using SQLAlchemy) for the data provided in the LIDC dataset. Some of the capabilities of pylidc include query of LIDC annotations in SQL-like fashion, conversion of the nodule segmentation contours into voxel labels, and visualization of segmentations as image overlays. If you find this tool useful in your research please cite the following paper:
Matthew C. Hancock, Jerry F. Magnan. Lung nodule malignancy classification using only radiologist quantified image features as inputs to statistical learning algorithms: probing the Lung Image Database Consortium dataset with two statistical learning methods. SPIE Journal of Medical Imaging. Dec. 2016. http://dx.doi.org/10.1117/1.JMI.3.4.044504 |
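
To illustrate what "conversion of the nodule segmentation contours into voxel labels" involves, here is a minimal pure-Python sketch for a single 2D slice. It is not pylidc's implementation and does not use its API; the function names are hypothetical. It assumes the contour is a closed polygon in pixel coordinates and labels a pixel when its center falls inside the polygon (an even-odd ray casting test), which is one of several reasonable conventions.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test for a point against a closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contour_to_mask(polygon, width, height):
    """Label each pixel whose center lies inside the contour with 1."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# A made-up square contour on a 6x6 slice.
square = [(1, 1), (4, 1), (4, 4), (1, 4)]
mask = contour_to_mask(square, 6, 6)
for row in mask:
    print("".join(str(value) for value in row))
```

Stacking such per-slice masks along the slice axis would give a voxel label volume; how boundary pixels are treated is a design choice that varies between tools.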
MAX

MAX ("multi-purpose application for XML") performs nodule matching and pmap generation based on the XML files provided with the LIDC/IDRI Database. It also performs certain QA and QC tasks and other XML-related tasks. MAX is written in Perl and was developed under RedHat Linux. It has been run under Windows.

Downloading MAX and its associated files implies acceptance of the following notice (also available here and in the distro as a text file):

DISCLAIMER: MAX is not guaranteed to process all input correctly. Possible errors include (but are not limited to) the inability to process correctly some types of nodule ambiguity (where nodule ambiguity refers to overlap between nodule markings having complicated shapes or to overlap between a nodule marking and a non-nodule mark).

Download the distro (max-V107.tgz); view/download ReadMe.txt (a text file that is also included in the distro).

LIDC 2 Image Toolbox (Matlab)

This tool is a community contribution developed by Thomas Lampert. It is designed for extracting individual annotations from the XML files and converting them, and the DICOM images, into TIF format for easier processing in Matlab (LIDC-IDRI dataset). It is available for download from: https://sites.google.com/site/tomalampert/code. |
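
As a rough illustration of the pmap generation attributed to MAX above, the sketch below builds a per-pixel agreement map from several readers' binary masks. It rests on two assumptions that the text does not confirm: that "pmap" means probability map, and that each pixel's value is the fraction of readers who included it. It is written in Python for illustration and does not reproduce MAX's Perl code or its nodule matching.

```python
def probability_map(reader_masks):
    """Return, per pixel, the fraction of readers whose mask includes it.

    reader_masks is a list of equally sized 2D lists of 0/1 values,
    one per reader (assumed input format).
    """
    if not reader_masks:
        raise ValueError("at least one reader mask is required")
    n_readers = len(reader_masks)
    height = len(reader_masks[0])
    width = len(reader_masks[0][0])
    return [
        [sum(mask[row][col] for mask in reader_masks) / n_readers
         for col in range(width)]
        for row in range(height)
    ]


# Four made-up reader masks for the same 2x2 region.
masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[1, 0], [1, 0]],
]
pmap = probability_map(masks)
for row in pmap:
    print(row)
```

Because reader identity is not consistent across cases, a map like this summarizes agreement within one scan only.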