...

Many collections on TCIA contain "labels" which can be used for training and testing artificial intelligence models.  However, users who are not familiar with medical imaging may benefit from assistance learning how to identify and utilize data types suitable for common deep learning tasks such as image classification, object detection, and object segmentation.  This page summarizes key information about finding image labels that may not be obvious to researchers who do not have a background in radiology or histopathology.  It also provides links to third party resources which could be useful to data scientists who are new to working with medical images.

Object Segmentation

Many collections on TCIA contain some kind of "ground truth" labels which provide information about object location and boundaries in the images.  In radiology images these are typically created by one or more radiologists who hand-draw boundaries around objects such as tumors or organs on each image.  These kinds of data can be shared using a few different file formats.

  1. DICOM supports these kinds of data through the SEG (Segmentation) and RTSTRUCT (RT Structure Set) modalities.
  2. Many popular open source tools export these labels in other formats, most commonly NIfTI, NRRD, and MHA.
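
As a rough illustration of how these formats might be told apart when sorting downloaded files, the sketch below maps file extensions and DICOM modality values to a label format.  The function name `label_format` and the extension table are assumptions made for illustration and are not part of any TCIA tool.  Reading the files themselves would require a third party library, which is not shown here.

```python
from pathlib import Path
from typing import Optional

# Extensions commonly used by the non-DICOM label formats named above.
# This table is illustrative, not exhaustive.
EXTENSION_FORMATS = {
    ".nii": "NIfTI",
    ".nii.gz": "NIfTI",
    ".nrrd": "NRRD",
    ".mha": "MHA",
}

# DICOM modality values that indicate segmentation or contour objects.
DICOM_LABEL_MODALITIES = {"SEG", "RTSTRUCT"}


def label_format(filename: str, modality: Optional[str] = None) -> Optional[str]:
    """Return a label-format name, or None if the file does not look like a label.

    `modality` is the DICOM Modality value, if known (hypothetical parameter).
    """
    if modality is not None and modality.upper() in DICOM_LABEL_MODALITIES:
        return "DICOM " + modality.upper()
    name = Path(filename).name.lower()
    # Check longer suffixes first so ".nii.gz" is matched before shorter ones.
    for ext in sorted(EXTENSION_FORMATS, key=len, reverse=True):
        if name.endswith(ext):
            return EXTENSION_FORMATS[ext]
    return None
```

For example, `label_format("tumor_mask.nii.gz")` returns `"NIfTI"`, while a CT image file with no label-related modality returns `None`.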

On TCIA you can find these data in two ways.

  1. For Collections datasets, look for SEG or RTSTRUCT in the modality column to determine where DICOM segmentations or contours are available.  You can also filter for "Image Analyses" in the supporting data column.  If a collection lists "Image Analyses" but does not include SEG or RTSTRUCT in the modality column, this is typically because the analysis is in some other format.  This could be segmentation data in NIfTI/NRRD/MHA formats, but it might also be some other kind of analysis such as image classification.
  2. For Analysis Results of existing TCIA collections it is more straightforward.  Use the filter above the table to search for "segmentations", which will match any occurrence of the term in the Analysis Artifacts column.
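
The Collections lookup described in the first item can be sketched as a simple decision rule.  The rows below are made-up stand-ins for the Collections table (the collection names, the `CollectionRow` type, and the returned category strings are all hypothetical), so this is only a minimal illustration of the logic and not a query against TCIA itself.

```python
from dataclasses import dataclass
from typing import List

# DICOM modality values that indicate segmentations or contours.
DICOM_LABEL_MODALITIES = {"SEG", "RTSTRUCT"}


@dataclass
class CollectionRow:
    """One row of a Collections-style table (hypothetical structure)."""
    name: str
    modalities: List[str]
    supporting_data: List[str]


def classify_collection(row: CollectionRow) -> str:
    """Apply the rule of thumb: check the modality column, then supporting data."""
    if DICOM_LABEL_MODALITIES & set(row.modalities):
        return "dicom-labels"      # SEG or RTSTRUCT is listed
    if "Image Analyses" in row.supporting_data:
        return "other-analyses"    # e.g. NIfTI/NRRD/MHA masks or classifications
    return "no-labels-listed"


# Made-up example rows; these are not real TCIA collections.
rows = [
    CollectionRow("EXAMPLE-A", ["CT", "SEG"], ["Clinical"]),
    CollectionRow("EXAMPLE-B", ["MR"], ["Image Analyses"]),
    CollectionRow("EXAMPLE-C", ["CT", "RTSTRUCT"], ["Image Analyses"]),
    CollectionRow("EXAMPLE-D", ["PT"], ["Genomics"]),
]

results = {row.name: classify_collection(row) for row in rows}
```

A collection that falls into the "other-analyses" group still needs to be inspected by hand, since its analysis data may or may not be segmentations.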

Image Classification

TCIA includes a wealth of non-image data which could be used to derive labels for image classification.
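
One way such non-image data might be turned into classification labels is to map each patient identifier to a class taken from a clinical spreadsheet.  The sketch below assumes a CSV with `PatientID` and `Histology` columns; both column names and all rows are hypothetical, and real clinical files will differ from collection to collection.

```python
import csv
import io
from typing import Dict, TextIO

# Made-up clinical data standing in for a downloaded spreadsheet.
clinical_csv = io.StringIO(
    "PatientID,Histology\n"
    "EX-001,adenocarcinoma\n"
    "EX-002,squamous cell carcinoma\n"
    "EX-003,adenocarcinoma\n"
)


def load_labels(handle: TextIO, id_column: str, label_column: str) -> Dict[str, str]:
    """Map each patient identifier to the value in the chosen label column."""
    return {row[id_column]: row[label_column] for row in csv.DictReader(handle)}


labels = load_labels(clinical_csv, "PatientID", "Histology")

# Assign a stable integer index to each class name for use as a training target.
class_names = sorted(set(labels.values()))
class_index = {name: i for i, name in enumerate(class_names)}
targets = {patient_id: class_index[label] for patient_id, label in labels.items()}
```

The resulting `targets` mapping can then be joined to images through the patient identifier stored with each image.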

...

  1. https://www.youtube.com/watch?v=-XUKq3B4sdw - how a radiologist interprets lung CTs
  2. https://www.kaggle.com/gzuidhof/full-preprocessing-tutorial - how to pre-process images for deep learning
  3. https://theaisummer.com/medical-image-coordinates/ - DICOM deep learning for medical imaging novices
  4. https://developer.nvidia.com/clara-medical-imaging - NVIDIA package for simplifying deep learning tasks in medical imaging
  5. https://forums.fast.ai/t/fastai-v2-has-a-medical-imaging-submodule/56117 - FastAI package for simplifying deep learning in medical imaging
  6. "TCIA as a Centralized Data Resource for Development of AI" from RSNA 2019
  7. https://www.kaggle.com/marcovasquez/basic-eda-data-visualization - RSNA intracranial hemorrhage guide
  8. https://github.com/RSNA/AI-Deep-Learning-Lab - RSNA 2019 deep learning course
  9. https://github.com/RSNA/MagiciansCorner - Notebooks, datasets, other content for the Radiology:AI series known as Magicians Corner by Brad Erickson