Child pages
  • H&E and immunohistochemical stain images of 209 cases of diffuse large B-cell lymphoma linked with cytogenetic features and clinical outcomes (DLBCL-Morphology)

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Summary

Excerpt

Image RemovedDiffuse Large B-Cell Lymphoma (DLBCL) is the most common non-Hodgkin lymphoma worldwide. DLBCL is fatal without treatment, but early detection and therapy can cure up to 70% of patients. The current best prognostic classification, the National Comprehensive Cancer Network International Prognostic Index, is insufficient to guide therapeutic decision-making for individual patients. No tumor-intrinsic prognostication method is currently available. 

Image Added

The DLBCL-Morph dataset contains 42 digital high-magnification scans of tissue microarrays (TMAs) containing tissue cores from 209 DLBCL cases at Stanford Hospital. Each DLBCL case is accompanied by survival data, follow-up status and a wide variety of clinical and cytogenetic variables. The TMAs are stained for H&E, which shows cell morphology, as well as for the expression of several prognostically relevant proteins: CD10, BCL6, MUM1, BCL2, and MYC. The TMAs are accompanied by pathologist-annotated regions of interest (ROIs) that specify areas of tissue representative of DLBCL. We used deep learning to segment out cancerous nuclei from the ROIs, and computed several geometric features for each cancerous nucleus, which are provided as part of our dataset. These geometric features quantify several morphologic properties of a nucleus, such as size and elongation, and can be used as input for automated prognostic models to predict survival. In addition, DLBCL-Morph contains 204 digital high-magnification whole-slide images (WSIs) from 149 DLBCL cases, stained for H&E.

A total of 152,194 patches (240x240 pixels each) were extracted from the H&E stained ROIs and a HoVer-Net model was used to segment tumor nuclei (1,035,909 binary masks).  Geometric descriptors were computed for each segmented nucleus and a Cox proportional hazards model was evaluated using A) only clinical features, B) only morphologic features, or C) both sets of features. The Cox model achieved a concordance index of A) 0.703 (p = 0.005) B) 0.645 (p = 0.07), and C) 0.723 (p < 0.001) on a randomly sampled validation set of 51 patients. Our findings suggest that a risk calculator based on both clinical and morphologic data could yield improved prognostic value for DLBCL without the need for additional diagnostic testing.

Several studies have thus far failed to conclusively demonstrate that morphologic classification can predict outcomes in DLBCL. Automated medical imaging methods on whole slide images (WSI) could potentially identify novel, prognostically significant morphological or immunohistochemical biomarkers. The ability of automated methods to identify prognostically relevant features on H&E sections that have eluded pathologists has been demonstrated (Beck et al Science Translational Med). Furthermore, if successful, automated image analysis could potentially be scaled up into a cost-effective alternative to current classification methods which are typically costly and/or labor intensive. A critical requirement for the development of such deep learning models is the availability of datasets containing WSIs appropriately stained to show cell morphology and oncogene expression, with accompanying prognostic outcome data.

...