Child pages
  • A morphological dataset of white blood cells from patients with four different genetic AML entities and non-malignant controls (AML-Cytomorphology_MLL_Helmholtz)

Summary

This dataset comprises four prevalent AML subtypes with defining genetic abnormalities and typical morphological features according to the WHO 2022 classification: (i) APL with PML::RARA fusion, (ii) AML with NPM1 mutation, (iii) AML with CBFB::MYH11 fusion (without NPM1 mutation), and (iv) AML with RUNX1::RUNX1T1 fusion, as well as a control group of healthy stem cell donors. 


A total of 189 peripheral blood smears from the Munich Leukemia Laboratory (MLL) database from the years 2009 to 2020 were digitized. First, all blood smears were scanned at 10x magnification and an overview image was created. Using the MetaSystems Metafer platform, cells were detected automatically using a segmentation threshold and logarithmic color transformation. The quality of each region within the blood smear was also assessed automatically.

Per patient, 99-500 white blood cells were then scanned at 40x magnification via oil immersion microscopy in .TIF format, corresponding to 24.9 μm x 24.9 μm (144x144 pixels). For this, a CMOS color camera from MetaSystems with a resolution of 4096x3000 px and a pixel size of 3.45 μm x 3.45 μm was used. Four pixels were binned into one, leading to a binned pixel size of 6.9 μm x 6.9 μm and, at 40x magnification, an effective resolution of 6.9 μm / 40 (1 px = 0.1725 μm).
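The pixel-size arithmetic above can be checked with a short sketch. It assumes that "four pixels binned into one" means 2x2 binning, which is consistent with the stated 6.9 μm binned pixel; the constant and function names are illustrative only. With these figures, 144 pixels span 24.84 μm, close to the stated image edge length.

```python
# Sketch of the pixel-size arithmetic described in the text.
# Assumption: "four pixels binned into one" means 2x2 binning.
SENSOR_PIXEL_UM = 3.45   # physical camera pixel edge length, from the text
BINNING_FACTOR = 2       # assumed 2x2 binning (four pixels merged into one)
MAGNIFICATION = 40       # objective magnification, from the text
TILE_EDGE_PX = 144       # edge length of one cell image in pixels


def effective_pixel_size_um(sensor_pixel_um, binning_factor, magnification):
    """Return the edge length of one image pixel in the sample plane, in um."""
    binned_pixel_um = sensor_pixel_um * binning_factor
    return binned_pixel_um / magnification


pixel_um = effective_pixel_size_um(SENSOR_PIXEL_UM, BINNING_FACTOR, MAGNIFICATION)
tile_edge_um = pixel_um * TILE_EDGE_PX

print(f"effective pixel size: {pixel_um:.4f} um")
print(f"image edge length:    {tile_edge_um:.2f} um")
```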

Additional information about patient age, sex, and blood counts is provided in a separate .csv file.
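As a rough illustration of how such a metadata file might be read, the sketch below parses a small in-memory CSV with Python's standard `csv` module. The column names (`patient_id`, `age`, `sex`, `wbc_count`) and all values are hypothetical placeholders, since the actual file layout is not described here; adapt them to the real header row.

```python
import csv
import io

# Hypothetical file contents: the real column names and values are not
# given in the text, so everything below is made up for illustration.
SAMPLE_CSV = """patient_id,age,sex,wbc_count
P001,54,f,12.3
P002,61,m,48.9
P003,37,f,6.1
"""


def load_patient_metadata(handle):
    """Read rows into a dict keyed by the (hypothetical) patient_id column."""
    reader = csv.DictReader(handle)
    return {row["patient_id"]: row for row in reader}


# For a real file, pass open(path, newline="") instead of io.StringIO(...).
metadata = load_patient_metadata(io.StringIO(SAMPLE_CSV))
ages = [int(row["age"]) for row in metadata.values()]

print("patients:", len(metadata))
print("age range:", min(ages), "to", max(ages))
```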


To our knowledge, this dataset covers the morphological complexity of acute myeloid leukemia in peripheral blood smears in unprecedented quality and quantity.

With four types of AML with defining genetic abnormalities plus healthy controls, this dataset goes beyond existing datasets and brings the scientific community one step closer to real-world hematology. We believe that our data can help scientists all over the world to develop new models and to combine the data with other data sources to improve AML diagnostics.

 

Acknowledgements

  • All samples were collected, diagnosed and scanned at the Munich Leukemia Laboratory (MLL). Carsten Marr has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 866411). Matthias Hehr acknowledges support from Deutsche José Carreras-Leukämie Stiftung.

...