Summary

This collection provides public access to a 3D pathology dataset of prostate cancer, allowing researchers to further investigate various 3D tissue structures and their correlation with prostate cancer patient outcomes (biochemical recurrence). These 3D tissue structures are revealed through: (1) an H&E-analog stain, (2) synthetically generated immunofluorescence staining of CK8 (targeting the luminal epithelial cells of all prostate glands), and (3) 3D segmentation masks of the gland lumen, epithelium, and stromal regions of prostate biopsies. This data collection will promote research in the field of computational 3D pathology for clinical decision support.

In this TCIA collection, we provide the fused OTLS images (H&E-analog staining) at 2X downsampled resolution, the synthetic cytokeratin-8 (CK8) immunofluorescence images at 2X downsampled resolution, the 3D semantic segmentation masks of glands at 4X downsampled resolution, the clinical data for patient outcomes (biochemical recurrence), and the coordinates of the cancer-enriched regions of each biopsy. All datasets are from the 50 patient cases studied in this publication: [W. Xie et al., Cancer Research, 2022]. The Python code for the deep-learning models, and for 3D glandular segmentations based on synthetic-CK8 datasets, is available on GitHub at https://github.com/WeisiX/ITAS3D.

Note that the 3D pathology datasets provided in this collection were generated in Dr. Jonathan Liu’s lab at the University of Washington with a custom open-top light-sheet (OTLS) microscope developed by the lab [A.K. Glaser et al., Nature Communications, 2019]. There is no clinical metadata within the imaging files, and all patients are referred to with coded identifiers. All of the clinical outcomes data provided in this collection have already been published within the supplement of [W. Xie et al., Cancer Research, 2022].

Acknowledgements

We would like to acknowledge the individuals and institutions that have provided data for this collection:

Additional publications relating to the data



Data Access

Data Type | Download all or Query/Filter | License
Tissue Slide Images (HDF5, TIFF, XML; 3.8 TB) | (Download and apply the IBM-Aspera-Connect plugin to your browser to retrieve this faspex package)
Clinical data (CSV, 64 KB)




Additional Resources for this Dataset

The following external resources have been made available by the data submitters. These are not hosted or supported by TCIA, but may be useful to researchers using this collection.

  • The Python source code for the deep-learning models, and for 3D glandular segmentations based on synthetic-CK8 datasets, is available on GitHub at https://github.com/WeisiX/ITAS3D.


Detailed Description

Image Statistics

Pathology Image Statistics

Modalities: Pathology
Number of Patients: 50
Number of Images: 118
Images Size (TB): 3.8

Our 3D imaging method is entirely slide-free and non-destructive. We image intact tissue specimens (prostate biopsies) that have been labeled with a fluorescent analog of H&E staining and optically cleared to make them transparent to light. The imaging is performed with a custom open-top light-sheet (OTLS) microscope developed in-house [A.K. Glaser et al., Nature Communications, 2019].

Our OTLS microscope uses an sCMOS camera to collect images of optically cleared tissues slice by slice as we translate the specimens through an illumination “light sheet.” The sCMOS camera generates raw 16-bit grayscale TIFF files that are assembled into volumetric imaging “tiles” within the RAM of the acquisition computer. These imaging tiles are like volumetric bricks of data, as shown in Fig. 1G of [A.K. Glaser et al., Nature Communications, 2019]. Each tile is then saved to the hard drive in the form of a multi-resolution HDF5 file. As adjacent imaging tiles are scanned within the specimen, they are appended to the same HDF5 file on the hard drive. The HDF5 file is similar to a 3D TIFF stack, except that it contains multiple downsampled versions of the dataset so that users can quickly visualize or access the 3D data at whatever resolution they need. The HDF5 file is also “chunked” so that smaller volumetric regions of interest can be quickly accessed. The HDF5 file has an associated XML file that contains microscope metadata (e.g., stage coordinates for each tile and acquisition parameters).
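To illustrate why chunking speeds up access to a region of interest, the sketch below counts how many chunks a reader would have to touch for a given sub-volume. It is a minimal, standard-library-only illustration: the function name chunks_for_roi, the chunk shape, the volume shape, and the ROI coordinates are all hypothetical and are not taken from the files in this collection.

```python
from itertools import product


def chunks_for_roi(roi_start, roi_stop, chunk_shape):
    """Return the (z, y, x) grid indices of every chunk that overlaps an ROI.

    roi_start is inclusive and roi_stop is exclusive, both in voxels.
    """
    ranges = []
    for start, stop, size in zip(roi_start, roi_stop, chunk_shape):
        if stop <= start:
            raise ValueError("roi_stop must exceed roi_start on every axis")
        first = start // size        # chunk holding the first ROI voxel
        last = (stop - 1) // size    # chunk holding the last ROI voxel
        ranges.append(range(first, last + 1))
    return list(product(*ranges))


# Hypothetical numbers, chosen only for illustration.
chunk_shape = (64, 64, 64)
volume_shape = (512, 2048, 2048)
touched = chunks_for_roi((100, 500, 500), (164, 628, 628), chunk_shape)

total = 1
for dim, size in zip(volume_shape, chunk_shape):
    total *= -(-dim // size)  # ceiling division: number of chunks on this axis

print(f"{len(touched)} of {total} chunks must be read")
```

With these made-up numbers, only 18 of 8192 chunks overlap the ROI, which is the reason a chunked layout lets a small region be read without loading the whole volume.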

Before performing computational analysis of the imaging data, we “fuse” the 3D imaging data contained in the raw HDF5 file.  This is done with an open-source program called “BigStitcher” that finely aligns all of the volumetric imaging tiles, shears the dataset, and blends the edges of those imaging tiles such that a “seamless” 3D image is generated.  In this project, we also downsample the imaging data by a factor of 2X in all three dimensions during this fusion process.  The fused dataset is saved as an uncompressed HDF5 file and is used as an input for our downstream ITAS3D segmentation pipeline [W. Xie et al., Cancer Research, 2022]. 
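As a rough guide to what 2X downsampling in all three dimensions means for data volume, the sketch below computes the resulting shape and the reduction in voxel count. The example shape is hypothetical, and rounding odd dimensions up is an assumption made here for illustration; the rounding that BigStitcher applies is not described in this text.

```python
def downsampled_shape(shape, factor):
    """Shape after downsampling every axis by an integer factor.

    Odd sizes are rounded up (an assumption made for this sketch).
    """
    return tuple(-(-dim // factor) for dim in shape)


def voxel_count(shape):
    """Total number of voxels in a volume of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


original = (1000, 4000, 8000)  # hypothetical (z, y, x) size in voxels
fused = downsampled_shape(original, 2)
reduction = voxel_count(original) / voxel_count(fused)

print(fused)      # (500, 2000, 4000)
print(reduction)  # 8.0
```

Halving each of the three axes leaves one eighth of the voxels, so a 2X downsampled volume is roughly 8 times smaller than the original at the same bit depth.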

The fused HDF5 datasets (2X downsampled compared to the original data) are the least-processed form of the imaging data provided in this TCIA collection. These are two-channel grayscale volumetric datasets in which one channel is the fluorescent nuclear stain (TO-PRO-3) and the second channel is the eosin stain. The TCIA collection also contains synthetic CK8 immunofluorescence datasets (2X downsampled compared to the original data) that are generated through a deep-learning image-translation model [W. Xie et al., Cancer Research, 2022]. The CK8 stain labels the luminal epithelial cells of all prostate glands (benign or cancerous). Finally, we provide the segmentation masks of the lumen, epithelial, and stromal compartments of each biopsy (4X downsampled compared to the original data).
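Because the images are provided at 2X downsampling and the masks at 4X, voxel indices have to be rescaled before a mask is overlaid on an image. The sketch below shows one way to do this. It assumes that both grids share the same origin and axis order and differ only by an integer scale factor, which is not confirmed by this text; the function names are hypothetical. The resolution used for the cancer-enriched region coordinates is likewise not stated here, so check it before applying a conversion like this one.

```python
def rescale_coords(coords, from_factor, to_factor):
    """Map a voxel index between two downsampling levels of one volume.

    Factors are relative to the original data (2 means 2X downsampled).
    Assumes aligned origins and pure integer scaling.
    """
    return tuple((c * from_factor) // to_factor for c in coords)


def rescale_box(start, stop, from_factor, to_factor):
    """Map a half-open box [start, stop), rounding outward so that the
    result still covers the whole original region."""
    new_start = tuple((s * from_factor) // to_factor for s in start)
    new_stop = tuple(-(-(e * from_factor) // to_factor) for e in stop)
    return new_start, new_stop


# A voxel in a 4X mask mapped to the 2X image grid.
print(rescale_coords((10, 20, 30), 4, 2))  # (20, 40, 60)

# A box in the 2X image grid mapped to the 4X mask grid.
print(rescale_box((101, 200, 301), (150, 260, 400), 2, 4))
# ((50, 100, 150), (75, 130, 200))
```

Rounding the box outward avoids dropping voxels at the edges when moving to the coarser grid, at the cost of including up to one extra voxel per side.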



Citations & Data Usage Policy

Xie, W., Reder, N. P., Koyuncu, C. F., Leo, P., Hawley, S., Huang, H., Mao, C., Postupna, N., Kang, S., Serafin, R., Gao, G., Han, Q., Bishop, K., Barner, L., Fu, P., Wright, J., Keene, C., Vaughan, J., Janowczyk, A., … Liu, J. (2023). 3D pathology of prostate biopsies with biochemical recurrence outcomes: raw H&E-analog datasets and image translation-assisted segmentation in 3D (ITAS3D) datasets (PCa_Bx_3Dpathology) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/44MA-GX21


Xie, W., Reder, N. P., Koyuncu, C., Leo, P., Hawley, S., Huang, H., Mao, C., Postupna, N., Kang, S., Serafin, R., Gao, G., Han, Q., Bishop, K. W., Barner, L. A., Fu, P., Wright, J. L., Keene, C. D., Vaughan, J. C., Janowczyk, A., … Liu, J. T. C. (2021). Prostate Cancer Risk Stratification via Nondestructive 3D Pathology with Deep Learning–Assisted Gland Analysis. In Cancer Research (Vol. 82, Issue 2, pp. 334–345). American Association for Cancer Research (AACR). https://doi.org/10.1158/0008-5472.can-21-2843


Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. In Journal of Digital Imaging (Vol. 26, Issue 6, pp. 1045–1057). Springer Science and Business Media LLC. https://doi.org/10.1007/s10278-013-9622-7

Additional Publication Resources:

The collection authors suggest that the following publication gives context to this dataset:

  • Glaser, A. K., Reder, N. P., Chen, Y., Yin, C., Wei, L., Kang, S., Barner, L. A., Xie, W., McCarty, E. F., Mao, C., Halpern, A. R., Stoltzfus, C. R., Daniels, J. S., Gerner, M. Y., Nicovich, P. R., Vaughan, J. C., True, L. D., & Liu, J. T. C. (2019). Multi-immersion open-top light-sheet microscope for high-throughput imaging of cleared tissues. In Nature Communications (Vol. 10, Issue 1). Springer Science and Business Media LLC. https://doi.org/10.1038/s41467-019-10534-0

Other Publications Using This Data

TCIA maintains a list of publications that leverage our data. If you have a publication you'd like to add, please contact TCIA's Helpdesk.


Version 1 (Current): Updated 2023/03/07
