The detection of breast cancer metastases to lymph nodes is of great prognostic value for patient treatment. Using machine learning to detect breast cancer metastases in lymph nodes can make pathologists' diagnoses more efficient and help ensure that patients are accurately staged for subsequent treatment. This dataset allows for the objective comparison of algorithms that detect breast cancer metastases.
The dataset consists of 130 de-identified whole slide images (WSIs) of H&E-stained axillary lymph node specimens from 78 patients. Metastatic breast carcinoma is present in 36 of the WSIs, from 27 patients. No patient or slide inclusion/exclusion criteria were applied. The slides were scanned at Memorial Sloan Kettering Cancer Center (MSKCC) with Leica Aperio AT2 scanners at 20x equivalent magnification (0.5 microns per pixel). Each slide comes with a class label, either positive or negative for breast carcinoma, which was obtained from the pathology report of the respective case.
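The stated resolution of 0.5 microns per pixel lets you convert pixel measurements into physical distances. The following is a minimal sketch of that arithmetic; the function names and the 224-pixel tile size in the comment are illustrative assumptions, not part of the dataset.

```python
# Resolution from the dataset description: 20x equivalent, 0.5 microns per pixel.
MICRONS_PER_PIXEL = 0.5


def pixels_to_microns(pixels, mpp=MICRONS_PER_PIXEL):
    """Convert a length in pixels to microns at the given resolution."""
    return pixels * mpp


def pixels_to_mm(pixels, mpp=MICRONS_PER_PIXEL):
    """Convert a length in pixels to millimetres at the given resolution."""
    return pixels_to_microns(pixels, mpp) / 1000.0


# Example: a hypothetical 224-pixel tile edge spans 224 * 0.5 = 112 microns.
```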
Please note that Box has a 15GB download limit, so you will need to download images in batches.
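Since the images total 53 GB, a 15 GB limit implies at least four batches. One way to plan them is sketched below, assuming the limit applies to the combined size of each batch and that you have a mapping of file names to sizes; the function name `plan_batches` and the file names are hypothetical, and the sketch only plans the grouping without downloading anything.

```python
def plan_batches(file_sizes_gb, limit_gb=15.0):
    """Group files into batches whose total size stays within limit_gb.

    file_sizes_gb maps file name -> size in GB (hypothetical input).
    Files are taken largest first; a new batch starts whenever the next
    file would push the current batch over the limit.
    """
    batches, current, current_size = [], [], 0.0
    ordered = sorted(file_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        if size > limit_gb:
            raise ValueError(f"{name} alone exceeds the {limit_gb} GB limit")
        if current and current_size + size > limit_gb:
            batches.append(current)
            current, current_size = [], 0.0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

This simple grouping does not guarantee the minimum number of batches, but every batch it returns stays within the limit.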
|Number of Patients||78|
|Number of Images||130|
|Images Size (GB)||53|
Explanation of target.csv files
target.csv contains a binary label for each slide image in the dataset.
- target=1 means that the image contains breast cancer metastases.
- target=0 means that the image does not contain breast cancer metastases.
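As a quick sanity check, the labels can be tallied with the standard library. This is a minimal sketch that assumes the file has a header row with a column named `target`; any other column, such as a slide identifier, is not specified in this description and is ignored here.

```python
import csv
from collections import Counter


def count_labels(csv_lines, label_column="target"):
    """Count slides per binary label in a target.csv-style file.

    csv_lines is any iterable of text lines, such as an open file.
    The column name "target" is an assumption about the header row.
    """
    counts = Counter()
    for row in csv.DictReader(csv_lines):
        label = int(row[label_column])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        counts[label] += 1
    return counts


# Usage, assuming target.csv is in the working directory:
#     with open("target.csv", newline="") as f:
#         counts = count_labels(f)
```

If the file matches the description above, the tally should show 36 slides with target=1 and 94 with target=0.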
Citations & Data Usage Policy
These collections are freely available to browse, download, and use for commercial, scientific and educational purposes as outlined in the Creative Commons Attribution 3.0 Unported License. Questions may be directed to email@example.com. Please be sure to acknowledge both this data set and TCIA in publications by including the following citations in your work:
- Campanella, G., Hanna, M. G., Brogi, E., & Fuchs, T. J. (2019). Breast Metastases to Axillary Lymph Nodes [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/tcia.2019.3xbn2jcc
- Gabriele Campanella, Matthew G. Hanna, Luke Geneslaw, Allen Miraflor, Vitor Werneck Krauss Silva, Klaus J. Busam, Edi Brogi, Victor E. Reuter, David S. Klimstra, Thomas J. Fuchs. Clinical-grade Computational Pathology using Weakly Supervised Deep Learning on Whole Slide Images, Nature Medicine, July 2019
- Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: 10.1007/s10278-013-9622-7