...
Breast cancer is among the most common cancers and a common cause of death among women. Over 39 million breast cancer screening exams are performed every year, making them among the most common radiological tests. This creates a strong need for accurate image interpretation. Machine learning has shown promise in the interpretation of medical images. However, limited data for training and validation remains an issue.
Here, we share a curated dataset of digital breast tomosynthesis images that includes normal, actionable, biopsy-proven benign, and biopsy-proven cancer cases. The dataset contains four components: (1) DICOM images, (2) a spreadsheet of image metadata, (3) a spreadsheet indicating which group each case belongs to, and (4) annotation boxes indicating lesion locations. A detailed description of this dataset can be found in the following paper:
Info |
---|
title | Publication Citation |
---|
|
M. Buda, A. Saha, R. Walsh, S. Ghate, N. Li, A. Święcicki, J. Y. Lo, M. A. Mazurowski, Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model. arXiv preprint arXiv:2011.07995. |
Please reference this paper if you use this dataset. Version 1 of the dataset contains only a subset of the data described in the paper above. More data will be shared in subsequent versions.
Acknowledgements
We would like to acknowledge the individuals and institutions that have provided data for this collection:
Duke University Hospital/Duke University, Durham, NC, USA
We would like to acknowledge all those who contributed to the curation of this dataset.
This work was supported by a grant from the NIH: 1 R01 EB021360 (PI: Mazurowski).
Localtab Group |
---|
Localtab |
---|
active | true |
---|
title | Data Access |
---|
| Data Access
Click the Download button to save a ".tcia" manifest file to your computer, which you must open with the NBIA Data Retriever. Click the Search button to open our Data Portal, where you can browse the data collection and/or download a subset of its contents.
Data Type | Download all or Query/Filter |
---|
Images (DICOM, XX.X GB) DBT | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/DBT-Challenge-Train.TCIA?api=v2 |
---|
| |
(Search button will not work until the data are ready to be released) | Image Metadata (csv) | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/Duke%20Breast%20DBT%20file-paths-train.csv?api=v2 |
---|
| |
| Boxes indicating lesion locations (csv) | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/Duke%20Breast%20DBT%20boxes-train.csv?api=v2 |
---|
| |
| Spreadsheet indicating which group each case belongs to (see the paper for details on the groups) (csv) | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/Duke%20Breast%20DBT%20labels-train.csv?api=v2 |
---|
| |
| Click the Versions tab for more info about data releases. |
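The labels and boxes spreadsheets listed above can be combined once downloaded. The following is a minimal sketch, not an official loader. It assumes that both files share the key columns `PatientID`, `StudyUID`, and `View`, and that the labels file marks the group with a `1` in one of the columns `Normal`, `Actionable`, `Benign`, or `Cancer`. These column names are assumptions and should be checked against the header row of the real files. The function names are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed column names; verify them against the header row of the real files.
KEY_COLUMNS = ("PatientID", "StudyUID", "View")
GROUP_COLUMNS = ("Normal", "Actionable", "Benign", "Cancer")


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def case_key(row):
    """Build the tuple used to match rows across the two spreadsheets."""
    return tuple(row[name] for name in KEY_COLUMNS)


def group_of(label_row):
    """Return the name of the first group column flagged with 1, else None."""
    for name in GROUP_COLUMNS:
        if label_row.get(name, "").strip() == "1":
            return name
    return None


def summarize(labels_text, boxes_text):
    """Count labeled rows per group and annotation boxes per group."""
    groups = {case_key(r): group_of(r) for r in read_rows(labels_text)}
    rows_per_group = Counter(groups.values())
    # Boxes whose key is missing from the labels file are counted under None.
    boxes_per_group = Counter(
        groups.get(case_key(r)) for r in read_rows(boxes_text)
    )
    return rows_per_group, boxes_per_group
```

To use it, read each downloaded file into a string (for example with `open(path, newline="").read()`) and pass the two strings to `summarize`. Only the standard library is used so that the sketch runs without extra dependencies.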
Localtab |
---|
title | Detailed Description |
---|
| Detailed Description | |
---|
Modalities | DBT |
Number of Participants | 693 |
Number of Studies | 700 |
Number of Series | 2596 |
Number of Images | 2596 |
Image Size (GB, compressed) | Added when data released |
|
Localtab |
---|
title | Citations & Data Usage Policy |
---|
| Citations & Data Usage Policy
Tcia license 4 noncommercial |
---|
Info |
---|
| Buda, M., Saha, A., Li, N., Mazurowski, M.A. (2020). Data from the Breast Cancer Screening DBT. The Cancer Imaging Archive. DOI: 10.7937/e4wt-cd02 (draft DOI, will not resolve until published). |
Info |
---|
title | Publication Citation |
---|
| M. Buda, A. Saha, R. Walsh, S. Ghate, N. Li, A. Święcicki, J. Y. Lo, M. A. Mazurowski, Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model. arXiv preprint arXiv:2011.07995. https://arxiv.org/abs/2011.07995 |
Info |
---|
| Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: 10.1007/s10278-013-9622-7 |
Other Publications Using This Data
TCIA maintains a list of publications which leverage TCIA data. If you have a manuscript you'd like to add, please contact the TCIA Helpdesk. |
Localtab |
---|
| Version 1 (Current): Updated 2020/mm/dd Data Type | Download all or Query/Filter |
---|
Images (DICOM, XX.X GB) DBT | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/DBT-Challenge-Train.TCIA?api=v2 |
---|
| |
(Requires NBIA Data Retriever.) (Search button will not work until the data are ready to be released) | Image Metadata (csv) | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/Duke%20Breast%20DBT%20file-paths-train.csv?api=v2 |
---|
| |
| Boxes indicating lesion locations (csv) | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/Duke%20Breast%20DBT%20boxes-train.csv?api=v2 |
---|
| |
| Spreadsheet indicating which group each case belongs to (see the paper for details on the groups) (csv) | Tcia button generator |
---|
url | https://wiki.cancerimagingarchive.net/download/attachments/64685580/Duke%20Breast%20DBT%20labels-train.csv?api=v2 |
---|
| |