Summary
This dataset includes brain MRI scans of adult glioma patients, comprising 4 structural modalities (i.e., T1, T1c, T2, T2-FLAIR) and associated manually generated ground truth labels for each tumor sub-region (enhancement, necrosis, edema), as well as the patients' MGMT promoter methylation status. The scans are drawn from existing TCIA collections and from cases contributed by individual institutions willing to share them under a CC BY license.
The BraTS dataset is a retrospective collection of structural multi-parametric MRI (mpMRI) brain tumor scans from 2,040 patients (1,480 here), acquired at multiple institutions under standard clinical conditions but with different equipment and imaging protocols. Image quality is therefore highly heterogeneous, reflecting diverse clinical practice across institutions. The 4 structural mpMRI volumes included in the BraTS challenge are a) native T1-weighted (T1), b) post-contrast T1-weighted (T1Gd, gadolinium), c) T2-weighted (T2), and d) T2 Fluid Attenuated Inversion Recovery (T2-FLAIR). In addition, the O[6]-methylguanine-DNA methyltransferase (MGMT) promoter methylation status is provided as a binary label. MGMT is a DNA repair enzyme; methylation of its promoter in newly diagnosed glioblastoma has been identified as a favorable prognostic factor and a predictor of chemotherapy response.
The dataset is curated for computational image analysis, specifically tumor segmentation and prediction of the MGMT promoter methylation status.
A note about the available TCIA data that were converted for use in this Challenge (Training, Validation, Test):
Dr. Bakas's group provides here the brain-extracted BraTS 2021 Segmentation task TRAINING and VALIDATION data in NIfTI format, together with the corresponding segmentations. These data do not pose DUA-level risk of potential facial reidentification.
The group has also provided some of the brain-extracted BraTS challenge TEST data in NIfTI format, with corresponding segmentations (here and here, from the 2018 challenge; request via TCIA's Helpdesk).
The group also provides here the brain-extracted BraTS 2021 Classification task TRAINING and VALIDATION data. These files were converted DICOM → NIfTI → DICOM (.dcm) and registered to the original orientation; the resulting data files do not strictly adhere to the DICOM standard. BraTS 2021 Classification challenge TEST files are unavailable at this time.
You may want the original corresponding DICOM-format files drawn from TCIA Collections; please note that these original data are not brain-extracted and may pose enough reidentification risk that TCIA must keep them behind an explicit usage agreement.
Please also note that the exact correspondence between individual DICOM series and NIfTI volumes has, unfortunately, been lost to time, but the available lists below represent our best effort at reconstructing the link to the BraTS source files.
Acknowledgements
We would like to acknowledge the individuals and institutions that have provided data for this collection:
- Data used in this publication were obtained as part of the RSNA-ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge project through Synapse ID (syn25829067).
Data Access
Data Type | Download all or Query/Filter | License |
---|---|---|
Challenge data both tasks (142 GB, 1480 patients, NIfTI, DICOM) | (Download and apply the IBM-Aspera-Connect plugin to your browser to retrieve this faspex package) | |
ID Crosswalk between BraTS ID and TCIA ID (xlsx, 79 kB) | | |
Click the Versions tab for more info about data releases.
Collections Used in this Third Party Analysis
Below is a list of the Collections used in these analyses.
Source Data Type | Download | License |
---|---|---|
Original corresponding DICOM used in BraTS 2021 Segmentation Training set from CPTAC-GBM, TCGA-GBM, TCGA-LGG, ACRIN-FMISO-Brain (ACRIN 6684), IvyGAP, UPENN-GBM | | |
Original corresponding DICOM used in BraTS 2021 MGMT Classifier Training set from CPTAC-GBM, TCGA-GBM, IvyGAP, UPENN-GBM | | |
Original corresponding DICOM used in BraTS 2021 Segmentation Validation set from CPTAC-GBM, TCGA-GBM, TCGA-LGG, IvyGAP, UPENN-GBM | | |
Original corresponding DICOM used in BraTS 2021 MGMT Classifier Validation set from CPTAC-GBM, TCGA-GBM, IvyGAP, UPENN-GBM | | |
Original corresponding imaging from UCSF-PDGM v1 | (Download and apply the IBM-Aspera-Connect plugin to your browser to retrieve this faspex package) | CC BY 4.0 |
Additional Resources for this Dataset
The NCI Cancer Research Data Commons (CRDC) provides access to additional data and a cloud-based data science infrastructure that connects data sets with analytics tools to allow users to share, integrate, analyze, and visualize cancer research data.
- Imaging Data Commons (IDC) (Imaging Data)
- Genomic Data Commons (GDC) (Genomic, Digitized Histopathology & Clinical Data)
- Proteomic Data Commons (PDC) (Proteomic & Clinical Data)
The following external resources have been made available by the data submitters. These are not hosted or supported by TCIA, but may be useful to researchers utilizing this collection.
IvyGAP provides access to additional resources for this data:
Detailed Description
Image Statistics | Radiology Image Statistics |
---|---|
Modalities | MR, Segmentations |
Number of Patients | 1,480 |
Number of Studies | |
Number of Series | 7,131 |
Number of Images | 407,245 |
Images Size (GB) | 140 |
NOTE: The "challenge test set dataset" is sequestered on synapse.org (Project SynID: syn25829067). Please see their site for more detail.
NOTE: Segmentation task (NIfTI): Number of Images 7,131; Images Size 12 GB.
NOTE: Classification task (NIfTI + DICOM): Number of Images 400,114; Images Size 128 GB.
The glioma sub-regions considered for evaluation are the "enhancing tumor" (ET), the "tumor core" (TC), and the "whole tumor" (WT). The ET is described by areas that show hyper-intensity in T1Gd when compared to T1, and also when compared to “healthy” white matter in T1Gd. The TC describes the bulk of the tumor, which is what is typically resected; it comprises the ET as well as the necrotic (NCR) parts of the tumor. NCR typically appears hypo-intense in T1Gd when compared to T1. The WT describes the complete extent of the disease: it comprises the TC and the peritumoral edematous/invaded tissue (ED), which is typically depicted by hyper-intense signal in FLAIR. The provided segmentation labels have values of 1 for NCR, 2 for ED, 4 for ET, and 0 for everything else.
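The label conventions above can be turned into the three evaluation regions with a few boolean operations. The following is a minimal sketch, not the challenge's official evaluation code: it assumes the segmentation has already been loaded into a NumPy integer array, and the function and constant names are illustrative only.

```python
import numpy as np

# Label values as stated in the description above.
NCR, ED, ET = 1, 2, 4


def evaluation_regions(seg):
    """Derive boolean masks for ET, TC and WT from a BraTS label map.

    `seg` is any integer array holding the values 0, 1, 2 and 4.
    """
    seg = np.asarray(seg)
    et = seg == ET                # enhancing tumor
    tc = et | (seg == NCR)        # tumor core = ET + necrosis
    wt = tc | (seg == ED)         # whole tumor = TC + edema
    return {"ET": et, "TC": tc, "WT": wt}
```

Because the regions are nested (ET within TC within WT), the voxel count of each mask should never decrease from ET to TC to WT; checking this is a cheap sanity test after loading a label map.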
The data used in BraTS Challenges often overlap with other TCIA Collections, cases, and series. The following resources can help you filter out this overlap so that your analyses do not include duplicated images:
- Manifest of case identifiers between BraTS and TCIA (NOTE: includes new series files with no TCIA equivalent): BraTS2021_MappingToTCIA.xlsx
- Spreadsheet lists of cases and series used in prior-year BraTS Challenges are available from these collections:
  - Multimodal Brain Tumor Segmentation Challenge 2018 (BraTS)
  - Multimodal Brain Tumor Segmentation Challenge 2019
  - Segmentation Labels and Radiomic Features for the Pre-operative Scans of the TCGA-GBM collection (BraTS-TCGA-GBM)
  - Segmentation Labels and Radiomic Features for the Pre-operative Scans of the TCGA-LGG collection (BraTS-TCGA-LGG)
- Spreadsheet list of new (NIfTI) series files with no TCIA DICOM equivalent: NotPreviouslyInTCIA.csv
You might find these splits useful to navigate accidental duplication while making superset cohorts. These were processed as input to the BraTS Collection, and will require a Usage Agreement on file.
- Segmentation Task (Training sets) BraTS2021_TCIAderived_Seg-Task-Training.tcia
- Classification Task (Training sets) BraTS2021_TCIAderived_Class-Task-Training.tcia
- Segmentation Task (Validation sets) BraTS2021_TCIAderived_Seg-Task-Validation.tcia
- Classification Task (Validation sets) BraTS2021_TCIAderived_Class-Task-Validation.tcia
- The UCSF-PDGM v1 data were not split by BraTS task, but the excerpted series from 299 cases are available as a faspex package: BraTS2021_UCSF-PDGMv1
Notes about Image Registration:
- Transformation matrices from DICOM to NIfTI are not available.
- Segmentation task image volumes have been set to 240 × 240 × 155 voxels (x × y × z).
- All Radiogenomics Classifier task files are restored to original DICOM resolution & orientation (thus volume may vary).
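As a quick check of the volume size stated in the notes above, the dimensions can be read directly from a NIfTI-1 header using only the Python standard library. This is a sketch under stated assumptions: the files are taken to be NIfTI-1 (a 348-byte header with the `dim` array at byte offset 40), optionally gzip-compressed, and the function names are illustrative rather than part of any BraTS tooling.

```python
import gzip
import struct

# Expected Segmentation task volume size, from the registration notes above.
EXPECTED_SHAPE = (240, 240, 155)


def read_nifti1_shape(path):
    """Return the (x, y, z) dimensions stored in a NIfTI-1 header."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError("file too short to hold a NIfTI-1 header")
    # sizeof_hdr (int32 at offset 0) must be 348; it also reveals byte order.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    # dim[0] is the number of dimensions; dim[1..3] are x, y, z.
    dim = struct.unpack(endian + "8h", header[40:56])
    if dim[0] < 3:
        raise ValueError("volume has fewer than 3 dimensions")
    return tuple(dim[1:4])


def has_expected_shape(path, expected=EXPECTED_SHAPE):
    """True if the volume matches the stated Segmentation task size."""
    return read_nifti1_shape(path) == expected
```

Calling `has_expected_shape` on each Segmentation task volume gives an inexpensive integrity check after download. It is not expected to hold for the Classification task files, which are restored to their original resolution.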
Citations & Data Usage Policy
Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution should include references to the following citations:
Data Citation
Baid, U., Ghodasara, S., Mohan, S., Bilello, M., Calabrese, E., Colak, E., Farahani, K., Kalpathy-Cramer, J., Kitamura, F. C., Pati, S., Prevedello, L., Rudie, J., Sako, C., Shinohara, R., Bergquist, T., Chai, R., Eddy, J., Elliott, J., Reade, W., Schaffter, T., Yu, T., Zheng, J., Davatzikos, C., Mongan, J., Hess, C., Cha, S., Villanueva-Meyer, J., Freymann, J. B., Kirby, J. S., Wiestler, B., Crivellaro, P., Colen, R. R., Kotrotsou, A., Marcus, D., Milchenko, M., Nazeri, A., Fathallah-Shaykh, H., Wiest, R., Jakab, A., Weber, M-A., Mahajan, A., Menze, B., Flanders, A. E., Bakas, S. (2023). RSNA-ASNR-MICCAI-BraTS-2021 Dataset. The Cancer Imaging Archive. DOI: 10.7937/jc8x-9874
Acknowledgement
"The results <published or shown> here are in whole or part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/."
Publication Citation
1. Baid, U., Ghodasara, S., Mohan, S., Bilello, M., Calabrese, E., Colak, E., Farahani, K., Kalpathy-Cramer, J., Kitamura, F. C., Pati, S., Prevedello, L. M., Rudie, J. D., Sako, C., Shinohara, R. T., Bergquist, T., Chai, R., Eddy, J., Elliott, J., Reade, W., Schaffter, T., Yu, T., Zheng, J., Moawad, A. W., Coelho, L. O., McDonnell, O., Miller, E., Moron, F. E., Oswood, M. C., Shih, R. Y., Siakallis, L., Bronstein, Y., Mason, J. R., Miller, A. F., Choudhary, G., Agarwal, A., Besada, C. H., Derakhshan, J. J., Diogo, M. C., Do-Dai, D. D., Farage, L., Go, J. L., Hadi, M., Hill, V. B., Iv, M., Joyner, D., Lincoln, C., Lotan, E., Miyakoshi, A., Sanchez-Montano, M., Nath, J., Nguyen, X. V., Nicolas-Jilwan, M., Ortiz Jimenez, J., Ozturk, K., Petrovic, B. D., Shah, C., Shah, L. M., Sharma, M., Simsek, O., Singh, A. K., Soman, S., Statsevych, V., Weinberg, B. D., Young, R. J., Ikuta, I., Agarwal, A. K., Cambron, S. C., Silbergleit, R., Dusoi, A., Postma, A. A., Letourneau-Guillon, L., Guzman Perez-Carrillo, G. J., Saha, A., Soni, N., Zaharchuk, G., Zohrabian, V. M., Chen, Y., Cekic, M. M., Rahman, A., Small, J. E., Sethi, V., Davatzikos, C., Mongan, J., Hess, C., Cha, S., Villanueva-Meyer, J., Freymann, J. B., Kirby, J. S., Wiestler, B., Crivellaro, P., Colen, R. R., Kotrotsou, A., Marcus, D., Milchenko, M., Nazeri, A., Fathallah-Shaykh, H., Wiest, R., Jakab, A., Weber, M-A., Mahajan, A., Menze, B., Flanders, A. E., Bakas, S. (2021). The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification (Version 2). arXiv. DOI: 10.48550/arXiv.2107.02314
You are free to use and/or refer to the BraTS datasets in your own research, provided that you always cite the flagship manuscript above resulting from the challenge as well as the following two manuscripts:
Publication Citation
2. Menze, B. H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, J., Burren, Y., Porz, N., Slotboom, J., Wiest, R., Lanczi, L., Gerstner, E., Weber, M.-A., Arbel, T., Avants, B. B., Ayache, N., Buendia, P., Collins, D. L., Cordier, N., … Van Leemput, K. (2015). The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). In IEEE Transactions on Medical Imaging (Vol. 34, Issue 10, pp. 1993–2024). Institute of Electrical and Electronics Engineers (IEEE). DOI: 10.1109/tmi.2014.2377694
Publication Citation
3. Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J. S., Freymann, J. B., Farahani, K., & Davatzikos, C. (2017). Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. In Scientific Data (Vol. 4, Issue 1). https://doi.org/10.1038/sdata.2017.117
TCIA Citation
Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. In Journal of Digital Imaging (Vol. 26, Issue 6, pp. 1045–1057). Springer Science and Business Media LLC. https://doi.org/10.1007/s10278-013-9622-7
Additional Publication Resources:
The Collection authors suggest the below will give context to this dataset:
You are free to use and/or refer to the BraTS datasets in your own research. In addition, please be specific and also cite the following datasets that were part of this Challenge:
- Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J., Freymann, J., Farahani, K., & Davatzikos, C. (2017). Segmentation Labels for the Pre-operative Scans of the TCGA-GBM collection [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2017.KLXWJJ1Q
- Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J., Freymann, J., Farahani, K., & Davatzikos, C. (2017). Segmentation Labels for the Pre-operative Scans of the TCGA-LGG collection [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2017.GJQ7R0EF
- Scarpace, L., Mikkelsen, T., Cha, S., Rao, S., Tekchandani, S., Gutman, D., Saltz, J. H., Erickson, B. J., Pedano, N., Flanders, A. E., Barnholtz-Sloan, J., Ostrom, Q., Barboriak, D., & Pierce, L. J. (2016). The Cancer Genome Atlas Glioblastoma Multiforme Collection (TCGA-GBM) (Version 4) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9
- Pedano, N., Flanders, A. E., Scarpace, L., Mikkelsen, T., Eschbacher, J. M., Hermes, B., Sisneros, V., Barnholtz-Sloan, J., & Ostrom, Q. (2016). The Cancer Genome Atlas Low Grade Glioma Collection (TCGA-LGG) (Version 3) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2016.L4LTD3TK
- Calabrese, E., Villanueva-Meyer, J., Rudie, J., Rauschecker, A., Baid, U., Bakas, S., Cha, S., Mongan, J., & Hess, C. (2022). The University of California San Francisco Preoperative Diffuse Glioma MRI (UCSF-PDGM) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/tcia.bdgf-8v37
- Bakas, S., Sako, C., Akbari, H., Bilello, M., Sotiras, A., Shukla, G., Rudie, J. D., Flores Santamaria, N., Fathi Kazerooni, A., Pati, S., Rathore, S., Mamourian, E., Ha, S. M., Parker, W., Doshi, J., Baid, U., Bergman, M., Binder, Z. A., Verma, R., … Davatzikos, C. (2021). Multi-parametric magnetic resonance imaging (mpMRI) scans for de novo Glioblastoma (GBM) patients from the University of Pennsylvania Health System (UPENN-GBM) (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.709X-DN49
Other Publications Using This Data
TCIA maintains a list of publications which leverage our data. If you have a manuscript you'd like to add please contact TCIA's Helpdesk.
Version 1 (Current): Updated 2023/08/25
Data Type | Download all or Query/Filter | License |
---|---|---|
Challenge data (both tasks, 142 GB, *.nii.gz or *.dcm) | (Download and apply the IBM-Aspera-Connect plugin to your browser to retrieve this faspex package) | |
ID Crosswalk between BraTS ID and TCIA ID (xlsx, 79 kB) | | |