Datasets & Benchmarks
Curated cancer datasets and benchmarking resources.
TCGA Datasets
TCGA-BRCA
- Description: Breast Invasive Carcinoma
- Samples: 1,098 primary tumors
- Data Types: WGS, WXS, RNA-seq, miRNA, Methylation
- Download: GDC Portal
TCGA-LUAD
- Description: Lung Adenocarcinoma
- Samples: 1,185 primary tumors
- Data Types: WGS, WXS, RNA-seq, miRNA, Methylation
- Download: GDC Portal
TCGA-COAD
- Description: Colon Adenocarcinoma
- Samples: 521 primary tumors
- Data Types: WGS, WXS, RNA-seq, miRNA, Methylation
- Download: GDC Portal
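The TCGA projects above can also be queried programmatically instead of through the portal. The following is a minimal sketch that only builds the request parameters for a file search. It assumes the public GDC API files endpoint and the filter field names `cases.project.project_id` and `files.experimental_strategy`, none of which appear in this list. The helper name `build_gdc_file_query` is hypothetical.

```python
import json

# Assumption: public GDC API files endpoint (not named in the list above).
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_gdc_file_query(project_id, experimental_strategy, size=10):
    """Build query parameters selecting files for one project and one assay type."""
    # GDC filters are nested JSON: an "and" over per-field "in" clauses.
    filters = {
        "op": "and",
        "content": [
            {
                "op": "in",
                "content": {
                    "field": "cases.project.project_id",
                    "value": [project_id],
                },
            },
            {
                "op": "in",
                "content": {
                    "field": "files.experimental_strategy",
                    "value": [experimental_strategy],
                },
            },
        ],
    }
    return {
        "filters": json.dumps(filters),  # the filter object is sent as a JSON string
        "fields": "file_id,file_name,data_type",
        "format": "JSON",
        "size": str(size),
    }


params = build_gdc_file_query("TCGA-BRCA", "RNA-Seq")
print(params["filters"])
```

These parameters could then be sent as a GET request to `GDC_FILES_ENDPOINT` with any HTTP client. Controlled-access files (for example aligned reads) additionally require an authentication token, which this sketch does not handle.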
Benchmark Datasets
MSK-IMPACT
- Description: Memorial Sloan Kettering clinical cohort profiled with the MSK-IMPACT targeted sequencing panel
- Samples: 10,000+ patients
- Data Types: Targeted sequencing
- Download: cBioPortal
GENIE
- Description: AACR Project GENIE, a multi-institution registry of clinical-grade cancer genomic data
- Samples: 100,000+ patients
- Data Types: Clinical + genomic
- Download: GENIE Portal
Single Cell Datasets
Human Cell Atlas
- Description: Single-cell reference atlas
- Samples: 1M+ cells
- Data Types: scRNA-seq, scATAC-seq
- Download: HCA Portal
Imaging Datasets
TCIA Collections
- Description: The Cancer Imaging Archive, de-identified cancer imaging collections
- Samples: 100,000+ images
- Data Types: CT, MRI, PET, Pathology
- Download: TCIA Portal
Data Standards
- DICOM - Medical image storage and exchange
- SAM/BAM - Sequence alignments (text and compressed binary)
- VCF - Variant calls
- MAF - Mutation annotations (tab-delimited)
- GFF/GTF - Genome feature annotations
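As an illustration of what working with one of these formats involves, here is a minimal sketch that parses a single VCF data line into a dictionary using the eight fixed VCF columns. The function name `parse_vcf_record` and the example record are made up for illustration. Real pipelines would normally use a dedicated VCF library, which also handles headers, genotype columns, and escaping.

```python
# The eight fixed, tab-separated columns of a VCF data line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_record(line):
    """Parse one VCF data line (not a '#' header line) into a dict."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])        # 1-based position
    record["ALT"] = record["ALT"].split(",")  # multiple alternate alleles are comma-separated
    info = {}
    if record["INFO"] != ".":                 # "." marks an empty INFO field
        for entry in record["INFO"].split(";"):
            key, sep, value = entry.partition("=")
            # Flag entries have no "=value" part and are stored as True.
            info[key] = value if sep else True
    record["INFO"] = info
    return record


# Hypothetical record, for illustration only.
example = "chr1\t12345\t.\tA\tG,T\t50\tPASS\tDP=100;SOMATIC"
rec = parse_vcf_record(example)
print(rec["CHROM"], rec["POS"], rec["ALT"], rec["INFO"])
```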