TCGA The Cancer Genome Atlas Project January 24, 2008.


1 TCGA The Cancer Genome Atlas Project January 24, 2008

2 TCGA Program
Goal: find genomic alterations that cause cancer (mutations, CNA, methylation, …)
Pilot project
–$100M (NCI/NHGRI)
–3 years
–3 diseases
  brain (glioblastoma multiforme)
  lung (squamous)
  ovarian (serous cystadenocarcinoma)

3 TCGA Organization
Biospecimen Core Resource (BCR)
Genome Sequencing Centers (GSCs) (3)
Cancer Genome Characterization Centers (CGCCs) (7)
Data Coordinating Center (DCC)
Project Team (NCI/NHGRI)
Steering Committee (NCI/NHGRI & PIs)
External Scientific Committee
Working Groups

4 TCGA PIs
Center  Institution   PI
BCR     IGC/TGEN      Robert Penny
GSC     Baylor        Richard Gibbs
GSC     Broad         Eric Lander
GSC     WashU         Rick Wilson
CGCC    Broad/DFCI    Matthew Meyerson
CGCC    Harvard/B&W   Raju Kucherlapati
CGCC    JHU           Steve Baylin
CGCC    LBL           Joe Gray
CGCC    MSKCC         Marc Ladanyi
CGCC    Stanford      Rick Myers
CGCC    UNC           Chuck Perou
DCC     SRA           Ari Kahn

5 TCGA URLs
project site: http://cancergenome.nih.gov
gforge: http://gforge.nci.nih.gov (search for TCGA)
data: http://tcga-data.nci.nih.gov
portal: http://tcga-portal.nci.nih.gov [coming]

6 TCGA Data Types
Institution   Analysis                        Platform
Broad/DFCI    Transcription and Copy Number   Affymetrix U133 Plus 2.0 & SNP Array 6.0
Harvard/B&W   Transcription and Copy Number   Agilent 244K Array
LBL           Transcription                   Affymetrix Exon 1.0 ST Array
MSKCC         Copy Number                     Agilent 244K Array
JHU           Methylation                     Illumina GoldenGate
UNC           Transcription                   Agilent 44K Array
Stanford      Copy Number                     Illumina Infinium 550K BeadChip Array
Broad         Somatic Mutations               DNA sequencing
Baylor        Somatic Mutations               DNA sequencing
WashU         Somatic Mutations               DNA sequencing

7 TCGA Data Levels
raw
–low-level data for a single sample, not normalized (e.g., trace file, .cel file)
processed
–single-sample, normalized & interpreted (e.g., mutation call, amplification call for a locus, .snp, .chp)
segmented (n/a for mutation & expression)
–single-sample, aggregation of loci into regions (e.g., amplification call for a region of a sample)
summary finding (aka “region of interest”)
–cross-sample findings (e.g., minimal common region of amplification across a sample set)
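The four levels can be modeled as a small enumeration. This is only an illustrative sketch: the names `DataLevel`, `NO_SEGMENTED`, `applicable_levels`, and `is_single_sample` are hypothetical and do not come from any TCGA software; the only facts encoded are the ones on the slide (segmented is n/a for mutation and expression data, and only summary findings are cross-sample).

```python
from enum import Enum


class DataLevel(Enum):
    """The four data levels named on the slide (identifiers are hypothetical)."""
    RAW = "raw"
    PROCESSED = "processed"
    SEGMENTED = "segmented"
    SUMMARY_FINDING = "summary finding"


# Data types for which the slide marks the segmented level as n/a.
NO_SEGMENTED = {"mutation", "expression"}


def applicable_levels(data_type: str) -> list:
    """Return the levels that apply to a data type, skipping 'segmented' where n/a."""
    return [
        level
        for level in DataLevel
        if not (level is DataLevel.SEGMENTED and data_type in NO_SEGMENTED)
    ]


def is_single_sample(level: DataLevel) -> bool:
    """Raw, processed, and segmented data describe one sample; summary findings span samples."""
    return level is not DataLevel.SUMMARY_FINDING


if __name__ == "__main__":
    for data_type in ("copy number", "mutation", "expression"):
        print(data_type, [level.value for level in applicable_levels(data_type)])
```

Keeping the "n/a" rule in one set makes it easy to adjust if other data types turn out not to have a segmented level.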

8 TCGA Flow
[Flow diagram. Recoverable labels:]
Tissue Source (MD Anderson, Henry Ford, …)
BCR
1. check pathology, quality/quantity
2. extract analytes
3. prepare data file
GSC; WGA; CGCC
DNA, mRNA; DNA
NCBI Trace Archive
DCC
sample data
Bulk Download
caTissue Core; caArray; caIntegrator
“tracking database”

9 TCGA Data Formats
BCR
–XML (tags are CDEs)
–images
GSC
–Called mutations (Genboree LFF format)
–Linking table sample-trace-target
CGCC
–MAGE-TAB
  IDF: Investigation Description Format
  SDRF: Sample and Data Relationship Format
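Since an SDRF is a tab-delimited table with a header row, it can be read with a standard CSV reader. The sketch below is a minimal illustration, not a TCGA or caArray tool: the helper names `read_sdrf` and `column_values` are hypothetical, and the sample rows and file names are invented for the example. Column headings such as "Source Name" and "Array Data File" follow the MAGE-TAB convention; because SDRF headings may repeat (for example several "Protocol REF" columns), each row is kept as a list of (heading, value) pairs instead of a dictionary.

```python
import csv
import io


def read_sdrf(handle):
    """Read a tab-delimited SDRF into rows of (heading, value) pairs.

    Pairs are used instead of dicts because SDRF headings can repeat.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    return [list(zip(header, row)) for row in reader if row]


def column_values(rows, name):
    """Collect every value found under a given heading, in file order."""
    return [value for row in rows for heading, value in row if heading == name]


# Invented example content; real SDRF files carry many more columns.
SAMPLE = (
    "Source Name\tExtract Name\tHybridization Name\tArray Data File\n"
    "sample-A\textract-A\thyb-A\thyb-A.cel\n"
    "sample-B\textract-B\thyb-B\thyb-B.cel\n"
)

rows = read_sdrf(io.StringIO(SAMPLE))

if __name__ == "__main__":
    print(column_values(rows, "Array Data File"))
```

To read a file on disk, pass an open text handle (for example `open(path, newline="")`) in place of the `io.StringIO` object.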

10 TCGA Where Does/Will the Data Go?
ftp site (now with a simple web wrapper: “portal #1”)
“tracking database”
repositories with caBIG APIs
–caArray
–caTissue CORE
–caIntegrator
–NCIA
NCBI trace archive
a richer “portal #2”
–more convenient download capability
–filtering datasets by clinical information
–summary level data
–genome browser view
–gene info page
–visualization on pathways
–etc.

