An Overview of The Cancer Genome Atlas (TCGA)


1 An Overview of The Cancer Genome Atlas (TCGA)
Maxwell Lee
National Cancer Institute, Center for Cancer Research
Laboratory of Cancer Biology and Genetics, High-dimension Data Analysis Group
January 7, 2016

2 Outline Of The Talk
A brief history of TCGA
Overview of TCGA data
TCGA data access policy and download
Some examples of data analyses
Discussion of relevant TCGA publications

3 History And Timeline Of Human Genome Science
Human genome project
  1990 initiation
  2000 draft sequence
  2003 complete sequence
Other genome projects
  2003 HapMap project
  2003 ENCODE project
  Genomes Project
TCGA
  2005 pilot project announced
  2009 transition to phase II
  2014 end

4 History And Timeline Of TCGA
Dec 13, 2005: TCGA pilot project announced
2008: TCGA published glioblastoma paper
2009: TCGA transition to phase II
2011: TCGA published ovarian cancer paper
2014: TCGA ends

5

6

7

8

9 Major TCGA Research Components
Biospecimen Core Resource (BCR): collects and processes tissue samples.
Genome Sequencing Centers (GSCs): use high-throughput genome sequencing to identify changes in DNA sequences in cancer.
Genome Characterization Centers (GCCs): analyze genomic and epigenomic changes involved in cancer.
Proteome Characterization Centers (PCCs): analyze the proteomic content of a subset of TCGA samples.
Data Coordinating Center (DCC): centrally manages the TCGA data.
Cancer Genomics Hub (CGHub): database that stores cancer genome sequences and alignments.
Genome Data Analysis Centers (GDACs): provide informatics tools to facilitate broader use of TCGA data.

10 https://wiki.nci.nih.gov/display/TCGA/Introduction+to+TCGA

11 TCGA Barcode

12 TCGA Sample Code
Sample type codes discussed: 01, 10, 11, 03, 06
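To make the barcode and sample-code slides concrete, here is a minimal Python sketch that splits a TCGA barcode into its fields and labels the five sample type codes listed above. The function name `parse_barcode`, the regular expression, and the `is_tumor` flag are illustrative assumptions, not part of the talk; the code labels follow the public TCGA code tables, in which codes 01 to 09 denote tumor samples and 10 to 19 denote normal samples.

```python
import re

# Labels for the sample type codes named on the slide (from the public TCGA code tables).
SAMPLE_TYPES = {
    "01": "Primary Solid Tumor",
    "03": "Primary Blood Derived Cancer - Peripheral Blood",
    "06": "Metastatic",
    "10": "Blood Derived Normal",
    "11": "Solid Tissue Normal",
}

# Hypothetical pattern for illustration: TCGA-TSS-Participant[-SampleVial[-PortionAnalyte[-Plate[-Center]]]]
BARCODE_RE = re.compile(
    r"^TCGA-(?P<tss>[A-Z0-9]{2})-(?P<participant>[A-Z0-9]{4})"
    r"(?:-(?P<sample>\d{2})(?P<vial>[A-Z])?"
    r"(?:-(?P<portion>\d{2})(?P<analyte>[A-Z])?"
    r"(?:-(?P<plate>[A-Z0-9]{4})(?:-(?P<center>\d{2}))?)?)?)?$"
)


def parse_barcode(barcode):
    """Split a TCGA barcode into named fields; absent trailing fields are None."""
    match = BARCODE_RE.match(barcode.strip().upper())
    if match is None:
        raise ValueError("not a recognizable TCGA barcode: %r" % barcode)
    fields = match.groupdict()
    code = fields["sample"]
    # Unknown codes map to None rather than raising, since only five codes are listed here.
    fields["sample_type"] = SAMPLE_TYPES.get(code) if code else None
    # Codes below 10 are tumor samples; None when the barcode has no sample field.
    fields["is_tumor"] = (int(code) < 10) if code else None
    return fields


if __name__ == "__main__":
    for key, value in parse_barcode("TCGA-02-0001-01C-01D-0182-01").items():
        print(key, value)
```

A participant-level barcode such as `TCGA-02-0001` also parses; its sample, portion, plate, and center fields simply come back as `None`.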

13 TCGA Data Access Policy
An access control policy is in place for TCGA data to ensure that personally identifiable information is kept from unauthorized users.
Open access: houses data that cannot be aggregated to generate a data set unique to an individual. This tier does not require user certification for data access.
Controlled access: houses individually unique information that could potentially be used to identify an individual. This tier requires user certification for data access.

14 TCGA Data Levels

15

16

17 TCGA Controlled Access Data
Access to controlled data is available to researchers who:
Agree to restrict their use of the information to biomedical research purposes only.
Agree with the statements within the TCGA Data Use Certification (DUC).
Have their institutions certifiably agree to the statements within the TCGA DUC.
Complete the Data Access Request (DAR) form and submit it to the Data Access Committee to become a TCGA Approved User. This form is available electronically through dbGaP.

18 TCGA Controlled Access Data

19 An approved user can request to add downloaders

20

21 Where to download the data?
TCGA Data Portal
GDAC at Broad Institute
cBioPortal
The Cancer Genomics Hub (CGHub)

22 https://confluence.broadinstitute.org/display/GDAC/Dashboard-Stddata

23 https://confluence.broadinstitute.org/display/GDAC/Dashboard-Analyses

24 Download TCGA Data Using Broad GDAC Firehose
wget   (URL of firehose_get_latest.zip is missing from the transcript)
unzip firehose_get_latest.zip
./firehose_get
./firehose_get stddata latest
./firehose_get analyses latest
./firehose_get stddata latest LUAD LUSC
#Downloaded: 250 files, 6.3G in 29m 54s (3.62 MB/s)
./firehose_get analyses latest BRCA OV
#Downloaded: 312 files, 788M in 1h 18m 40s (171 KB/s)

25

26 IGV views of structural changes of recurrent SVs in MACROD2
GISTIC2 analysis of 441 TCGA gastric cancer (STAD) tumor samples showed that FHIT, MACROD2, and PARK2 lie in the 6th, 7th, and 12th most significantly deleted regions, respectively.
Hu et al., Cancer Res, accepted.

27 An Algorithm For Methylation And Expression Index (MEI)
Platforms: Illumina Infinium HumanMethylation27 BeadChip; Illumina HumanRef-8 v2 Expression BeadChip
Differential methylation based on IHC (positive vs. negative for ER, PR, Her2, EGFR, or CK5): 2227 methylation markers in 1162 genes
Top 3% most variable gene expression: 541 genes
Overlap: 128 methylation markers in 65 genes
MEI: the weighted sum of the gene expression, where the weights are the negatives of the Spearman correlations.
Figueroa JD, Yang H et al. Breast Cancer Res Treat. 2015
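As a rough illustration of the MEI definition above, the following Python sketch computes a weighted sum of expression values in which each gene's weight is the negative of a Spearman correlation. It makes two assumptions that the slide does not state: the correlation is taken between a gene's methylation and its expression across samples, and each gene has a single methylation marker (the study reports 128 markers in 65 genes). The function names and the dict-of-lists input layout are hypothetical, and this is not the authors' implementation.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mei_scores(methylation, expression):
    """Compute per-gene weights and one MEI score per sample.

    methylation, expression: dict mapping gene -> list of per-sample values
    (assumed layout, one methylation marker per gene).
    """
    weights = {g: -spearman(methylation[g], expression[g]) for g in expression}
    n_samples = len(next(iter(expression.values())))
    scores = [
        sum(weights[g] * expression[g][s] for g in expression)
        for s in range(n_samples)
    ]
    return weights, scores


if __name__ == "__main__":
    meth = {"GENE_A": [0.1, 0.2, 0.3, 0.4], "GENE_B": [0.1, 0.2, 0.3, 0.4]}
    expr = {"GENE_A": [4.0, 3.0, 2.0, 1.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
    print(mei_scores(meth, expr))
```

Under these assumptions, a gene whose expression falls as methylation rises gets a positive weight, so samples with high expression of such genes receive a high index.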

28 Polish dataset: K-M survival using MEI for ER+ and ER- samples
[Figure: Kaplan-Meier curves, survival probability vs. year. ER+ cases: p = 0.009. ER- cases: p = 0.360.]
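For readers unfamiliar with the K-M curves shown on this slide, here is a minimal Python sketch of the Kaplan-Meier product-limit estimator. It is a generic textbook version, not the analysis code behind the figure; the function name `kaplan_meier` and the input layout (follow-up times plus 0/1 event flags) are assumptions. The p values on the slide would come from a separate group comparison, such as a log-rank test, which this sketch does not implement.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each subject (e.g. in years)
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival_probability), one entry per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    for t, s in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(t, s)
```

To draw one curve per group, the function would be called separately on each group, for example on samples with high and low MEI; how the groups were defined in the study is not stated in this transcript.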

29 Validation: K-M survival using MEI for ER+ samples
[Figure: Kaplan-Meier curves, survival probability vs. year, in four ER+ validation sets. TCGA ER+: p = 0.001. GSE6532 ER+: p = 0.001. NKI ER+: p = 0.004. METABRIC ER+: p value not preserved. Endpoints shown: OS and DMFS.]

30 TP53 Missense Mutations Associate With High TP53 Protein Levels

31 Correlation Between Gene Expression And DNA Methylation

32

33

34 Figure 4

35

36

37 CpG Island Methylator Phenotype Of Glioblastoma
Noushmehr et al. Cancer Cell 2010; 17(5): 510–522.

38 Figure 2

39 IDH1/2 And TET2 Mutations Are Mutually Exclusive
AML samples from the Eastern Cooperative Oncology Group (ECOG) E1900 clinical trial.
Figueroa et al. Cancer Cell 18, 553–567, 2010

40 IDH Mutations Increase DNA Methylation Pathway

41 DNA Methylation Pathway
Shih et al. Nat Rev Cancer Sep;12(9):

42

43

44

45

46

47 Cluster-of-cluster Assignments (COCA) Of The Pan-cancer-12 Tumors

48

