Tumor Genome Sequencing Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST512.


1 Tumor Genome Sequencing Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST512

2 Cancer
Cancer will affect 1 in 2 men and 1 in 3 women in the United States, and the number of new cases is set to nearly double by 2050.
Cancer is a genetic disease caused by mutations in the DNA.
Clinically, tumors can look the same, but most differ genetically.

3 Different Sequencing Approaches
Capture-seq ($400-600)
–Could focus on well-known mutations
Exome-seq ($700-2K)
–All the exons in genes; promoters and lncRNA genes?
RNA-seq ($500-2K)
–Expression and mutations together; miss anything?
Whole genome sequencing ($3-4K)
–Majority of mutations are non-coding, function unknown
–Better at detecting structural changes (translocations, fusions)
–Cost-vs-benefit balance

4 Two Major Cancer Genome Projects
TCGA: The Cancer Genome Atlas (US)
–More than 30 cancer types and more than 10K tumor samples
–Primary tumors, fewer death events
–Genome, transcriptome, DNA methylome, proteomics
–Rigorous tumor sample QC, consistent profiling platform
ICGC: International Cancer Genome Consortium (11 countries)
–20 cancer types × 500 tumor samples each

5 Tumor Gene Expression
Microarrays or RNA-seq
Data analysis?
Differential expression between cancer and normal
Cluster the tumor samples into sub-types
–Consensus clustering: resample genes or tumors repeatedly to get a robust clustering
Predict patient outcome (survival or recurrence)
Break
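The consensus clustering idea on this slide can be sketched as follows. This is a minimal illustration, not the lecture's method: it assumes tumors are resampled (not genes), uses a plain k-means with farthest-point initialisation as the base clusterer, and all function names, parameters, and the toy data are hypothetical.

```python
import random

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means with farthest-point initialisation (illustrative)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    centers = [points[0]]
    while len(centers) < k:
        # next centre: the point farthest from all centres chosen so far
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def consensus_matrix(profiles, k, n_resample=50, frac=0.8, seed=0):
    """Fraction of resamples in which two tumors land in the same cluster."""
    rng = random.Random(seed)
    n = len(profiles)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resample):
        idx = rng.sample(range(n), max(k, int(frac * n)))  # subsample tumors
        labels = kmeans([profiles[i] for i in idx], k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Toy expression profiles (made up): two well-separated groups of three tumors.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
consensus = consensus_matrix(toy, k=2)
```

Pairs of tumors with consensus values near 1 cluster together robustly; values near 0.5 indicate unstable assignments.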

6 Survival Analysis
Do patients receiving the treatment live longer?
Are smokers more likely to have cancer recurrence?
Censored data: the value of a measurement or observation is only partially known
–Some patients left the study
–Study concluded

7 Survival Without Censoring

8 Survival With Censoring

9 Kaplan-Meier Curve
More individuals in each group and better separation between the groups give a smaller (more significant) p-value
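A minimal sketch of the Kaplan-Meier estimator that underlies such curves, assuming each patient is described by a follow-up time and an event flag (1 = event observed, 0 = censored). The function name and the toy data are illustrative and not taken from the lecture.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # everyone leaving the risk set at time t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk   # product-limit update
            curve.append((t, surv))
        at_risk -= exits[t]             # events and censored patients both leave
    return curve

# Toy data (made up): the patients at t=3 (one of two) and t=8 are censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```

Censored patients never lower the curve themselves; they only shrink the risk set used at later event times.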

10 Log Rank Test

11 Log Rank Test
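The slides show the log-rank test only as figures. As a hedged sketch of the standard two-group version: at each event time, compare the observed number of events in group A with the number expected if both groups shared one survival curve, then sum over event times. The function name and toy data are illustrative.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p_value) with 1 degree of freedom."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                  # at risk overall
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)     # at risk in A
        d = sum(1 for u, e, _ in data if u == t and e)            # events overall
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # hypergeometric variance of d_a at this event time
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 d.f.
    return chi2, p_value

# Toy data (made up): group A fails early, group B fails late, no censoring.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
```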

12 More Variables
50-gene signature?
Logistic regression:
–Estimate odds ratio: ratio of proportions
–Linear combination of all the genes to separate outcome (0, 1)
Cox regression:
–Estimate hazard ratio: ratio of incidence rates
–Models the effect of covariates on the hazard rate but leaves the baseline hazard rate unspecified
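To make the Cox bullet concrete, here is a minimal sketch that fits a single-covariate Cox model by Newton-Raphson on the partial likelihood. The baseline hazard never appears, which is the sense in which it is left unspecified. Assumptions: one covariate only, Breslow-style handling of tied times, and data that are not perfectly separated (otherwise the estimate diverges). The function name and toy data are hypothetical.

```python
import math

def cox_fit_1d(times, events, x, n_iter=25):
    """Return (beta, hazard_ratio) for a one-covariate Cox model."""
    beta = 0.0
    for _ in range(n_iter):
        grad = 0.0   # first derivative of the partial log-likelihood
        info = 0.0   # negative second derivative
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue                     # censored: contributes only via risk sets
            risk = [xj for tj, xj in zip(times, x) if tj >= t_i]
            w = [math.exp(beta * xj) for xj in risk]
            s0 = sum(w)
            s1 = sum(wj * xj for wj, xj in zip(w, risk))
            s2 = sum(wj * xj * xj for wj, xj in zip(w, risk))
            grad += x_i - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        if info == 0:
            break
        step = grad / info
        beta += step
        if abs(step) < 1e-9:
            break
    return beta, math.exp(beta)

# Toy data (made up): x = 1 marks a hypothetical high-risk signature group.
beta, hazard_ratio = cox_fit_1d([1, 2, 3, 4, 5, 6], [1] * 6, [1, 1, 0, 1, 0, 0])
```

A hazard ratio above 1 means the x = 1 group experiences events at a higher rate than the x = 0 group at any given time.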

13 Use Cox Regression to Separate Two Groups by Gene Signature

14 Caution About Gene Signature's Predictive Power
Break

15 Mutations in the Tumor Genome
Help us identify important genes for tumorigenesis and cancer progression
Drivers: a.k.a. gatekeepers, mutations that cause and accelerate cancers
Passengers: accidental by-products of thwarted DNA-repair mechanisms
Recurrent mutations on genes or pathways are likely drivers

16 High Throughput Driver Detection
Differential gene expression
Copy number aberration (CNA) or variation (CNV) using CGH, tiling or SNP arrays

17 Comparative genomic hybridization (CGH)

18 GISTIC
G-score: combines the frequency of occurrence and the amplitude of the aberration
Statistical significance evaluated by permutation
FDR adjustment for multiple hypothesis testing
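The three GISTIC bullets can be illustrated with a simplified sketch. This is not the GISTIC implementation: it assumes a samples-by-markers matrix of log2 copy ratios, scores amplifications only, builds the null by shuffling marker values within each sample, and applies Benjamini-Hochberg for the FDR step. All names, the threshold, and the toy data are assumptions.

```python
import random

def g_scores(cn, threshold=0.1):
    """Per-marker score: amplification frequency times mean amplitude."""
    n_samples = len(cn)
    scores = []
    for m in range(len(cn[0])):
        amps = [s[m] for s in cn if s[m] > threshold]
        # frequency * mean amplitude simplifies to sum(amps) / n_samples
        scores.append(sum(amps) / n_samples)
    return scores

def permutation_pvalues(cn, n_perm=200, threshold=0.1, seed=0):
    """Compare observed scores with scores from position-shuffled samples."""
    rng = random.Random(seed)
    observed = g_scores(cn, threshold)
    null = []
    for _ in range(n_perm):
        shuffled = [rng.sample(s, len(s)) for s in cn]  # permute markers per sample
        null.extend(g_scores(shuffled, threshold))
    pvals = [(sum(1 for v in null if v >= g) + 1) / (len(null) + 1) for g in observed]
    return observed, pvals

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    qvals = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        qvals[i] = running_min
    return qvals

# Toy data (made up): 6 samples, 10 markers, marker 2 amplified in every sample.
toy_cn = [[1.0 if m == 2 else 0.0 for m in range(10)] for _ in range(6)]
scores, pvals = permutation_pvalues(toy_cn)
qvals = benjamini_hochberg(pvals)
```

Shuffling within each sample keeps every sample's overall aberration burden fixed, so a high score reflects aberrations recurring at the same position across samples.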

19 GATK
https://www.broadinstitute.org/gatk/guide/best-practices
FASTQ -> BAM
BAM -> VCF
Annotate

20 MAF and VCF Formats
VCF (Variant Call Format, GWAS format) and MAF (Mutation Annotation Format, TCGA format)
Both can annotate somatic mutations and germline variants
Tab-delimited text files
VCF columns:
–CHROM, POS
–ID (SNP id, gene symbol, or ENTREZ gene id)
–REF (reference sequence), ALT (altered sequence)
–QUAL (quality score)
–FILTER (PASS, or failed filters such as "q10;s50": quality <= 10; <= 50% of samples have data here)
–INFO (allele counts, total counts, number of samples with data, somatic or not, validated, etc.)
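As a small illustration of the column layout above, the following sketch parses one tab-delimited VCF data line into a dictionary. It handles only the eight fixed columns and ignores per-sample genotype columns; the function name and the example record are made up for illustration.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

def parse_vcf_line(line):
    """Parse the eight fixed columns of one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")        # several ALT alleles are allowed
    record["QUAL"] = None if record["QUAL"] == "." else float(record["QUAL"])
    info = {}
    if record["INFO"] != ".":
        for item in record["INFO"].split(";"):
            key, _, value = item.partition("=")
            info[key] = value if value else True    # bare keys are flags
    record["INFO"] = info
    return record

# Made-up record, not from a real call set.
example = "1\t12345\trs0000\tG\tA\t29\tPASS\tNS=3;DP=14;SOMATIC"
record = parse_vcf_line(example)
print(record["CHROM"], record["POS"], record["REF"], record["ALT"], record["FILTER"])
```

Header lines, which start with "#", would need to be skipped before calling this function on a real file.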

21 Example of a Cancer Genome Mutation Profile
Circos plot: how messed up a cancer genome is

22 Total alterations affecting protein-coding genes in selected tumors
Vogelstein et al, Science 2013

23 Somatic Mutation Frequency in 3K Tumor-Normal Pairs
Typical tumors: median 45 mutations / tumor
More mutations in tumors of tissues facing the outside environment (e.g., skin, lung)
Break

24 TS vs Oncogenes, GoF vs LoF
Tumor suppressors (TS) vs oncogenes
Gain of Function (GoF) or Loss of Function (LoF) mutations
–Phenotypes
How to tell?
–From mutation patterns
–From expression patterns
–Functional studies
Some genes can be both TS and oncogenes

25 Mutation Rate Heterogeneity
Mutation rate correlated with replication timing, gene expression, and gene length
Tumor evolution and selection
Lawrence et al, Nature 2013

26 Recurrent Mutations
[Figure legend: known; novel with clear cancer association; novel]
Lawrence et al, Nature 2014

27 How Much Should We Sequence?
Need ~200 patients for a 20% mutation rate, ~550 patients for 10%, and ~1200 patients for 5%
Most driver mutations have been found; the pressing need in basic cancer research is to study their function
Biggest surprise: mutations on chromatin regulators
–More than 50% of the new, strong cancer driver genes
–Oncogenes: DNMT3A, IDH1
–Tumor suppressors: MLL, ATRX, ARID1A, SNF5
–Both: EZH2
Sequencing metastasized or drug-resistant tumors might yield insights into tumor progression

28 Resources
MSKCC cBioPortal
–GUI for experimental biologists
Broad Firehose
–API for accessing processed TCGA data
UCSC CGHub
–API for accessing raw and processed cancer data
Sanger COSMIC
–Catalog of Somatic Mutations in Cancer
Many also provide software tools

29 Summary
Different sequencing approaches
Gene expression, tumor sub-typing
Survival analysis: KM vs Cox regression
Different mutation types and distributions
Gain or loss of function mutations
Tumor suppressors vs oncogenes

30 Acknowledgement
Aleksandar Milosavljevic
Kristin Sainani
Linda Staub & Alexandros Gekenidis
Yin Bun Cheung, Paul Yip
John Pack
Cheng Li
Xujun Wang
Peng Jiang

