Presentation is loading. Please wait.

Presentation is loading. Please wait.

CSCI2950-C Genomes, Networks, and Cancer

Similar presentations


Presentation on theme: "CSCI2950-C Genomes, Networks, and Cancer"— Presentation transcript:

1 CSCI2950-C Genomes, Networks, and Cancer

2 Course Organization Seminar style Grading: CS requirements:
10% participation 30% paper reviews (~10) 20% presentations (2-3) 40% project. Midterm proposal and final presentations Extra credit: 5% for report on comp. bio. seminars (pre-arranged): Maximum 2 CS requirements: PhD: Area B (Algorithms) ScM: Theory or Practice* *depending on project Significant programming*

3 Biology 101

4 Biology 101 Central Dogma

5 What can we measure? Sequencing (expensive) Hybridization (noisy)
Mass spectrometry (noisy) Hybridization (very noisy!)

6 Human Genome Sequenced 2000-2003, ???
Comparative Genomics Genomes of other mammals 2) Disease genomics HapMap 1000 Genomes 3) Cancer genomics The Cancer Genome Atlas Tumor Sequencing Project Sept. 3 “In the Genome Race, the Sequel Is Personal” Now what?

7

8 Course Topics Genomes Algorithms Visualization Machine Learning
Systems Data Mining Networks Cancer

9 Course Topics Genomes Networks Cancer

10 DNA sequencing How we obtain the sequence of nucleotides of a species
…ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG ACTGATTTAGATACCTGAC TGATTTTAAAAAAATATT…

11 DNA Sequencing Goal: Find the complete sequence of A, C, G, T’s in DNA
Challenge: There is no machine that takes long DNA as an input, and gives the complete sequence as output Can only sequence ~500 letters at a time

12 Shotgun Sequencing Get one or two reads from each segment
genomic segment cut many times at random (shotgun) Get one or two reads from each segment No technology yet for reading a long strand of DNA in it’s entirety. Instead fragment. New tech. prodcuing shorter reads. ~500 bp ~500 bp

13 Fragment Assembly Cover region with ~7-fold redundancy
reads Put all pieces back together. Great problem for CS. One formulation as SCS. Cover region with ~7-fold redundancy Overlap reads and extend to reconstruct the original genomic region

14 Comparative Genomics Compare genome sequences of related species.
Sequence Alignment Phylogeny: defining ancestral relationships between species, genomes, genes. Sequences Evolve via substitutions Conservation suggests function mouse human EFTPPVQAAYQKVVAG DFNPNVQAAFQKVVAG

15 History of Chromosome X
Rat Consortium, Nature, 2004

16 Comparative Genomic Architectures: Mouse vs Human Genome
Humans and mice have similar genomes, but their genes are ordered differently ~245 rearrangements Reversals Fusions Fissions Translocation

17 Genome rearrangements
Mouse (X chrom.) Unknown ancestor ~ 75 million years ago Human (X chrom.) What are the similarity blocks and how to find them? What is the architecture of the ancestral genome? What is the evolutionary scenario for transforming one genome into the other?

18 Reversals 1 3 2 4 10 5 6 8 9 7 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 Blocks represent conserved genes.

19 Reversals 1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 Blocks represent conserved genes. In the course of evolution or in a clinical context, blocks 1,…,10 could be misread as 1, 2, 3, -8, -7, -6, -5, -4, 9, 10.

20 Reversals and Breakpoints
1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 The reversion introduced two breakpoints (disruptions in order).

21 Sorting Permutations by Reversals
(Sankoff et al.1990)  = 12…n signed permutation Reversal (i,j) [inversion] 1…i-1 -j ... -i j+1…n Problem: Given , find a sequence of reversals 1, …, t with such that:  . 1 . 2 … t = (1, 2, …, n) and t is minimal. Solution: Analysis of breakpoint graph Polynomial time algorithms O(n4) : Hannenhalli and Pevzner, O(n2) : Kaplan, Shamir, Tarjan, 1997. O(n) [distance t] : Bader, Moret, and Yan, O(n3) : Bergeron, 2001.

22 Course Topics Genomes Networks Cancer Network Alignment
Network Integration

23 Biological Interaction Networks
Many types: Protein-DNA (regulatory) Protein-metabolite (metabolic) Protein-protein (signaling) RNA-RNA (regulatory) Genetic interactions (gene knockouts)

24 Regulatory Networks

25 Cis-regulatory Network

26 Metabolic Networks Nodes = reactants Edges = reactions
labeled by enzyme (protein) that catalyzes reaction

27 Protein-Protein Interaction (PPI) Network

28 Protein-Protein Interaction Network?
Proteins are nodes Interactions are edges Edges may have weights Yeast PPI network H. Jeong et al. Lethality and centrality in protein networks. Nature 411, 41 (2001)

29 Network Alignment Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, pp , 2006

30 The Network Alignment Problem
Given: k different interaction networks belonging to different species, Find: Conserved sub-networks within these networks Conserved defined by protein sequence similarity (node similarity) and interaction similarity (network topology similarity)

31 Protein Signaling Networks
Art Salomon Biology Department Use machine learning methods (Bayesian networks, etc. to derive network structure.

32 Computational Problems
Classifying Network Topology Finding paths, cliques, dense subnetworks, etc. Comparing Networks Across Species Using networks to explain data Dependencies revealed by network topology Modeling dynamics of networks

33 Course Topics Genomes Networks Cancer Cancer Genomes Cancer & Networks

34 Cell Division and Mutation
Single nucleotide change A major contributor to the development of cancer are somatic mutations that occur during cell division Will focus on structural and later copy number, which is not to say that single are not as important. What is the effect of structural changes Copy number Structural

35 Cancer Genomes Leukemia Breast

36 Rearrangements in Tumors
Change gene structure, create novel fusion genes Same translocation as previous slide. 30 years from rearrangement to Gleevec Gleevec (Novartis 2001) targets ABL-BCR fusion

37 Challenges What is detailed organization of cancer genomes?
What genes affected? What sequence of rearrangements produced this genome? Can we create custom treatments for tumors based on mutational spectrum? (e.g. Gleevec) Maybe time for a “Cancer Genome Project”??

38 Tumor Genome → Phenotype Gene Networks and Pathways
Integration of Multiple Data Sources ESP and copy number Mutation Expression (mRNA and miRNA) Binding (ChIP-chip) Pathways Epigenetics activation repression Duplicated genes Deleted genes

39 Tumor Genome → Phenotype Gene Networks and Pathways
Integration of Multiple Data Sources ESP and copy number Mutation Expression (mRNA and miRNA) Binding (ChIP-chip) Pathways Epigenetics activation repression Duplicated genes Deleted genes

40 Course Topics Genomes Algorithms Visualization Machine Learning
Systems Data Mining Networks Cancer


Download ppt "CSCI2950-C Genomes, Networks, and Cancer"

Similar presentations


Ads by Google