CSCI2950-C Genomes, Networks, and Cancer

Slides:



Advertisements
Similar presentations
Sorting by reversals Bogdan Pasaniuc Dept. of Computer Science & Engineering.
Advertisements

Greedy Algorithms CS 466 Saurabh Sinha. A greedy approach to the motif finding problem Given t sequences of length n each, to find a profile matrix of.
Gene an d genome duplication Nadia El-Mabrouk Université de Montréal Canada.
Integrating Genomes D. R. Zerbino, B. Paten, D. Haussler Science 336, 179 (2012) Teacher: Professor Chao, Kun-Mao Speaker: Ho, Bin-Shenq June 4, 2012.
Rearrangements and Duplications in Tumor Genomes.
Unit 1: DNA and the Genome Key area 8: Genomic sequencing.
Duplication, rearrangement, and mutation of DNA contribute to genome evolution Chapter 21, Section 5.
Bioinformatics Chromosome rearrangements Chromosome and genome comparison versus gene comparison Permutations and breakpoint graphs Transforming Men into.
Greedy Algorithms And Genome Rearrangements
Genome Rearrangements CIS 667 April 13, Genome Rearrangements We have seen how differences in genes at the sequence level can be used to infer evolutionary.
. Class 1: Introduction. The Tree of Life Source: Alberts et al.
Computational Genomics Lecture 1, Tuesday April 1, 2003.
Comparison of Networks Across Species CS374 Presentation October 26, 2006 Chuan Sheng Foo.
27803::Systems Biology1CBS, Department of Systems Biology Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break.
Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break 14:45 – 15:15Regulatory pathways lecture 15:15 – 15:45Exercise.
DNA Sequencing. CS273a Lecture 3, Spring 07, Batzoglou DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT.
Introduction to Bioinformatics Algorithms Greedy Algorithms And Genome Rearrangements.
DNA Sequencing. Next few topics DNA Sequencing  Sequencing strategies Hierarchical Online (Walking) Whole Genome Shotgun  Sequencing Assembly Gene Recognition.
Bacterial Physiology (Micr430)
Genome Rearrangements CSCI : Computational Genomics Debra Goldberg
27803::Systems Biology1CBS, Department of Systems Biology Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break.
Genomic Rearrangements CS 374 – Algorithms in Biology Fall 2006 Nandhini N S.
The Central Dogma of Molecular Biology (Things are not really this simple) Genetic information is stored in our DNA (~ 3 billion bp) The DNA of a.
DNA Sequencing. DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT.
CS273a Lecture 1, Autumn 10, Batzoglou DNA Sequencing.
CS273a Lecture 2, Autumn 10, Batzoglou DNA Sequencing (cont.)
DNA Sequencing. CS273a Lecture 3, Autumn 08, Batzoglou DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT.
Computational Genomics Lecture 1, Tuesday April 1, 2003.
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
1 Genome Rearrangements João Meidanis São Paulo, Brazil December, 2004.
DNA Sequencing. Next few topics DNA Sequencing  Sequencing strategies Hierarchical Online (Walking) Whole Genome Shotgun  Sequencing Assembly Gene Recognition.
Genome Analysis Determine locus & sequence of all the organism’s genes More than 100 genomes have been analysed including humans in the Human Genome Project.
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
Lesson Overview 13.1 RNA.
RNA and Protein Synthesis
Genomes and Their Evolution. GenomicsThe study of whole sets of genes and their interactions. Bioinformatics The use of computer modeling and computational.
Greedy Algorithms And Genome Rearrangements An Introduction to Bioinformatics Algorithms (Jones and Pevzner)
Greedy Algorithms And Genome Rearrangements
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
Ch. 21 Genomes and their Evolution. New approaches have accelerated the pace of genome sequencing The human genome project began in 1990, using a three-stage.
Reconstructing Genomic Architectures of Tumor Genomes Pavel Pevzner and Ben Raphael Department of Computer Science & Engineering University of California,
Chapter 21 Eukaryotic Genome Sequences
341- INTRODUCTION TO BIOINFORMATICS Overview of the Course Material 1.
Biological Networks. Can a biologist fix a radio? Lazebnik, Cancer Cell, 2002.
Genome Rearrangements. Turnip vs Cabbage: Look and Taste Different Although cabbages and turnips share a recent common ancestor, they look and taste different.
1 Genome Rearrangements (Lecture for CS498-CXZ Algorithms in Bioinformatics) Dec. 6, 2005 ChengXiang Zhai Department of Computer Science University of.
Chapter 13- RNA and Protein Synthesis
CSCI2950-C Lecture 12 Networks
DNA Sequencing Project
CSCI2950-C Lecture 9 Cancer Genomics
Genomes and Their Evolution
Genomes and Their Evolution
CSE 5290: Algorithms for Bioinformatics Fall 2009
Mutations TSW identify and describe the various types of mutations and their effects.
Greedy (Approximation) Algorithms and Genome Rearrangements
Lecture 3: Genome Rearrangements and Duplications
Today… Review a few items from last class
Genomes and Their Evolution
CSCI2950-C Lecture 4 Genome Rearrangements
Greedy Algorithms And Genome Rearrangements
What is RNA? Do Now: What is RNA made of?
CSCI2950-C Lecture 13 Network Motifs; Network Integration
Schedule for the Afternoon
Principle of Epistasis Analysis
12.4 Mutations Kinds of Mutations Significance of Mutations.
CSCI 1810 Computational Molecular Biology 2018
Greedy Algorithms And Genome Rearrangements
Unit Genomic sequencing
Academic Biology Notes
Copyright Pearson Prentice Hall
Presentation transcript:

CSCI2950-C Genomes, Networks, and Cancer http://cs.brown.edu/courses/cs2950-c/

Course Organization Seminar style Grading: CS requirements: 10% participation 30% paper reviews (~10) 20% presentations (2-3) 40% project. Midterm proposal and final presentations Extra credit: 5% for report on comp. bio. seminars (pre-arranged): Maximum 2 CS requirements: PhD: Area B (Algorithms) ScM: Theory or Practice* *depending on project Significant programming*

Biology 101

Biology 101 Central Dogma

What can we measure? Sequencing (expensive) Hybridization (noisy) Mass spectrometry (noisy) Hybridization (very noisy!)

Human Genome Sequenced 2000-2003, ??? Comparative Genomics Genomes of other mammals 2) Disease genomics HapMap 1000 Genomes 3) Cancer genomics The Cancer Genome Atlas Tumor Sequencing Project Sept. 3 “In the Genome Race, the Sequel Is Personal” Now what?

Course Topics Genomes Algorithms Visualization Machine Learning Systems Data Mining Networks Cancer

Course Topics Genomes Networks Cancer

DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG ACTGATTTAGATACCTGAC TGATTTTAAAAAAATATT…

DNA Sequencing Goal: Find the complete sequence of A, C, G, T’s in DNA Challenge: There is no machine that takes long DNA as an input, and gives the complete sequence as output Can only sequence ~500 letters at a time

Shotgun Sequencing Get one or two reads from each segment genomic segment cut many times at random (shotgun) Get one or two reads from each segment No technology yet for reading a long strand of DNA in it’s entirety. Instead fragment. New tech. prodcuing shorter reads. ~500 bp ~500 bp

Fragment Assembly Cover region with ~7-fold redundancy reads Put all pieces back together. Great problem for CS. One formulation as SCS. Cover region with ~7-fold redundancy Overlap reads and extend to reconstruct the original genomic region

Comparative Genomics Compare genome sequences of related species. Sequence Alignment Phylogeny: defining ancestral relationships between species, genomes, genes. Sequences Evolve via substitutions Conservation suggests function mouse human EFTPPVQAAYQKVVAG DFNPNVQAAFQKVVAG

History of Chromosome X Rat Consortium, Nature, 2004

Comparative Genomic Architectures: Mouse vs Human Genome Humans and mice have similar genomes, but their genes are ordered differently ~245 rearrangements Reversals Fusions Fissions Translocation

Genome rearrangements Mouse (X chrom.) Unknown ancestor ~ 75 million years ago Human (X chrom.) What are the similarity blocks and how to find them? What is the architecture of the ancestral genome? What is the evolutionary scenario for transforming one genome into the other?

Reversals 1 3 2 4 10 5 6 8 9 7 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 Blocks represent conserved genes.

Reversals 1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 Blocks represent conserved genes. In the course of evolution or in a clinical context, blocks 1,…,10 could be misread as 1, 2, 3, -8, -7, -6, -5, -4, 9, 10.

Reversals and Breakpoints 1 2 3 9 10 8 4 7 5 6 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 The reversion introduced two breakpoints (disruptions in order).

Sorting Permutations by Reversals (Sankoff et al.1990)  = 12…n signed permutation Reversal (i,j) [inversion] 1…i-1 -j ... -i j+1…n Problem: Given , find a sequence of reversals 1, …, t with such that:  . 1 . 2 … t = (1, 2, …, n) and t is minimal. Solution: Analysis of breakpoint graph Polynomial time algorithms O(n4) : Hannenhalli and Pevzner, 1995. O(n2) : Kaplan, Shamir, Tarjan, 1997. O(n) [distance t] : Bader, Moret, and Yan, 2001. O(n3) : Bergeron, 2001.

Course Topics Genomes Networks Cancer Network Alignment Network Integration

Biological Interaction Networks Many types: Protein-DNA (regulatory) Protein-metabolite (metabolic) Protein-protein (signaling) RNA-RNA (regulatory) Genetic interactions (gene knockouts)

Regulatory Networks

Cis-regulatory Network

Metabolic Networks Nodes = reactants Edges = reactions labeled by enzyme (protein) that catalyzes reaction

Protein-Protein Interaction (PPI) Network

Protein-Protein Interaction Network? Proteins are nodes Interactions are edges Edges may have weights Yeast PPI network H. Jeong et al. Lethality and centrality in protein networks. Nature 411, 41 (2001)

Network Alignment Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, pp. 427-433, 2006

The Network Alignment Problem Given: k different interaction networks belonging to different species, Find: Conserved sub-networks within these networks Conserved defined by protein sequence similarity (node similarity) and interaction similarity (network topology similarity)

Protein Signaling Networks Art Salomon Biology Department Use machine learning methods (Bayesian networks, etc. to derive network structure.

Computational Problems Classifying Network Topology Finding paths, cliques, dense subnetworks, etc. Comparing Networks Across Species Using networks to explain data Dependencies revealed by network topology Modeling dynamics of networks

Course Topics Genomes Networks Cancer Cancer Genomes Cancer & Networks

Cell Division and Mutation Single nucleotide change A major contributor to the development of cancer are somatic mutations that occur during cell division Will focus on structural and later copy number, which is not to say that single are not as important. What is the effect of structural changes Copy number Structural

Cancer Genomes Leukemia Breast

Rearrangements in Tumors Change gene structure, create novel fusion genes Same translocation as previous slide. 30 years from rearrangement to Gleevec Gleevec (Novartis 2001) targets ABL-BCR fusion

Challenges What is detailed organization of cancer genomes? What genes affected? What sequence of rearrangements produced this genome? Can we create custom treatments for tumors based on mutational spectrum? (e.g. Gleevec) Maybe time for a “Cancer Genome Project”??

Tumor Genome → Phenotype Gene Networks and Pathways Integration of Multiple Data Sources ESP and copy number Mutation Expression (mRNA and miRNA) Binding (ChIP-chip) Pathways Epigenetics activation repression Duplicated genes Deleted genes

Tumor Genome → Phenotype Gene Networks and Pathways Integration of Multiple Data Sources ESP and copy number Mutation Expression (mRNA and miRNA) Binding (ChIP-chip) Pathways Epigenetics activation repression Duplicated genes Deleted genes

Course Topics Genomes Algorithms Visualization Machine Learning Systems Data Mining Networks Cancer