Comparative Biology


1 Comparative Biology. Figure labels: observable; unobservable evolutionary path; Most Recent Common Ancestor (?); example sequence ATTGCGTATATAT….CAG; time direction. Parameters: time, rates, selection. Key Questions: Which phylogeny? Which ancestral states? Which process? Key Generalisations: Homologous objects; Co-modelling; Genealogical Structures?

2 Structure of Biology: Physical Systems and Evolution. Data: Sequences, Structures, Expression Levels, …. Models: M1, .., Mk; framework for model formulation. Knowledge & Representation: Scientific Texts, Systems Biology Markup Language, Process Algebras, …. Structure of Biological Systems: Atoms, Molecules, Networks, Motors; Central Dogma, Genetic Code, …. Dynamics: the system as a physical entity. Evolution: the system has evolved; it is part of individuals in a population and part of species in the tree of life.

3 The Data: Sequence Data; Metabonomics/Metabolomics and Small Molecule Detection; Expression Data; Proteomics and Protein Interactions; Structures from Crystallography, NMR and Cryo-EM; Single Molecule Measurements; Microscopy.

4 Example of Reduction/Levels. Enzyme catalysis: such reductions are based on “biological concepts”. A molecular dynamics sample path involving one catalysis event: set of E + S initial states; ES states?; set of E + P final states; 10^9 time steps, 10^4 atoms. Discrete models of one catalysis event: E + S → ES → E + P, a reduction to 3-5 steps. Other clear reductions: individual molecules to concentration of molecules; set of atoms to nucleotide; lipid molecules to membrane.

5 Elements of Physical Dynamic Modeling. Time: Continuous Time; Discrete Time (0, 1, 2, .., k); No Time (Equilibrium). State & Space: Continuous Space; Discrete Space; No Space or Space Homogeneity. Time/Space dependency: Deterministic or Stochastic, in Discrete Time (0, 1, .., k-i, .., k-1, k; figure labels p0, p1, p2, p3) or Continuous Time. Complicated & contentious.

6 Physical Dynamic Modeling: Key Models. Molecular Dynamics: Quantum Mechanics; Classical Potential. Continuous Time Markov Chains / Gillespie Algorithm. Ordinary Differential Equations (ODE). Partial Differential Equations (PDE; Turing Model). Stochastic Ordinary Differential Equations (SODE). Stochastic Partial Differential Equations (SPDE). Models on Networks: Boolean Networks; Kinetic Models.
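
As an illustration of the Gillespie algorithm listed above, here is a minimal sketch applied to the enzyme catalysis scheme from slide 4. The reversible binding step, the function name `gillespie_enzyme`, and all rate constants are assumptions chosen for illustration; none of them come from the slides.

```python
import random

def gillespie_enzyme(e, s, k_on, k_off, k_cat, t_max, seed=0):
    """Simulate E + S <-> ES -> E + P with Gillespie's direct method.

    e, s are initial molecule counts; k_on, k_off, k_cat are hypothetical
    rate constants. Returns a list of (time, E, S, ES, P) tuples.
    """
    rng = random.Random(seed)
    t, es, p = 0.0, 0, 0
    trajectory = [(t, e, s, es, p)]
    while True:
        # Propensity of each reaction in the current state.
        a_bind = k_on * e * s
        a_unbind = k_off * es
        a_cat = k_cat * es
        total = a_bind + a_unbind + a_cat
        if total == 0.0:
            break  # absorbing state: no reaction can fire
        # Waiting time to the next event is exponential with rate `total`.
        dt = rng.expovariate(total)
        if t + dt > t_max:
            break
        t += dt
        # Pick which reaction fires, proportionally to its propensity.
        u = rng.random() * total
        if u < a_bind:
            e, s, es = e - 1, s - 1, es + 1
        elif u < a_bind + a_unbind:
            e, s, es = e + 1, s + 1, es - 1
        else:
            e, es, p = e + 1, es - 1, p + 1
        trajectory.append((t, e, s, es, p))
    return trajectory

if __name__ == "__main__":
    path = gillespie_enzyme(e=5, s=50, k_on=0.01, k_off=0.1, k_cat=0.5, t_max=1e6)
    print(len(path), path[-1])
```

Each step draws one waiting time and one reaction, so the state is a continuous time Markov chain on molecule counts; enzyme (E + ES) and substrate material (S + ES + P) are conserved throughout.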

7 Elusive Biological Concepts: Emergence. Other EBCs: function, robustness, modularity, purpose, top-down, downward causation. Strong emergence (never observed): the dynamic laws for k components are not deducible from their properties and their relationships. Weak emergence: something “new” emerges. Reduction: lower level (high dimensional, detailed description) versus higher level (low dimensional, “surprising” stable, robust properties). Ex. 1 Network Dynamics: large set of enzymes and atoms; oscillations, sensitive amplification. Ex. 2 Neural Networks: large set of cells; ability to calculate, consciousness. Questions: Automatic detection of emergence? How frequent is it? Does selection pull out emergent systems?

8 Levels & Objects

9 How to Compare? Examples: Protein Structures; Networks; Craniums/Shape. Homologous or Non-Homologous? Homologous components, e.g. the alignment A C G T over A - T T. Informal: Matching, Similarity, Distance; distance from shortest paths. The ideal: the probability of 1 observation multiplied by the sum over possible evolutionary trajectories to the second observation. A set; a pair (AGT, ACCT); P( ).
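
A minimal sketch of the informal "distance from shortest paths" idea, assuming the distance is the smallest number of single-letter substitutions, insertions and deletions turning one sequence into the other (plain edit distance with unit costs). The function name and the unit costs are illustrative assumptions, not taken from the slides.

```python
def edit_distance(x, y):
    """Minimum number of substitutions, insertions and deletions from x to y."""
    # prev[j] holds the distance between the current prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, a in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty prefix of y
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from x
                curr[j - 1] + 1,         # insert b into x
                prev[j - 1] + (a != b),  # match or substitute
            ))
        prev = curr
    return prev[-1]

if __name__ == "__main__":
    print(edit_distance("AGT", "ACCT"))
```

The probabilistic ideal on the slide replaces this single shortest path by a sum over all evolutionary trajectories, each weighted by its probability.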

10 “Natural” Evolutionary Modeling. Components: Birth and Death Process; components are born at one rate and die at another. Discrete states: Continuous Time Finite States Markov Chains (figure labels p0, p1, p2, p3); initially all rates the same. Continuous states: Continuous Time Continuous States Markov Process, specifically Diffusion; initially the simplest diffusion, Brownian Motion, then Ornstein-Uhlenbeck.
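
To make the contrast between the two diffusions concrete, the sketch below returns the mean and variance of the trait value after time t under Brownian Motion and under an Ornstein-Uhlenbeck process written as dX = alpha (theta - X) dt + sigma dW. The parameter names `sigma`, `alpha` and `theta` are assumptions for illustration; the slide does not fix a notation.

```python
import math

def brownian_moments(x0, sigma, t):
    """Mean and variance of X(t) for Brownian Motion started at x0."""
    return x0, sigma ** 2 * t

def ou_moments(x0, theta, alpha, sigma, t):
    """Mean and variance of X(t) for an Ornstein-Uhlenbeck process.

    theta is the optimum the process is pulled towards and alpha > 0 is
    the strength of that pull (both hypothetical parameter names).
    """
    decay = math.exp(-alpha * t)
    mean = theta + (x0 - theta) * decay
    variance = sigma ** 2 / (2.0 * alpha) * (1.0 - decay ** 2)
    return mean, variance

if __name__ == "__main__":
    for t in (0.1, 1.0, 10.0):
        print(t, brownian_moments(2.0, 1.0, t), ou_moments(2.0, 0.0, 1.0, 1.0, t))
```

Under Brownian Motion the variance grows linearly in time without bound, whereas under Ornstein-Uhlenbeck it levels off at sigma^2 / (2 alpha) and the mean relaxes to theta.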

11 Comparative Biology. Nucleotides/Amino Acids; Continuous Quantities; Sequences; Gene Structure; Structure (RNA, Protein); Networks (Metabolic Pathways, Protein Interaction, Regulatory Pathways, Signal Transduction); Macromolecular Assemblies; Motors; Shape; Patterns; Tissue/Organs/Skeleton/ ….; Dynamics (MD movements of proteins, Locomotion); Culture; Language (Vocabulary, Grammar, Phonetics, Semantics). Observed or predicted? Choice of Representation.

12 Comparative Biology: Evolutionary Models
Object | Type | Reference
Nucleotides/Amino Acids/codons | CTFS (continuous time finite state) | Jukes-Cantor 69 + 500 other
Continuous Quantities | CTCS (continuous time continuous state) | Felsenstein 68 + 50 other
Sequences | CT countable S | Thorne, Kishino, Felsenstein, 91 + 40
Gene Structure | Matching | DeGroot, 07
Genome Structure | CTCS | MM
Structure: RNA | SCFG-model like | Holmes, I. 06 + few others
Structure: Protein | - | -
Networks | CT countable S | Snijder, T
Networks: Metabolic Pathways, Protein Interaction, Regulatory Pathways, Signal Transduction | - | -
Macromolecular Assemblies | - | -
Motors | - | -
Shape | - | -
Patterns | - | -
Tissue/Organs/Skeleton/ …. | - | -
Dynamics: MD movements of proteins, Locomotion | - | -
Culture | - | -
Language: Vocabulary | “Infinite Allele Model” (CTCS) | Swadesh, 52, Sankoff, 72, …
Language: Grammar, Phonetics, Semantics | - | -
Phenotype | - | -
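
As a worked instance of the continuous time finite state (CTFS) entry, here is a minimal sketch of the Jukes-Cantor model for nucleotides, in which every specific substitution happens at the same rate. The parameter name `alpha` (rate of each specific change, so a site leaves its state at total rate 3 * alpha) and the function names are assumptions for illustration.

```python
import math

def jc69_transition(alpha, t):
    """Return (p_same, p_diff) after time t under Jukes-Cantor.

    p_same: probability a site is in its starting nucleotide.
    p_diff: probability it is in one specific other nucleotide.
    """
    decay = math.exp(-4.0 * alpha * t)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff

def jc69_distance(p):
    """Expected substitutions per site from the observed proportion p
    of differing sites between two aligned sequences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    same, diff = jc69_transition(alpha=0.1, t=2.0)
    print(same, diff, jc69_distance(3.0 * diff))
```

The distance function inverts the transition formula: feeding it the expected proportion of differing sites, 3 * p_diff, recovers the expected number of substitutions per site, 3 * alpha * t.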

13 “Natural” Co-Modeling. Joint evolutionary modeling of X(t), Y(t): the ideal, rarely if ever done. Conditional evolutionary modeling of X(t) given Y(t): the standard in comparative genomics; the distribution of Y(t) is not derived from evolution, but from practicality. Examples: Protein Gene Prediction; RNA structure prediction; Regulatory signal prediction. Y(t) a deterministic function of X(t): Movement of proteins; Protein Structures.

14 Examples: RNA structure prediction; Comparative Genomics; Networks; Patterns; Protein Structures.

15 Structure Dependent Molecular Evolution: RNA Secondary Structure. From Durbin et al. (1998), Biological Sequence Analysis. Secondary structure: a set of paired positions. A-U and C-G can base pair. Some other pairings can occur, and triple interactions exist. Pseudoknot: non-nested pairing, i.e. positions i < j < k < l with pairs i-k and j-l.
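
The pseudoknot condition above can be checked directly on a set of paired positions. This is a minimal sketch, assuming a secondary structure is given as a list of index pairs; the function name is illustrative.

```python
from itertools import combinations

def has_pseudoknot(pairs):
    """Return True if any two base pairs cross, i.e. i < j < k < l
    with i paired to k and j paired to l."""
    # Store each pair with its smaller index first.
    ordered = [tuple(sorted(pair)) for pair in pairs]
    for (i, k), (j, l) in combinations(ordered, 2):
        if i < j < k < l or j < i < l < k:
            return True
    return False

if __name__ == "__main__":
    print(has_pseudoknot([(1, 10), (2, 9)]))  # nested pairs
    print(has_pseudoknot([(1, 5), (3, 8)]))   # crossing pairs
```

Nested or side-by-side pairs pass the check, which is the class of structures a context free grammar can generate.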

16 Simple String Generators. Variables (capital), Letters (small). Context Free Grammar: S --> aSa | bSb | aa | bb. One sentence (even length palindromes): S --> aSa --> abSba --> abaaba. Regular Grammar: start with S. S --> aT | bS; T --> aS | bT | ε. One sentence (odd # of a’s): S --> aT --> aaS --> aabS --> aabaT --> aaba. Regular grammars are a subset of context free grammars.
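
The two derivations on this slide can be replayed mechanically. The sketch below stores each grammar as a dictionary of allowed right-hand sides and rewrites the leftmost variable at every step; it assumes the empty production T --> ε, which the step aabaT --> aaba implies, and represents ε by the empty string. The names `CFG`, `REGULAR` and `derive` are illustrative.

```python
# Context free grammar for even length palindromes over {a, b}.
CFG = {"S": ["aSa", "bSb", "aa", "bb"]}
# Regular grammar for strings with an odd number of a's ("" stands for ε).
REGULAR = {"S": ["aT", "bS"], "T": ["aS", "bT", ""]}

def derive(grammar, choices, start="S"):
    """Apply the chosen right-hand sides to the leftmost variable in turn
    and return every intermediate string of the derivation."""
    current = start
    forms = [current]
    for rhs in choices:
        # Variables are the capital letters; find the leftmost one.
        index = next(i for i, ch in enumerate(current) if ch.isupper())
        variable = current[index]
        if rhs not in grammar[variable]:
            raise ValueError(f"{variable} --> {rhs!r} is not a rule")
        current = current[:index] + rhs + current[index + 1:]
        forms.append(current)
    return forms

if __name__ == "__main__":
    print(" --> ".join(derive(CFG, ["aSa", "bSb", "aa"])))
    print(" --> ".join(derive(REGULAR, ["aT", "aS", "bS", "aT", ""])))
```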

17 Stochastic Grammars. The grammars above classify each string as belonging to the language or not. Every variable has a finite set of substitution rules. Assigning probabilities to the use of each rule assigns probabilities to the strings in the language. If a string has a unique (1-1) derivation, its probability is the product of the probabilities of the applied rules. i. Regular: start with S. S --> (0.3)aT | (0.7)bS; T --> (0.2)aS | (0.4)bT | (0.2)ε. The derivation S -> aT -> aaS -> aabS -> aabaT -> aaba has probability 0.3 * 0.2 * 0.7 * 0.3 * 0.2. ii. Context free: S --> (0.3)aSa | (0.5)bSb | (0.1)aa | (0.1)bb. The derivation S -> aSa -> abSba -> abaaba has probability 0.3 * 0.5 * 0.1.
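
The product rule can be checked numerically. This sketch uses the rule probabilities exactly as transcribed on the slide (the three listed T rules sum to 0.8, which is kept as is) and again assumes the empty production T --> ε, written as the empty string. The dictionary and function names are illustrative.

```python
import math

# Rule probabilities as transcribed from the slide; "" stands for ε.
REGULAR = {
    "S": {"aT": 0.3, "bS": 0.7},
    "T": {"aS": 0.2, "bT": 0.4, "": 0.2},
}
CFG = {"S": {"aSa": 0.3, "bSb": 0.5, "aa": 0.1, "bb": 0.1}}

def derivation_probability(grammar, steps):
    """Multiply the probabilities of the applied rules.

    steps is a list of (variable, right_hand_side) pairs in derivation order.
    """
    return math.prod(grammar[variable][rhs] for variable, rhs in steps)

if __name__ == "__main__":
    # S -> aT -> aaS -> aabS -> aabaT -> aaba
    regular_steps = [("S", "aT"), ("T", "aS"), ("S", "bS"), ("S", "aT"), ("T", "")]
    # S -> aSa -> abSba -> abaaba
    cfg_steps = [("S", "aSa"), ("S", "bSb"), ("S", "aa")]
    print(derivation_probability(REGULAR, regular_steps))
    print(derivation_probability(CFG, cfg_steps))
```

Both strings have a single derivation in their grammar, so these products (0.00252 and 0.015 up to floating point rounding) are also the string probabilities.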

18 Secondary Structure Generators. S --> LS (0.869) | L (0.131); F --> dFd (0.788) | LS (0.212); L --> s (0.895) | dFd (0.105).

19 RNA Structure Application. From Knudsen & Hein (1999); Knudsen & Hein, 2003.

20 Co-Modelling and Conditional Modelling. Observable / Unobservable (figure sequence: U C G A C A U A C). References: Goldman, Thorne & Jones, 96; Knudsen.., 99; Eddy & co.; Meyer and Durbin 02; Pedersen …, 03; Siepel & Haussler 03; Pedersen, Meyer, Forsberg…, Simmonds 2004a,b; McCauley ….; Firth & Brown. Conditional Modelling Needs: Footprinting - Signals (Blanchette). Example: AGGTATATAATGCG..... with P_coding{ATG-->GTG}, or AGCCATTTAGTGCG..... with P_non-coding{ATG-->GTG}.

21 Network Evolution. Statistics of Networks; Comparing Networks. Networks in Cellular Biology: A. Metabolic Pathways; B. Regulatory Networks; C. Signaling Pathways; D. Protein Interaction Networks (PIN). Empirical Facts; Dynamics on Networks (models); Models of Network Evolution.

22 A Model for Network Inference. A core metabolism: a given set of metabolites; a given set of possible reactions (arrows not shown); a set of present reactions M (black and red arrows). Parameters: a rate of deletion and a rate of insertion. Restriction R: a metabolism must define a connected graph. M + R defines 1. a set of deletable (dashed) edges D(M), and 2. a set of addable edges A(M).
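
A minimal sketch of how D(M) and A(M) could be computed under the connectedness restriction. It makes two simplifying assumptions that are not in the slides: each reaction is treated as an undirected edge between two metabolites, and "connected" means that all of the given metabolites are linked by the present reactions. All names are illustrative.

```python
def is_connected(nodes, edges):
    """True if the undirected graph (nodes, edges) has one component."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Depth-first search from an arbitrary node.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return seen == nodes

def deletable(nodes, present):
    """D(M): present reactions whose removal keeps the graph connected."""
    return {e for e in present if is_connected(nodes, present - {e})}

def addable(nodes, present, possible):
    """A(M): absent possible reactions whose addition keeps it connected."""
    return {e for e in possible - present if is_connected(nodes, present | {e})}

if __name__ == "__main__":
    metabolites = {"A", "B", "C", "D"}
    possible = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "D")}
    present = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")}
    print(sorted(deletable(metabolites, present)))
    print(sorted(addable(metabolites, present, possible)))
```

In the example the reaction C-D is the only link to metabolite D, so it is not deletable, while any edge of the A-B-C triangle is.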

23 Likelihood of Homologous Pathways (Eleni Giannoulatou). Number of Metabolisms: 1, 2, 3, 4 (+ 2 symmetrical versions). P(M1, M2) = P(M1) P(M1 -> M2), where M1 and M2 stand for the two pathway diagrams in the figure. Approaches: Continuous Time Markov Chains with computational tricks; MCMC; Importance Sampling.

24 PIN Network Evolution. Barabasi & Oltvai, 2004; Berg et al., 2004; Wiuf et al., 2006. A gene duplicates; the copy inherits its connections; the connections can change. Berg et al., 2004: gene duplication is slow (~10^-9 /year); connection evolution is fast (~10^-6 /year). Observed networks can be modeled as if the node number were fixed.

25 Likelihood of PINs (Wiuf et al., 2006). Can only handle 1 graph. Limited Evolution Model: de-DAing; De-connecting. Data: 2386 nodes and 7221 links. Irreducible (and isomorphic): 735 nodes.

26 The Phylogenetic Turing Patterns I

27 The Phylogenetic Turing Patterns II. Stripes: p small. Spots: p large. Reaction-Diffusion Equations (figure). Analysis Tasks: 1. Choose Class of Mechanisms; 2. Observe Empirical Patterns; 3. Choose Closest set of Turing Patterns T1, T2, .., Tk; 4. Choose parameters p1, p2, .., pk (sets?) behind T1, .., Tk. Evolutionary Modelling Tasks: 1. p(t1) - p(t2) ~ N(0, (t1 - t2) × variance rate); 2. Non-overlapping intervals have independent increments, i.e. Brownian Motion. Scientific Motivation: 1. Is there evolutionary information on pattern mechanisms? 2. How do patterns evolve?
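
The two evolutionary modelling tasks above define Brownian Motion for the pattern parameter p. A minimal sketch, assuming a scalar parameter and a hypothetical variance rate `var_rate` (the slide's own symbol is not recoverable): one function scores an observed change in p over a time interval, the other simulates p along a lineage using independent Gaussian increments.

```python
import math
import random

def bm_increment_logpdf(delta, dt, var_rate):
    """Log density of a change `delta` in p over a time span `dt`,
    where the change is Normal(0, var_rate * dt)."""
    variance = var_rate * dt
    return -0.5 * (math.log(2.0 * math.pi * variance) + delta ** 2 / variance)

def simulate_bm(p0, times, var_rate, seed=0):
    """Simulate p at the increasing time points `times`, starting from p0
    at times[0]; increments over disjoint intervals are drawn independently."""
    rng = random.Random(seed)
    path = [p0]
    for t_prev, t_next in zip(times, times[1:]):
        sd = math.sqrt(var_rate * (t_next - t_prev))
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path

if __name__ == "__main__":
    path = simulate_bm(p0=0.5, times=[0.0, 1.0, 2.0, 4.0], var_rate=0.01)
    print(path)
    print(bm_increment_logpdf(path[-1] - path[0], 4.0, 0.01))
```

Summing such log densities over the branches of a phylogeny would give a likelihood for parameters inferred from the observed patterns of related species.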

28 Protein Structure. Known / Unknown. -globin and Myoglobin: 300 amino acid changes, 800 nucleotide changes, 1 structural change, 1.4 Gyr. ? ? ? ? 1. Given a structure, what are the possible events that could happen? 2. What are their probabilities? Old-fashioned substitution + indel process with bias. Bias: Folding (Sequence → Structure) & Fitness of Structure. 3. Summation over all paths.

29 Summary: The Virtues of Comparative Modeling. It is the natural setup for much modeling and for transfer of knowledge from one species/system to another. Even 1 system/species is an evolutionary observation: x, with P(x) and P(Further history of x). Example x: U C G A C A U A C.

