Evolution & Design Principles in Biology: a consequence of evolution and natural selection. Rui Alves, University of Lleida. Course website: http://web.udl.es/usuaris/pg193845/Bioinformatics_2009/

Part I: Molecular Evolution

Theory of Evolution. Evolution is the theory that allows us to understand how organisms came to be the way they are. In probabilistic terms, it is likely that all living beings today originated from a single type of cell. These cells divided and occupied ecological niches, where they adapted to the new environments through natural selection.

How did the first cell create different cells? Neutral Mutation (e.g. by error in genome replication)

How did the first cell create different cells? Deleterious Mutation (e.g. by error in genome replication)

How did the first cell create different cells? Advantageous Mutation (e.g. by error in genome replication)

And then there was sex…

Why sex?
Asexual reproduction is quicker and easier, so it yields more offspring per individual.
Sex may limit harmful mutations:
- Asexual: all offspring get all mutations.
- Sexual: mutations are distributed randomly among offspring. Those with the most harmful ones tend not to reproduce.
Sex may generate beneficial gene combinations:
- Adaptation to a changing environment
- Adaptation to all aspects of a constant environment
- Can separate beneficial mutations from harmful ones
- Samples a larger space of gene combinations

What drives cells to adapt? A new niche, or new conditions in the old niche.

What drives cells to adapt? A new (better adapted) mutation.

How do new genes and proteins appear? Genes (proteins) are built by combining domains. New proteins may appear either by intradomain mutation or by combining existing domains of other proteins. [Figure labels: Cell Division, Cell Division, …]

The Coalescent. This model of cellular evolution has implications for molecular evolution. Coalescent theory: a retrospective model of population genetics that traces all alleles of a gene in a sample from a population to a single ancestral copy shared by all members of the population, known as the most recent common ancestor.

Why is the coalescent the de facto standard today? What are the alternatives?
- Current sequences have evolved from the same original sequence (coalescent).
- Current sequences have converged to a similar sequence from multiple origins of life.

Back-of-the-envelope support for divergence. Which is more likely, convergence or divergence? [Figure: two sequences written in the 20-letter amino acid alphabet (ACDEFGHIKLMNPQRSTVWY), with individual residues labeled AAi and AAk; panels: Convergence, Divergence.]
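
One way to put rough numbers on this question, as a sketch under stated assumptions rather than the calculation behind the slide: suppose two unrelated sequences are drawn independently, with every position equally likely to be any of the 20 amino acids. The function name and the uniform, independent model are assumptions made for illustration only.

```python
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20-letter alphabet shown on the slide


def prob_at_least_k_matches(length: int, k: int,
                            alphabet_size: int = len(AMINO_ACIDS)) -> float:
    """Probability that two independent, uniformly random sequences of the
    given length agree at k or more positions (binomial tail).

    Assumption: positions are independent and all residues are equally likely.
    """
    p = 1.0 / alphabet_size  # chance that a single position matches by accident
    return sum(
        comb(length, m) * p**m * (1.0 - p) ** (length - m)
        for m in range(k, length + 1)
    )


if __name__ == "__main__":
    # Chance that two unrelated 20-residue sequences agree at 15+ positions.
    print(prob_at_least_k_matches(20, 15))
```

Under this simple model, high similarity arising independently is extremely improbable, which is the intuition for preferring divergence from a common ancestor. Real sequences violate the uniformity and independence assumptions, so the number is only indicative.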

About the mutational process
Point mutations:
- Transitions (A↔G, C↔T) are more frequent than transversions (all other substitutions).
- In mammals, the CpG dinucleotide is frequently mutated to TG or CA (possibly related to the fact that most CpG dinucleotides are methylated at the C residues).
Microsatellites frequently increase or decrease in size (possibly due to polymerase slippage during replication).
Gene and genome duplications (complete or partial) may lead to:
- pseudogenes: functionless copies of genes that rapidly accumulate (mostly deleterious) mutations; useful for estimating mutation rates
- new genes after functional diversification
Chromosomal rearrangements (inversions and translocations) may lead to meiotic incompatibilities and speciation.
Estimated mutation rates:
- Human nuclear DNA: 3-5×10^-9 per year
- Human mitochondrial DNA: 3-5×10^-8 per year
- RNA and retroviruses: ~10^-2 per year
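
The transition/transversion distinction above can be made concrete with a short sketch. The function names and the example sequences are hypothetical; the only rule used is the one stated on the slide (A↔G and C↔T are transitions, every other substitution is a transversion).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide difference as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "identical"
    pair = {ref, alt}
    # A<->G (purine to purine) and C<->T (pyrimidine to pyrimidine) are transitions.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned DNA sequences.

    Positions with gaps or ambiguous characters are skipped (an assumption).
    """
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        kind = classify_substitution(a, b)
        if kind == "transition":
            transitions += 1
        elif kind == "transversion":
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    print(count_ts_tv("ACGTACGT", "GCGTTCAT"))
```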

Consequences of the coalescent model?

So what if we accept the coalescent model?

Aligned sequences:
A1   TSRISEIRR
A2   TSRISEIRR
A3   TSRISEIRR
A4   TSRISEIRR
A5   TSRISEIRR
A6   TSRISEIRR
A7   PSRISEIRR
A8   PKRISEVRR
A9   PKRISEVRR
A10  PQRISAIQR
A11  PQRISAIQR
A12  PQRISTIQR
A13  PQRISTIQR
A14  ASHLHNLQR
A15  TKHLQELQRE
A16  TKHLQELQRE
A17  TKHLQELQRE
A18  SKHLHELQRD
A19  PKNLHELQKD
A20  SKRLHEVQSE

Identical sequences grouped:
A1-6    TSRISEIRR
A7      PSRISEIRR
A8-9    PKRISEVRR
A10-11  PQRISAIQR
A12-13  PQRISTIQR
A14     ASHLHNLQR
A15-17  TKHLQELQR
A18     SKHLHELQR
A19     PKNLHELQK
A20     SKRLHEVQS
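
The grouping of identical sequences shown on this slide can be sketched in a few lines. This is an illustration, not the author's procedure: the function name is hypothetical, and trimming to the first nine columns is an assumption that mirrors how the grouped listing on the slide drops the extra residue of A15 to A20.

```python
# Sequences as listed on the slide; A15-A20 carry one extra residue there.
ALIGNMENT = {
    **{f"A{i}": "TSRISEIRR" for i in range(1, 7)},
    "A7": "PSRISEIRR",
    "A8": "PKRISEVRR", "A9": "PKRISEVRR",
    "A10": "PQRISAIQR", "A11": "PQRISAIQR",
    "A12": "PQRISTIQR", "A13": "PQRISTIQR",
    "A14": "ASHLHNLQR",
    "A15": "TKHLQELQRE", "A16": "TKHLQELQRE", "A17": "TKHLQELQRE",
    "A18": "SKHLHELQRD", "A19": "PKNLHELQKD", "A20": "SKRLHEVQSE",
}


def collapse_identical(alignment: dict[str, str], n_columns: int = 9) -> dict[str, list[str]]:
    """Group sequence names by their first n_columns residues.

    Returns a mapping from the shared sequence to the names that carry it,
    in order of first appearance.
    """
    groups: dict[str, list[str]] = {}
    for name, seq in alignment.items():
        groups.setdefault(seq[:n_columns], []).append(name)
    return groups


if __name__ == "__main__":
    for seq, names in collapse_identical(ALIGNMENT).items():
        print(f"{','.join(names):30s} {seq}")
```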

So what if we accept the coalescent model?
A1-6    TSRISEIRR
A7      PSRISEIRR
A8-9    PKRISEVRR
A10-11  PQRISAIQR
A12-13  PQRISTIQR
A14     ASHLHNLQR
A15-17  TKHLQELQR
A18     SKHLHELQR
A19     PKNLHELQK
A20     SKRLHEVQS
[Tree labels: A1-6 and A7 join at ancestor A'1-7; A10-11 and A12-13 join at ancestor A'10-13.]

So what if we accept the coalescent model?
A'1-7    (p-t)SRISEIRR
A8-9     PKRISEVRR
A'10-13  PQRIS(a-t)IQR
A14      ASHLHNLQR
A15-17   TKHLQELQR
A18      SKHLHELQR
A19      PKNLHELQK
A20      SKRLHEVQS
The study of sequence alignments can give information about the evolution of the different organisms!

Phylogenetic tree reconstruction, overview
Computational challenge: there is an enormous number of different topologies, even for a relatively small number of sequences:
- 3 sequences: 1
- 4 sequences: 3
- 5 sequences: 15
- 10 sequences: 2,027,025
- 20 sequences: 221,643,095,476,699,771,875
Consequence: most tree construction algorithms are heuristic methods that are not guaranteed to find the optimal topology.
Input data for the two major classes of algorithms:
1. Distance matrix, e.g. UPGMA, neighbor-joining
2. Multiple alignment, e.g. parsimony, maximum likelihood
Distance matrix methods use distances computed from pairwise or multiple alignments as input.
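
The topology counts quoted above match the number of unrooted, fully bifurcating trees on n labeled leaves, which is the double factorial (2n-5)!! = 1 × 3 × 5 × … × (2n-5). The sketch below assumes that this is the quantity the slide tabulates; the function name is hypothetical.

```python
def count_unrooted_topologies(n_sequences: int) -> int:
    """Number of unrooted bifurcating tree topologies for n labeled sequences.

    Computed as (2n - 5)!!, the product of the odd numbers up to 2n - 5.
    """
    if n_sequences < 3:
        raise ValueError("need at least 3 sequences for an unrooted tree")
    count = 1
    for odd in range(3, 2 * n_sequences - 4, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 20):
        print(n, f"{count_unrooted_topologies(n):,}")
```

The growth is faster than exponential, which is why exhaustive search over topologies stops being practical after a dozen or so sequences.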

Building phylogenetic trees of proteins [Figure: Genome 1, Genome 2, Genome 3, … each contain Protein A, Protein B, Protein C, Protein D, …]

Distance-based phylogenetic trees
A1  ACTDEEGGGGSRGHI…
A2  A-TEEDGGAASRGHI…
A3  ACFDDEGGGGSRGHL…
…

Pairwise distances:
      A2               A3
A1    5 substitutions  3 substitutions
A2                     8 substitutions

[Figure: tree joining A1, A2 and A3, with branch labels 3 and 5.]
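
The pairwise distances on this slide can be reproduced by counting mismatching columns between the aligned fragments. This is a minimal sketch: the function names are hypothetical, only the 15 columns shown on the slide are used, and counting a gap against a residue as one difference is an assumption (it is what reproduces the slide's value of 5 for A1 versus A2).

```python
from itertools import combinations

# The 15 aligned columns shown on the slide (the trailing "…" is not included).
FRAGMENTS = {
    "A1": "ACTDEEGGGGSRGHI",
    "A2": "A-TEEDGGAASRGHI",
    "A3": "ACFDDEGGGGSRGHL",
}


def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Number of aligned columns at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """Pairwise distances for every unordered pair of sequences."""
    return {
        (name_a, name_b): hamming_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


if __name__ == "__main__":
    for pair, dist in distance_matrix(FRAGMENTS).items():
        print(pair, dist)
```

A distance method such as UPGMA or neighbor-joining would take this matrix as input; raw mismatch counts are the simplest possible distance and ignore multiple substitutions at the same site.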

Maximum likelihood phylogenetic trees
Alignment:
ACTDEEGGGGSRGHI…
A-TEEDGGAASRGHI…
ACFDDEGGGGSRGHL…
…
[Table: probability of amino acid substitution, with rows and columns A, -, E, D, …]

Maximum likelihood phylogenetic trees
Alignment:
A1  ACTDEEGGGGSRGHI…
A2  A-TEEDGGAASRGHI…
A3  ACFDDEGGGGSRGHL…
…
Pairwise comparisons: A1-A2: 5 substitutions, p(1,2); A1-A3: 3 substitutions, p(1,3); A2-A3: 8 substitutions, p(2,3)
p(2,3) > p(1,2) > p(1,3)
[Figure: candidate trees over A1, A2 and A3.]

Statistical evaluation of trees: bootstrapping
Motivation: some branching patterns in a tree may be uncertain for statistical reasons (short sequences, small number of mutational events).
Goal of bootstrapping: to assess the statistical robustness of each edge of the tree. Note that each edge divides the leaf nodes into two subsets. For instance, edge 7-8 divides the leaves into subsets {1,2,3} and {4,5}. However, is this short edge statistically robust?
Method: generate trees from resampled versions of the input data as follows:
- Randomly modify the input MSA by eliminating some columns and replacing them with existing ones. This results in duplication of columns.
- Compute a tree for each modified input MSA.
- For each edge of the tree derived from the real MSA, determine the fraction of trees derived from modified MSAs that contain an edge dividing the leaves into the same subsets. This fraction is called the bootstrap value.
Edges with low bootstrap values (e.g. <0.9) are considered unreliable.
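
The resampling step can be sketched as follows. This is a hedged illustration, not a full bootstrap implementation: the function names are hypothetical, columns are drawn with replacement using Python's standard `random` module, and the tree-building step is left to a caller-supplied predicate that reports whether a given grouping is recovered from a replicate.

```python
import random
from typing import Callable, Sequence


def bootstrap_alignment(alignment: Sequence[str], rng: random.Random) -> list[str]:
    """Resample alignment columns with replacement, keeping the original width."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[i] for i in picks) for row in alignment]


def bootstrap_value(alignment: Sequence[str],
                    grouping_recovered: Callable[[Sequence[str]], bool],
                    n_replicates: int = 100,
                    seed: int = 0) -> float:
    """Fraction of replicates in which the grouping of interest is recovered.

    grouping_recovered stands in for "build a tree from this replicate and
    check whether it contains the edge"; it is supplied by the caller.
    """
    rng = random.Random(seed)
    hits = sum(
        grouping_recovered(bootstrap_alignment(alignment, rng))
        for _ in range(n_replicates)
    )
    return hits / n_replicates


def a1_closest_to_a3(rows: Sequence[str]) -> bool:
    """Toy stand-in for tree building: is row 0 closer to row 2 than to row 1?"""
    def mismatches(x: str, y: str) -> int:
        return sum(a != b for a, b in zip(x, y))
    return mismatches(rows[0], rows[2]) < mismatches(rows[0], rows[1])


if __name__ == "__main__":
    msa = ["ACTDEEGGGGSRGHI", "A-TEEDGGAASRGHI", "ACFDDEGGGGSRGHL"]
    print(bootstrap_value(msa, a1_closest_to_a3, n_replicates=200))
```

In practice the predicate would run a real tree-construction method on each replicate and test for the presence of a specific edge; the toy predicate here only keeps the example self-contained.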

Statistical evaluation of trees: bootstrapping

Other trees: use genomes; use enzymomes; use whatever group of molecules is important for a given function.

Part II: Design principles

Outline
- What are design principles?
- How to study design principles
- Examples

What are design principles? Recurrent qualitative or quantitative rules that are observed in similar types of systems as a solution to a given functional problem. They exist at different levels, e.g. nuclear targeting sequences and operons. [Figure: an operon containing Gene 1, Gene 2 and Gene 3.]

How can design principles emerge in molecular biology? Intelligent design? Not a scientific hypothesis; off the table. Evolution? Makes sense, but how could such regularities emerge?

Climbing down Mount Improbable. Over time, edged stones would accumulate on the slope. Smooth, round stones accumulate at the bottom. Design principles: - Smooth, roundish rocks roll down the mountain. - Edged, flat rocks don't.

Design principles in molecular biology Similarly, if a topology or set of parameters has appeared through mutation and it can be shown to create a molecular network that functionally outperforms all other possible alternatives in a given set of conditions, one can talk about a design principle for the system under those conditions. [sensu engineering]

Index of talk
- How to identify design principles
- Design principles in:
  - Gene expression
  - Metabolic networks
  - Signal transduction
  - Development
- Design principles, what are they good for?
- Summary

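First step: define the alternatives. [Diagrams: a gene with a positive regulator (+) versus a gene with a negative regulator (-); a pathway X0 → X1 → X2 → X3.]

First step: define the alternatives. How strong should the feedback be? [Diagram: pathway X0 → X1 → X2 → X3; plot of X3 against time t.]

Then, create models for each alternative. [Diagrams: a gene with a positive regulator (+); a gene with a negative regulator (-).]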

Finally: Compare the dynamic behavior of the models for the two or more alternatives with respect to physiologically relevant criteria.

Then, create models for each alternative. [Diagram: pathway X0 → X1 → X2 → X3.]

The demand theory for gene expression. Are there situations where positive regulation of gene expression outperforms negative regulation, and vice versa? [Diagrams: a gene with a positive regulator (+); a gene with a negative regulator (-).]

Regulating gene expression has principles
Positive regulator:
- More effective when the gene product is in demand for a large fraction of the life cycle.
- Less noise sensitive if the signal is low.
Negative regulator:
- More effective when the gene product is in demand for a small fraction of the life cycle.
- Less noise sensitive if the signal is high.
[Diagrams: a gene with a positive regulator (+); a gene with a negative regulator (-).]
Genetics 149:1665; PNAS 103:3999; PNAS 104:7151; Nature 405:590

Negative overall feedback is a design principle in metabolic biosynthesis
[Diagram: pathway X0 → X1 → X2 → X3.]
Negative overall feedback:
- More effective in coupling production to demand.
- More robust to fluctuations.
Bioinformatics 16:786; Biophysical J. 79:2290
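
The claim about coupling production to demand can be illustrated with a toy steady-state calculation. Everything in this sketch is an assumption made for illustration and is not taken from the cited papers: the first step of the pathway is produced at a rate inhibited by the end product X3 through a Hill function, X3 is consumed at rate demand × X3, and the intermediates are left out because, for irreversible first-order steps, they do not change the steady state of X3. Parameter values are arbitrary.

```python
def production(x3: float, feedback: bool, vmax: float = 1.0,
               k_inhib: float = 1.0, hill: int = 4) -> float:
    """Rate of the first pathway step as a function of the end product X3."""
    if not feedback:
        return vmax  # no regulation: constant supply
    return vmax / (1.0 + (x3 / k_inhib) ** hill)  # end-product inhibition


def steady_state_x3(demand: float, feedback: bool, vmax: float = 1.0) -> float:
    """Solve production(X3) = demand * X3 for X3 by bisection."""
    lo, hi = 0.0, vmax / demand + 1.0  # supply exceeds use at lo, not at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if production(mid, feedback, vmax) > demand * mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def steady_state_flux(demand: float, feedback: bool) -> float:
    """Flux through the pathway at steady state (equals consumption of X3)."""
    return demand * steady_state_x3(demand, feedback)


if __name__ == "__main__":
    for feedback in (False, True):
        for demand in (0.5, 1.0):
            x3 = steady_state_x3(demand, feedback)
            flux = steady_state_flux(demand, feedback)
            print(f"feedback={feedback} demand={demand} X3={x3:.3f} flux={flux:.3f}")
```

In this toy model the unregulated pathway produces at the same rate whatever the demand, so X3 halves when demand doubles. With feedback the flux rises with demand and X3 changes less, which is one reading of "coupling production to demand".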

Bifunctional sensors can be a design principle in signal transduction
Bifunctional sensor:
- Performs best against cross talk
Independent deactivator:
- Better integrator of signals
[Diagram labels: Signal, Sensor, Effector, Deactivator, Effect.]
Mol. Microbiol. 48:25; Mol. Microbiol. 68:1196

Design principles in development
[Figure: genes with a positive (+) or negative (-) regulator, driven by a positive (+) or negative (-) signal, across four regimes: high demand, low signal; high demand, high signal; low demand, high signal; low demand, low signal.]
Genetics 149:1665; PNAS 103:3999; PNAS 104:7151; Nature 405:590

Biological design principles are good for understanding why biology works as it does. Biological design principles may connect molecular determinants to functional effectiveness. [Plot labels: Heat shock; Expression of important genes vs. time; Growth rate vs. time.] BMC Bioinformatics 7:184

Underlying assumption: molecular networks can be treated as modules when studying their evolution. Work in the group of Uri Alon suggests that:
- Networks evolving to meet simultaneous goals evolve in a modular fashion.
- Networks evolving to meet a single goal evolve globally.
Modularity seems like a reasonable first assumption.
PNAS 102:13773; PLOS Comp Biol 4:e ; BMC Evol Biol 7:169

The good news about function: sometimes, you get stuff for free! For example:
- Networks that are responsive to signals may, just because they are responsive, have inbuilt buffering of noise.
- Functions that are associated with marginally stable proteins are favored because, due to the large dimensions of sequence space, most randomly selected sequences have a structure that is marginally stable.
PNAS 100:14463; PNAS 103:6435; Proteins 46:105

How can biological design principles be applied? Design of molecular circuits with specific behaviors! [Panels: stable systems; unstable systems; oscillations; bistable systems.] Cell 113:597; PLoS Comput Biol. 5:e ; PNAS 106:6435

Summary
Design principles can be found in molecular networks. Such principles can sometimes be connected to selection for functional effectiveness. Even in the absence of such a connection, if they are valid they can be used to build biological circuits with specific behaviors.