Lecture 3: Molecular Evolution and Phylogeny



Facts on the molecular basis of life: Every life form is genome-based. Genomes evolve. There are large numbers of apparently homologous intra-genomic (paralog) and inter-genomic (ortholog) genes. Some genes, especially those related to the functions of transcription and translation, are common to ALL life forms. The closer two organisms seem to be phylogenetically, the more similar their genomes and corresponding genes are.

Central dogma of molecular biology: DNA → RNA → Protein

Basic assumptions of molecular evolution: More closely related organisms have more similar genomes. Highly similar genes are homologs (they share a common ancestor). A universal ancestor exists for all life forms. Molecular differences in homologous genes (or protein sequences) are positively correlated with evolutionary time. Phylogenetic relationships can be expressed by a dendrogram (a "tree").

The five steps in phylogenetics dancing (modified from Hillis et al. (1993), Methods in Enzymology 224). [Flowchart] Sequence data → Align sequences → Phylogenetic signal? (Patterns → evolutionary processes?) → Choose a method → Calculate or estimate the best-fit tree → Test phylogenetic reliability. Distance methods: distance calculation (which model?), then either a single tree (NJ) or an optimality criterion (LS, ME). Character-based methods: MP (weighting of sites, changes?), ML (model?), MB (model?).

Why protein phylogenies? For historical reasons: first sequences... Most genes encode proteins... To study protein structure, function and evolution. Comparing DNA-based and protein-based phylogenies can be useful: different genes (e.g. 18S rRNA versus EF-2 protein); protein-encoding gene (codons versus amino acids).

Proteins were the first molecular sequences to be used for phylogenetic inference: Fitch and Margoliash (1967) Construction of phylogenetic trees. Science 155,

Most of what follows is taken from: Statistical Physics and Biological Information, Institute of Theoretical Physics, University of California at Santa Barbara, 7 May 2001.

Understanding trees. [Figure: rooted tree drawn against a time axis; labels: Root, 30 Mya, 22 Mya, 7 Mya; two drawings marked "same as".]

Understanding trees #2

Understanding trees #3

Difference in homologous sequences is a measure of evolutionary time. [Figure: part of a multiple sequence alignment of mitochondrial small subunit rRNA.] Full length is ~ [...]; [...] primate species (order Primates) with mouse as outgroup. Convert the similarity matrix to a distance matrix: d = 1 - S
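The conversion d = 1 - S can be sketched in a few lines. This is a minimal illustration, assuming S is the fraction of identical bases over ungapped alignment columns; the function names and the toy sequences are made up for the example and are not the rRNA data from the slide.

```python
from itertools import combinations


def similarity(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned columns where both sequences carry the same base.

    Columns with a gap ('-') in either sequence are skipped (an assumption
    of this sketch; other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise d = 1 - S for every pair of sequences in the alignment."""
    return {
        (x, y): 1.0 - similarity(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy alignment with hypothetical sequences.
toy = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTAT",
    "seqC": "ACGAACGTTT",
}
d = distance_matrix(toy)
for pair, value in d.items():
    print(pair, round(value, 3))
```

The resulting dictionary is keyed by sorted name pairs, which keeps the matrix symmetric without storing both triangles.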

From the alignment, construct pairwise distances.* *Note: Alignment is not the only way to compute distance.

Models of sequence evolution

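Jukes-Cantor (minimal) model: all substitution rates $= \alpha$; all base frequencies $= 1/4$. For two sequences separated by a total time $2t$, the expected fraction of differing sites is $3 P_{ij}(2t)$, where $P_{ij}(t)$ is the probability of change to one specific other base.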

Derivation of the Jukes-Cantor formula
Let the probability of a site being a given base at time $t$ be $P(t)$. After an elapsed time $\delta t$, the loss by mutation to the other three bases is $3\alpha\,\delta t\,P(t)$, and the gain from the other bases is $\alpha\,\delta t\,(1 - P(t))$.
Hence $P(t + \delta t) = P(t) - 3\alpha\,\delta t\,P(t) + \alpha\,\delta t\,(1 - P(t))$, so $dP(t)/dt = \alpha - 4\alpha P(t)$.
Write $P(t) = a \exp(-bt) + c$; the solution is $b = 4\alpha$, $c = 1/4$, so $P(t) = a \exp(-4\alpha t) + 1/4$.
If $P(0) = 1$, then $a = 3/4$. If $P(0) = 0$, then $a = -1/4$.
Finally, $P_{\text{same}}(t) = 1/4 + (3/4)\exp(-4\alpha t)$ and $P_{\text{change}}(t) = 1/4 - (1/4)\exp(-4\alpha t)$.
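As a numerical check of this derivation, the sketch below codes the two results, P_same(t) = 1/4 + 3/4 exp(-4αt) and P_change(t) = 1/4 - 1/4 exp(-4αt), with α the rate of change to each specific other base. It also adds the standard Jukes-Cantor distance correction d = -(3/4) ln(1 - 4D/3), which is not written on the slide: it follows from taking D = 3 P_change(2t) as the observed fraction of differing sites and d = 6αt as the expected number of substitutions per site between the two sequences. Function names are illustrative.

```python
import math


def p_same(alpha: float, t: float) -> float:
    """Probability that a site holds the same base after time t."""
    return 0.25 + 0.75 * math.exp(-4.0 * alpha * t)


def p_change(alpha: float, t: float) -> float:
    """Probability that a site holds one specific other base after time t."""
    return 0.25 - 0.25 * math.exp(-4.0 * alpha * t)


def jc_distance(p_diff: float) -> float:
    """Jukes-Cantor corrected distance from the observed fraction of
    differing sites (expected substitutions per site)."""
    if not 0.0 <= p_diff < 0.75:
        raise ValueError("fraction of differing sites must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p_diff / 3.0)


# Round trip with arbitrary example values: two sequences that diverged
# t = 5 time units ago, with alpha = 0.01 per unit time.
alpha, t = 0.01, 5.0
observed = 3.0 * p_change(alpha, 2.0 * t)   # fraction of differing sites
corrected = jc_distance(observed)           # should recover 6 * alpha * t
print(round(observed, 4), round(corrected, 4))
```

The corrected distance is always at least as large as the raw fraction of differences, because it accounts for multiple substitutions at the same site.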

Hasegawa-Kishino-Yano model: has a more general substitution rate. Transition: A ↔ G or C ↔ T (purine ↔ purine or pyrimidine ↔ pyrimidine). Transversion: purine ↔ pyrimidine, e.g. A ↔ T or C ↔ G.
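The transition/transversion distinction can be made concrete with a small classifier. This is only a sketch with illustrative function names: it counts the two kinds of difference between two aligned DNA sequences, which is the raw information a model such as HKY uses to give the two classes different rates.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a: str, b: str) -> str:
    """Return 'identical', 'transition' or 'transversion' for two DNA bases."""
    a, b = a.upper(), b.upper()
    valid = PURINES | PYRIMIDINES
    if a not in valid or b not in valid:
        raise ValueError(f"not a DNA base pair: {a!r}, {b!r}")
    if a == b:
        return "identical"
    # Same chemical class (both purines or both pyrimidines) -> transition.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def count_ts_tv(seq_a: str, seq_b: str) -> tuple:
    """Count (transitions, transversions) over ungapped aligned columns."""
    ts = tv = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        kind = classify_substitution(a, b)
        if kind == "transition":
            ts += 1
        elif kind == "transversion":
            tv += 1
    return ts, tv


print(count_ts_tv("AACC", "GTTC"))
```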

Part of the Jukes-Cantor distance matrix for the primate example (values are much larger for the outgroup). The matrix will be used for clustering methods.

Clustering

UPGMA

Neighbor-Joining Method

N-J Method produces an Unrooted, Additive tree

Neighbor-Joining Method: An Example. 0. Distance matrix. What is required for the neighbor-joining method? A distance matrix.

1. First step. PAM distance 3.3 (Human - Monkey) is the minimum, so we join Human and Monkey into Mon-Hum and calculate the new distances. [Tree figure: Monkey and Human joined as Mon-Hum; Spinach, Mosquito and Rice not yet joined.]

2. Calculation of new distances. After we have joined two species in a subtree, we have to compute the distances from every other node to the new subtree. We do this with a simple average of distances: Dist[Spinach, MonHum] = (Dist[Spinach, Monkey] + Dist[Spinach, Human])/2 = ( ... )/2 = ... [Tree figure: Mon-Hum joining Monkey and Human; Spinach.]
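The procedure as the slides describe it (join the pair with the minimum distance, then replace the pair by a node whose distance to every other node is the simple average) can be sketched as follows. Note that this averaging rule is the one used by WPGMA-style clustering; the standard Saitou and Nei neighbor-joining algorithm selects pairs with a rate-corrected criterion rather than the raw minimum distance. Apart from the Human-Monkey value of 3.3, all distances below are hypothetical placeholders, since the slide's matrix is not preserved here.

```python
from itertools import combinations


def cluster_by_simple_average(dist, names):
    """Repeatedly join the closest pair and average its distances.

    dist maps (name_a, name_b) tuples to distances; names lists the leaves.
    Returns a nested-parenthesis label for the final cluster and the list
    of joins as (node_a, node_b, distance).
    """
    d = {frozenset(pair): value for pair, value in dist.items()}
    nodes = list(names)
    joins = []
    while len(nodes) > 1:
        # Pick the pair of current nodes with the smallest distance.
        a, b = min(combinations(nodes, 2), key=lambda p: d[frozenset(p)])
        joins.append((a, b, d[frozenset((a, b))]))
        new = f"({a},{b})"
        # Distance from every other node to the new subtree: simple average.
        for other in nodes:
            if other not in (a, b):
                d[frozenset((new, other))] = (
                    d[frozenset((a, other))] + d[frozenset((b, other))]
                ) / 2
        nodes = [n for n in nodes if n not in (a, b)] + [new]
    return nodes[0], joins


# Hypothetical distances, except Human-Monkey = 3.3 taken from the slide.
example = {
    ("Human", "Monkey"): 3.3,
    ("Human", "Mosquito"): 30, ("Monkey", "Mosquito"): 32,
    ("Human", "Spinach"): 80, ("Monkey", "Spinach"): 84,
    ("Human", "Rice"): 82, ("Monkey", "Rice"): 86,
    ("Mosquito", "Spinach"): 90, ("Mosquito", "Rice"): 92,
    ("Spinach", "Rice"): 40,
}
tree, joins = cluster_by_simple_average(
    example, ["Human", "Monkey", "Mosquito", "Spinach", "Rice"]
)
print(tree)
for step in joins:
    print(step)
```

With these placeholder values the joining order mirrors the slides: Human with Monkey, then Mosquito, then Spinach with Rice, then the last join.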

3. Next cycle. [Tree figure: Mosquito joined to Mon-Hum, giving Mos-(Mon-Hum); Spinach and Rice not yet joined.]

4. Penultimate cycle. [Tree figure: Spinach and Rice joined as Spin-Rice, alongside Mos-(Mon-Hum).]

5. Last joining. [Tree figure: Spin-Rice joined to Mos-(Mon-Hum), giving (Spin-Rice)-(Mos-(Mon-Hum)).]

The result: unrooted neighbor-joining tree. [Figure: tree with leaves Human, Monkey, Mosquito, Rice, Spinach.]

Bootstrapping

Why are trees not exact?

Pairwise distances usually not tree-like

Searching tree space

Maximum likelihood criterion

Parsimony criterion

Parsimony with molecular data

Parsimony criterion Paul Higgs:

Is the best tree much better than others? L : likelihood at nodes

Use maximum likelihood to rank alternate trees: the NJ tree is 2nd best. [Figure annotations: "yes", "same topology".]

Use parsimony to rank alternate trees: different topology; parsimony differentiates weakly.

Quartet puzzling

MCMC: Markov chain Monte Carlo

Topology probabilities according to MCMC

Clade probabilities compared across tree methods: the NJ method is very fast and close to being the best.

Lecture and book. Lecture by Paul Higgs: online.itp.ucsb.edu/online/infobio01/higgs/ (see online.itp.ucsb.edu/online/infobio01/ for many lectures). Book by Wen-Hsiong Li (李文雄), "Molecular Evolution" (Sinauer Associates, 1997).

Some web sites on molecular evolution: CMS Molecular Biology Resource, Phylogeny - Molecular Evolution; The Tree of Life Web Project, tolweb.org/tree/phylogeny.html; Web Resources in Molecular Evolution and Systematics, darwin.eeb.uconn.edu/molecular-evolution.html

Some web sites on ClustalW: On-line service, clustalw.genome.ad.jp/; Software, ftp-igbmc.u-strasbg.fr/pub/ClustalX/ and ftp-igbmc.u-strasbg.fr/pub/ClustalW/