Molecular basis of evolution.


Molecular basis of evolution. Goal: to reconstruct the evolutionary history of all organisms in the form of phylogenetic trees. Classical approach: phylogenetic trees were constructed from comparative morphology and physiology. Molecular phylogenetics: phylogenetic trees are constructed by comparing DNA/protein sequences between organisms.

Evolution of mankind. Analysis of mitochondrial DNA suggests that Homo sapiens evolved from one group of Homo erectus in Africa ("African Eve") 100,000-200,000 years ago. Figure labels (years ago): Africans 100,000; Asians 55,000-75,000; Europeans 40,000-50,000; American Indians I 25,000-35,000; American Indians II 7,000-9,000. "Adam" appeared 250,000 years ago, much earlier!

Mechanisms of evolution. Evolution is caused by mutations of genes. Mutations spread through the population via genetic drift and/or natural selection. If a mutant gene confers an advantage (for example, a new morphological character), this feature will be inherited by all descendant species.

Mutational changes of DNA sequences. Original sequence in every example: ACC TAT TTG CTG (Thr Tyr Leu Leu).
1. Substitution: ACC TCT TTG CTG (Thr Ser Leu Leu)
2. Deletion: ACC TAT TGC TG- (Thr Tyr Cys)
3. Insertion: ACC TAC TTT GCT G- (Thr Tyr Phe Ala)
4. Inversion: ACC TTT ATG CTG (Thr Phe Met Leu)
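
The effect of each mutation type on the protein can be checked by translating the sequences with the standard genetic code. The following is a minimal sketch, not part of the original slides; the helper name `translate` and the compact way of building the codon table are assumptions of this example.

```python
from itertools import product

# Standard genetic code; codons are enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon; spaces and gap symbols are dropped and an
    incomplete trailing codon is ignored (hypothetical helper)."""
    dna = dna.replace(" ", "").replace("-", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "ACC TAT TTG CTG"
mutants = {
    "substitution": "ACC TCT TTG CTG",
    "deletion": "ACC TAT TGC TG-",
    "insertion": "ACC TAC TTT GCT G-",
    "inversion": "ACC TTT ATG CTG",
}

print("original", translate(original))
for kind, seq in mutants.items():
    print(kind, translate(seq))
```

Deletions and insertions shift the reading frame, which is why every codon downstream of the change is affected, whereas the substitution alters a single amino acid.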

Gene duplication and recombination. New genes/proteins arise through gene duplication and recombination. [Figure: duplication of an ancestral globin gene gives two globin copies, which diverge into hemoglobin and myoglobin; recombination between gene 1 and gene 2 produces a new gene.]

Codon usage. Part of the genetic code:
Phe: UUU, UUC
Leu: UUA, UUG
Ser: UCU, UCC, UCA, UCG
Tyr: UAU, UAC
Cys: UGU, UGC
Frequencies of different codons for the same amino acid are different. Codon usage bias has two causes:
1. The translational machinery tends to use abundant tRNAs (and the codons corresponding to these tRNAs). Codon usage bias is the same for all highly expressed genes in the same organism.
2. Mutation pressure. The mutation rates GC → AT and AT → GC differ. GC-content is different in different organisms.

Synonymous and nonsynonymous nucleotide substitutions. Synonymous substitutions do not change the encoded amino acid; they occur mostly at third codon positions and occasionally at first positions. Substitutions at second codon positions are always nonsynonymous. dS, dN: number of synonymous (nonsynonymous) substitutions per synonymous (nonsynonymous) site. dS/dN < 1 (that is, dN > dS) indicates positive natural selection.
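
How synonymous changes are distributed over the three codon positions can be checked directly against the standard genetic code. The sketch below is an illustration added here, not the method of the slides; `synonymous_fraction` is a hypothetical helper that weights all sense codons and all single-base changes equally, which real dS/dN estimators do not.

```python
from itertools import product

# Standard genetic code; codons are enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_fraction(position):
    """Fraction of single-base changes at one codon position (0, 1 or 2)
    that leave the amino acid unchanged, taken over the 61 sense codons."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons as starting points
            continue
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total


for pos in range(3):
    print(f"codon position {pos + 1}: {synonymous_fraction(pos):.3f}")
```

Under these assumptions the second position yields no synonymous changes at all, and the third position yields far more than the first.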

Measures of evolutionary distance between amino acid sequences. Evolutionary distance is usually measured by the number of amino acid substitutions. P-distance: p = nd / n, where nd is the number of amino acid differences between two sequences and n is the number of aligned amino acids.
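
The p-distance can be computed in a few lines. This is a minimal sketch: the function name `p_distance`, the choice to skip columns containing a gap, and the two example sequences are assumptions made for illustration only.

```python
def p_distance(seq1, seq2):
    """Proportion of differing residues over the aligned, gap-free columns."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: a column with a gap in either sequence is not counted in n.
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    n_d = sum(a != b for a, b in pairs)
    return n_d / len(pairs)


# Hypothetical aligned fragments: 9 gap-free columns, 2 differences.
print(p_distance("ACDEFGHIKL", "ACDQFGHLK-"))
```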

Poisson correction for evolutionary distance. Takes into account multiple substitutions at the same site and is therefore proportional to divergence time. PC-distance: d = -ln(1 - p), the total number of substitutions per site for two sequences.
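
A short sketch of the Poisson correction, assuming the usual form d = -ln(1 - p). The function name is hypothetical; the p-values are the ones listed for hemoglobin alpha-chains later in this presentation.

```python
import math


def poisson_distance(p):
    """Poisson-corrected distance (substitutions per site) from a p-distance."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p)


for pair, p in [("Human/cow", 0.121), ("Human/kangaroo", 0.186), ("Human/carp", 0.486)]:
    print(f"{pair}: p = {p:.3f}, PC-distance = {poisson_distance(p):.3f}")
```

The correction is small for closely related sequences and grows quickly as p increases, because more sites have been hit more than once.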

Gamma-distance. Substitution rate varies from site to site according to a gamma distribution: dG = a[(1 - p)^(-1/a) - 1], where a is the gamma parameter describing the shape of the distribution (a = 0.2-3.5). When p < 0.2, there is no need to use gamma-distance.
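
A sketch of the gamma-distance, assuming the common form dG = a[(1 - p)^(-1/a) - 1]. The function name is hypothetical, and a = 2 is an assumption chosen here because it closely reproduces the tabulated hemoglobin values; the slides do not state which value was used.

```python
def gamma_distance(p, a):
    """Gamma-corrected distance for p-distance p and shape parameter a."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if a <= 0:
        raise ValueError("shape parameter a must be positive")
    return a * ((1.0 - p) ** (-1.0 / a) - 1.0)


# a = 2 is an assumed value, not taken from the slides.
for pair, p in [("Human/cow", 0.121), ("Human/kangaroo", 0.186), ("Human/carp", 0.486)]:
    print(f"{pair}: p = {p:.3f}, gamma-distance = {gamma_distance(p, 2.0):.3f}")
```

For small p the result is close to p itself, which is consistent with the remark that the gamma correction is unnecessary when p < 0.2.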

Estimation of evolutionary rates in hemoglobin alpha-chains.

                  P-distance   PC-distance   Gamma-distance
Human/cow         0.121        0.129         0.134
Human/kangaroo    0.186        0.205         0.216
Human/carp        0.486        0.665         0.789

To estimate the evolutionary rate from the divergence between human and cow (the time of divergence between these groups is ~90 million years): r = 0.129 / (2 × 90 × 10^6) = 0.717 × 10^-9 substitutions per site per year.
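
The rate calculation can be written as a one-line function. The factor of 2 reflects that both lineages have accumulated substitutions since they diverged; the function name is hypothetical.

```python
def evolutionary_rate(distance, divergence_time_years):
    """Substitutions per site per year, given a pairwise distance and the
    time since the two lineages diverged."""
    return distance / (2.0 * divergence_time_years)


# Human/cow PC-distance and ~90 million years of divergence, as in the text.
r = evolutionary_rate(0.129, 90e6)
print(f"{r:.3e} substitutions per site per year")
```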

Another method to estimate evolutionary distances: amino acid substitution matrices. Substitutions occur more often between amino acids with similar properties. Dayhoff (1978) derived the first matrices from multiple alignments of close homologs. The number of amino acid substitutions is measured in units of accepted point mutations (PAM): 1 PAM is one accepted amino acid substitution per 100 sites. Dayhoff-distance can be approximated by gamma-distance with a = 2.25.

Fixation of mutations. Not all mutations spread through the population. Fixation: a mutation becomes incorporated into the genome of the species. The majority of mutations are neutral (Kimura): they do not affect the fitness of the organism. The fixation rate depends on the population size (N), the fitness effect (s) and the mutation rate (μ).

Phylogenetic analysis. Phylogenetic trees are derived from multiple sequence alignments. Each column describes the evolution of one site. Each position/site in a protein or nucleic acid is assumed to evolve independently of the others. Insertions/deletions are usually ignored, and trees are constructed only from the aligned regions.

Evolutionary tree constructed from rRNA analysis.

The concept of evolutionary trees. Trees show relationships between organisms. Trees consist of nodes and branches; the topology is the branching pattern. The length of each branch represents the number of substitutions that occurred between two nodes. If the rate of evolution is constant, all tips are at the same distance from the root (molecular clock hypothesis). Trees can be bifurcating (binary) or multifurcating. Trees can be rooted or unrooted. The root is placed by including a taxon (an outgroup) that is known to have branched off earlier than the others.

Accuracies of phylogenetic trees. Two types of errors: topological error and branch length error. Bootstrap test: resample the alignment columns with replacement, recalculate the tree, and count how often a given topology (interior branch) recurs; this fraction is the "bootstrap confidence value". If it is >0.95, the topology/interior branch is considered reliable.
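
The resampling step of the bootstrap test can be sketched as follows. Everything here is illustrative: `bootstrap_alignment` and `bootstrap_support` are hypothetical helpers, the alignment is made up, and `closest_pair` is only a toy stand-in for a real tree-building method (it reports which two sequences differ least, not a full topology).

```python
import random
from itertools import combinations


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the same length."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def bootstrap_support(alignment, build_topology, n_replicates=100, seed=0):
    """Fraction of replicates that give the same result as the original data."""
    rng = random.Random(seed)
    reference = build_topology(alignment)
    hits = sum(
        build_topology(bootstrap_alignment(alignment, rng)) == reference
        for _ in range(n_replicates)
    )
    return hits / n_replicates


def closest_pair(alignment):
    """Toy stand-in for tree building: indices of the two most similar sequences."""
    def differences(pair):
        i, j = pair
        return sum(a != b for a, b in zip(alignment[i], alignment[j]))
    return min(combinations(range(len(alignment)), 2), key=differences)


# Made-up alignment used only to exercise the functions.
alignment = ["ACDEFGHIKL", "ACDEFGHIKV", "TCDQFGWIKV", "TCNQFGWLKV"]
print(bootstrap_support(alignment, closest_pair, n_replicates=200, seed=1))
```

In practice `build_topology` would be replaced by a real method such as neighbor-joining, and support is usually reported per interior branch rather than for the whole tree.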

Methods for phylogenetic tree construction (decision scheme):
1. Start with a set of related sequences and build a multiple sequence alignment.
2. Strong sequence similarity? Yes: maximum parsimony methods.
3. If not, recognizable sequence similarity? Yes: distance methods. No: maximum likelihood methods.
4. Analyze the reliability of the prediction.

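Calculating branch lengths from distances. [Figure: distance matrix for three sequences A, B and C with pairwise distances 20, 30 and 44, and the corresponding tree with branch lengths a, b and c.]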
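
For three sequences joined at a single interior node, the branch lengths follow from the three pairwise distances. The sketch below assumes additive distances (d_AB = a + b, d_AC = a + c, d_BC = b + c). The assignment of the values 20, 30 and 44 to the pairs A-B, A-C and B-C is an assumption, since the matrix layout did not survive in the transcript.

```python
def three_point_branch_lengths(d_ab, d_ac, d_bc):
    """Solve d_ab = a + b, d_ac = a + c, d_bc = b + c for (a, b, c)."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


# Assumed mapping: d_AB = 20, d_AC = 30, d_BC = 44.
a, b, c = three_point_branch_lengths(20, 30, 44)
print(a, b, c)  # 3.0 17.0 27.0
```

A different assignment of the three distances to the pairs only permutes which branch receives which length.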

1. Distance methods: Neighbor-joining method. NJ is based on the minimum evolution principle (the sum of branch lengths should be minimized). Given the distance matrix between all sequences, NJ joins the sequences into a tree step by step and estimates the branch lengths as it goes. Step 1: start with the star tree and calculate the sum of branch lengths. [Figure: star tree with sequences A, B, C, D, E attached to a central node by branches a, b, c, d, e.]

Neighbor-joining method (continued).
2. Combine two sequences into a pair and modify the tree. Recalculate the sum of branch lengths, S, for each possible pair and choose the pair with the lowest S. [Figure: tree in which A and B are joined at a new interior node, with C, D and E on the other side.]
3. Treat the cluster CDE as one sequence "X"; calculate the average distances between "A" and "X" and between "B" and "X", then calculate "a" and "b".
4. Treat AB as a single sequence and recalculate the distance matrix.
5. Repeat the cycle and calculate the next pair of branch lengths.
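
One cycle of the procedure can be sketched as below. Note the swap: instead of computing S for every pair, this sketch uses the widely used Q-criterion, which selects the same pair as minimizing S. The function name `nj_join_once` and the five-taxon example matrix are assumptions made for illustration.

```python
def nj_join_once(names, dist):
    """One neighbor-joining cycle on a symmetric distance matrix (list of lists).

    Returns the joined pair, their branch lengths to the new interior node,
    and the reduced list of names and distance matrix."""
    n = len(names)
    if n < 3:
        raise ValueError("need at least three taxa")
    totals = [sum(row) for row in dist]
    # Pick the pair with the smallest Q (equivalent to the smallest S).
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from the two joined taxa to the new node.
    len_i = dist[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    # Distances from the new node to every remaining taxon.
    keep = [k for k in range(n) if k not in (i, j)]
    new_row = [(dist[i][k] + dist[j][k] - dist[i][j]) / 2 for k in keep]
    new_names = [f"({names[i]},{names[j]})"] + [names[k] for k in keep]
    new_dist = [[0.0] + new_row]
    for idx, k in enumerate(keep):
        new_dist.append([new_row[idx]] + [dist[k][m] for m in keep])
    return (names[i], names[j]), (len_i, len_j), new_names, new_dist


# Illustrative five-taxon distance matrix (not from the slides).
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, lengths, new_names, new_dist = nj_join_once(names, dist)
print(pair, lengths)  # ('a', 'b') (2.0, 3.0)
print(new_names, new_dist[0])
```

Calling the function repeatedly on `new_names` and `new_dist` until three taxa remain, then resolving the last three branches, yields the full unrooted tree.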

Classwork I. Given a multiple sequence alignment, construct the distance matrix (p-distance) and calculate the branch lengths.
APTHASTRLKHHDDHH
ALTKKSTRIRHIPD-H
DLTPSSTIIR-YPDLH

Classwork II: NJ tree using MEGA.
1. Go to the CDD webpage and retrieve the alignment of cd00157 in FASTA format.
2. Import this alignment into MEGA and convert it to MEGA format. http://www.megasoftware.net/mega3/mega.html , http://bioweb.pasteur.fr/seqanal/interfaces/protdist-simple.html
3. Construct an NJ tree using different distance measures with bootstrap.
4. Analyze the obtained trees.