Chapter 10 Phylogenetic Basics

Molecular evolution and molecular phylogenetics
- Similarities and divergence between biological sequences are often represented by phylogenetic trees.
- Phylogenetics is the study of the evolutionary history of organisms.
- In the Victorian era it was based on fossil data; more recently it has been based on molecular data.
- Sequences in biological polymers provide a history of changes.
- Advantages of molecular phylogenetics:
  - Molecular data are more numerous than fossils.
  - Molecular data are less prone to sampling bias than the fossil record.
  - More robust phylogenetic trees can be constructed.

Major assumptions
- Sequences used must be homologous.
- Phylogenetic divergence is assumed to be bifurcating (forking).
- Each position in a sequence evolved independently.
- Variability is informative enough to construct unambiguous trees.

Terminology
- branch, taxon, node, root node, clade, monophyletic, lineage, dichotomy, polytomy

[Figure: an unrooted tree and a rooted tree of taxa A, B, C, D]
Unrooted tree
- No knowledge of a common ancestor.
- Shows relative relationships only.
- No evolutionary direction.
To root an unrooted tree
- Use an outgroup (a distant relation, e.g. a bird for a mammal tree).
- Use midpoint rooting (the midpoint between the two most divergent groups).

Gene phylogeny versus species phylogeny
- The objective of constructing molecular phylogenetic trees is to reconstruct the evolutionary history and relationships between species or organisms.
- The rate at which a gene evolves may not mirror that of a species.
- Genes may arrive by horizontal transfer.
- An internal node in a gene tree may represent a gene duplication event, whereas in a species tree it represents a speciation event.
- Obtaining an accurate species phylogeny from molecular data requires phylogenetic analysis of several gene or protein families.

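Forms of tree representation
[Figure: the same tree of taxa A to E drawn as a cladogram and as a phylogram]
- Cladogram: non-scaled (branch lengths carry no meaning).
- Phylogram: scaled (branch lengths are proportional to the amount of divergence).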

Newick format
[Figure: a tree of taxa A to E shown without and with branch lengths]
- Topology only: (((B,C),A),(D,E))
- With branch lengths: (((B:1,C:2),A:2),(D:1.2,E:2.4))
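To make the nesting of the two Newick strings concrete, here is a minimal Python sketch of a recursive parser. It handles only the simple subset used on this slide (unquoted labels, optional branch lengths, no comments); the function names `parse_newick`, `leaf_names`, and `leaf_lengths` are illustrative assumptions, not part of any library.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # this group is complete
                    break
                else:
                    raise ValueError(f"unexpected character at position {pos}")
        # Optional label (empty for unnamed internal nodes).
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ":".
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    root = parse_node()
    if pos != len(text):
        raise ValueError("trailing characters after the tree")
    return root


def leaf_names(node):
    """List leaf labels in left-to-right order."""
    name, _, children = node
    if not children:
        return [name]
    names = []
    for child in children:
        names.extend(leaf_names(child))
    return names


def leaf_lengths(node):
    """Map each leaf label to the length of the branch leading to it."""
    name, length, children = node
    if not children:
        return {name: length}
    lengths = {}
    for child in children:
        lengths.update(leaf_lengths(child))
    return lengths


cladogram = parse_newick("(((B,C),A),(D,E))")
phylogram = parse_newick("(((B:1,C:2),A:2),(D:1.2,E:2.4))")
print(leaf_names(cladogram))
print(leaf_lengths(phylogram))
```

The first string encodes topology only, so every branch length comes back as `None`; the second carries the scaled lengths of a phylogram.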

Finding a tree may be difficult
- The number of possible tree topologies is a function of the number of taxa $n$.
- Rooted trees: $N_R = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$
- Unrooted trees: $N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$
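A short Python sketch shows how quickly these counts grow. The function names are illustrative assumptions; the formulas are the ones given on the slide, for bifurcating trees with n labelled taxa.

```python
from math import factorial


def count_rooted_trees(n):
    """Number of rooted bifurcating topologies for n taxa (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def count_unrooted_trees(n):
    """Number of unrooted bifurcating topologies for n taxa (n >= 3)."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


for n in (4, 5, 10):
    print(n, count_rooted_trees(n), count_unrooted_trees(n))
```

With only 10 taxa there are already more than two million unrooted topologies, which is why exhaustive search becomes impractical for larger data sets.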

Procedure to construct a tree
1. Choose molecular markers.
2. Perform multiple sequence alignment.
3. Choose a model of evolution.
4. Determine a tree-building method.
5. Assess tree reliability.

Choice of molecular markers
- DNA retains smaller changes (alphabet of only 4 nucleotides).
- To study closely related organisms, use DNA.
- For human population studies, use non-coding mitochondrial sequences.
- For more widely divergent groups, use rRNA or protein sequences.
- To compare bacteria with eukaryotes, use conserved protein sequences.
- Proteins are more conserved due to the degeneracy of codons.
- Evolutionary rates differ between the nucleotide positions within a codon.
- DNA sequences are biased because of codon preferences.
- Two random DNA sequences will have about 50% identity if gaps are allowed; random protein sequences have only about 10% identity.
- Gaps introduced into protein-coding DNA sequences can cause frameshifts and are then biologically meaningless.
- For these reasons, protein-based phylogeny is generally preferable to nucleotide-based phylogeny.
- DNA, however, provides data on synonymous and non-synonymous substitutions, which give information on positive and negative selection.

Alignment
- A correct alignment is crucial; otherwise there will be errors in the trees.
- Use a modern package such as T-Coffee.
- Manual verification and editing are essential.
- Secondary structure can serve as a guide in alignment (PRALINE).
- Non-homologous regions may have to be removed (a subjective decision).
- Indels are commonly removed, but gap regions may belong to signature indels and contain phylogenetic information.
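One simple way to remove indels is to delete every alignment column that contains a gap. The Python sketch below shows this under stated assumptions: the alignment is a dictionary of equal-length strings, gaps are written as "-", and the function name `remove_gap_columns` and the toy sequences are hypothetical. Because gap regions can carry phylogenetic signal, this is one option rather than a recommended default.

```python
def remove_gap_columns(alignment, gap="-"):
    """Return a copy of the alignment without any column that contains a gap.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    if len({len(seq) for seq in alignment.values()}) > 1:
        raise ValueError("aligned sequences must have equal length")
    names = list(alignment)
    # Transpose rows into columns, keep gap-free columns, transpose back.
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if gap not in col]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Hypothetical toy alignment, for illustration only.
toy = {"A": "ATG-CA", "B": "ATGGCA", "C": "A-GGCA"}
print(remove_gap_columns(toy))
```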

Multiple substitutions
- The number of differences between two aligned sequences is an indication of their evolutionary distance ... or is it?
- What about A->T->G->C (three substitutions seen as one difference)? Or G->C->G (two substitutions seen as no difference)?
- Such multiple substitutions and convergences obscure true evolutionary distances.
- This effect is known as homoplasy.
- Statistical models are needed to correct for homoplasy.
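The raw quantity that the correction models start from is the observed proportion of differing sites. A minimal Python sketch follows; the function name `p_distance`, the choice to skip gapped columns, and the example sequences are assumptions made for illustration.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Observed proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 10-nucleotide sequences that differ at three sites.
print(p_distance("ACGTACGTAC", "ACGTTCGAAG"))  # 0.3
```

A site that changed A->T->G->C still counts as a single difference here, and a site that changed G->C->G counts as none, so this value underestimates the number of substitutions that actually occurred.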

Jukes-Cantor model
- Assumes all substitutions occur with the same probability.
- $d_{AB} = -\frac{3}{4}\ln\left[1 - \frac{4}{3}\,p_{AB}\right]$
- $d_{AB}$ is the evolutionary distance; $p_{AB}$ is the observed sequence difference.
- Example: two 10-nucleotide sequences that differ at three nucleotides have $p_{AB} = 0.3$, so $d_{AB} = -\frac{3}{4}\ln\left[1 - \frac{4}{3}(0.3)\right] \approx 0.38$.
- Mostly used for closely related sequences.
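The Jukes-Cantor correction is a one-line calculation. The Python sketch below reproduces the worked example from the slide; the function name and the symbol `p` for the observed proportion of differing sites are illustrative assumptions.

```python
from math import log


def jukes_cantor_distance(p):
    """Jukes-Cantor distance from the observed proportion of differing sites p."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the logarithm is undefined: the sequences are saturated.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * log(1 - (4 / 3) * p)


print(round(jukes_cantor_distance(0.3), 2))  # 0.38
```

The corrected distance (about 0.38) is larger than the observed difference (0.3), reflecting substitutions hidden by multiple hits at the same site.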

Kimura model
- $d_{AB} = -\frac{1}{2}\ln\left(1 - 2p_{ti} - p_{tv}\right) - \frac{1}{4}\ln\left(1 - 2p_{tv}\right)$
- $d_{AB}$ is the evolutionary distance between two aligned sequences A and B; $p_{ti}$ is the observed frequency of transitions; $p_{tv}$ is the observed frequency of transversions.
- Example: if a 30% difference is due to 20% transitions and 10% transversions, $d_{AB} = -\frac{1}{2}\ln(1 - 2 \times 0.2 - 0.1) - \frac{1}{4}\ln(1 - 2 \times 0.1) \approx 0.40$.
- For protein sequences, a PAM substitution matrix, which includes evolutionary information, can be used.
- Kimura model for proteins: $d = -\ln\left(1 - p - 0.2p^{2}\right)$, where $p$ is the observed pairwise distance.
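Both Kimura formulas can be checked numerically with a short Python sketch. The function names and argument names (`p_ti` for the observed transition frequency, `p_tv` for the observed transversion frequency, `p` for the observed protein distance) are illustrative assumptions.

```python
from math import log


def kimura_2p_distance(p_ti, p_tv):
    """Kimura two-parameter distance from transition and transversion frequencies."""
    a = 1 - 2 * p_ti - p_tv
    b = 1 - 2 * p_tv
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * log(a) - 0.25 * log(b)


def kimura_protein_distance(p):
    """Kimura's approximate distance for proteins from the observed distance p."""
    x = 1 - p - 0.2 * p * p
    if x <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -log(x)


print(round(kimura_2p_distance(0.2, 0.1), 2))  # 0.4
print(round(kimura_protein_distance(0.3), 2))  # 0.38
```

For the slide's example (20% transitions, 10% transversions) the two-parameter distance comes out at about 0.40, slightly above the Jukes-Cantor value for the same 30% overall difference.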

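Among-site variation
- In DNA, the mutation rate differs by codon position.
- In proteins, there are functional constraints.
- A proportion of positions are invariant, while the others evolve at variable rates.
- The rates at variable sites are assumed to follow a gamma ($\Gamma$) distribution with shape parameter $\alpha$.
- $\Gamma$-corrected Jukes-Cantor: $d_{AB} = \frac{3}{4}\,\alpha\left[\left(1 - \frac{4}{3}\,p_{AB}\right)^{-1/\alpha} - 1\right]$
- $\Gamma$-corrected Kimura: $d_{AB} = \frac{\alpha}{2}\left[\left(1 - 2p_{ti} - p_{tv}\right)^{-1/\alpha} + \frac{1}{2}\left(1 - 2p_{tv}\right)^{-1/\alpha} - \frac{3}{2}\right]$
- Both expressions reduce to the uncorrected formulas as $\alpha$ becomes large.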
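A Python sketch of the two gamma-corrected distances is given below, with the gamma shape parameter called `alpha`. The function and argument names are illustrative assumptions. The Kimura version uses the form with `+ 0.5 * b ** (-1 / alpha) - 1.5`, which is the one that converges to the uncorrected Kimura two-parameter distance as `alpha` grows large.

```python
def gamma_jukes_cantor(p, alpha):
    """Gamma-corrected Jukes-Cantor distance for observed difference p."""
    base = 1 - (4 / 3) * p
    if base <= 0 or alpha <= 0:
        raise ValueError("need p < 0.75 and alpha > 0")
    return 0.75 * alpha * (base ** (-1 / alpha) - 1)


def gamma_kimura_2p(p_ti, p_tv, alpha):
    """Gamma-corrected Kimura two-parameter distance."""
    a = 1 - 2 * p_ti - p_tv
    b = 1 - 2 * p_tv
    if a <= 0 or b <= 0 or alpha <= 0:
        raise ValueError("sequences too divergent or alpha not positive")
    return (alpha / 2) * (a ** (-1 / alpha) + 0.5 * b ** (-1 / alpha) - 1.5)


# Smaller alpha means stronger rate variation and a larger corrected distance.
for alpha in (0.5, 1.0, 5.0, 1e6):
    print(alpha,
          round(gamma_jukes_cantor(0.3, alpha), 4),
          round(gamma_kimura_2p(0.2, 0.1, alpha), 4))
```

With a very large `alpha` (almost no rate variation among sites) the values approach the uncorrected Jukes-Cantor and Kimura distances for the same inputs, about 0.38 and 0.40.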