CS 177 Phylogenetics I Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Model of sequence evolution Phylogenetic trees.


CS 177 Phylogenetics I
- Taxonomy and phylogenetics
- Phylogenetic trees
- Cladistic versus phenetic analyses
- Model of sequence evolution
- Phylogenetic trees and networks
- Cladistic and phenetic methods
- Computer software and demos
Taxonomy and phylogenetics | Phylogenetic trees | Cladistic versus phenetic analyses | Homology and homoplasy

Phylogenetic Inference I
Recommended readings, from (very) basic to advanced:
- A science primer: Phylogenetics
- Brown, S.M. (2000) Bioinformatics, Eaton Publishing, pp
- Brown, S.M.: Molecular Phylogenetics
- Hillis, D.M.; Moritz, C. & Mable, B.K. (1996) Molecular Systematics, 2nd edition, Sinauer Associates, 655 pp.
- Mount, D.W. (2001) Bioinformatics, Cold Spring Harbor Lab Press, pp

Evolution
- The theory of evolution is the foundation upon which all of modern biology is built
- From anatomy to behavior to genomics, the scientific method requires an appreciation of changes in organisms over time
- It is impossible to evaluate relationships among gene sequences without taking into consideration the way these sequences have been modified over time
[Figure: Ernst Haeckel]

Relationships
Similarity searches and multiple alignments of sequences naturally lead to the question “How are these sequences related?” and, more generally, “How are the organisms from which these sequences come related?”

Classifying Organisms
Nomenclature is the science of naming organisms
Evolution has created an enormous diversity, so how do we deal with it? Names allow us to talk about groups of organisms.
- Scientific names were originally descriptive phrases; not practical
- Binomial nomenclature
  > Developed by Linnaeus, a Swedish naturalist
  > Names are in Latin, formerly the language of science
  > binomials - names consisting of two parts
  > The generic name is a noun.
  > The epithet is a descriptive adjective.
- Thus a species' name is two words, e.g. Homo sapiens
[Figure: Carolus Linnaeus]

Classifying Organisms
Taxonomy is the science of the classification of organisms
Taxonomy deals with the naming and ordering of taxa.
The Linnaean hierarchy:
1. Kingdom
2. Division
3. Class
4. Order
5. Family
6. Genus
7. Species
[Figure label: evolutionary distance]

Classifying Organisms
Systematics is the science of the relationships of organisms
- Systematics is the science of how organisms are related and the evidence for those relationships
- Systematics is divided primarily into phylogenetics and taxonomy
- Speciation: the origin of new species from previously existing ones
  > anagenesis - one species changes into another over time
  > cladogenesis - one species splits to make two
- Phylogeny: reconstruct evolutionary history

Phylogenetics
Phylogenetics is the science of the pattern of evolution.
A. Evolutionary biology is the study of the processes that generate diversity, while phylogenetics is the study of the pattern of diversity produced by those processes.
B. The central problem of phylogenetics:
  1. How do we determine the relationships between species?
  2. Use evidence from shared characteristics, not differences
  3. Use homologies, not analogies
  4. Use derived condition, not ancestral
    a. synapomorphy - shared derived characteristic
    b. plesiomorphy - ancestral characteristic
C. Cladistics is phylogenetics based on synapomorphies.
  1. Cladistic classification creates and names taxa based only on synapomorphies.
  2. This is the principle of monophyly
  3. monophyletic, paraphyletic, polyphyletic
  4. Cladistics is now the preferred approach to phylogeny
[Figure: The phylogeny and classification of life as proposed by Haeckel (1866)]

Phylogenetics
Evolutionary theory states that groups of similar organisms are descended from a common ancestor.
Phylogenetic systematics is a method of taxonomic classification of organisms based on their evolutionary history. It was developed by Willi Hennig, a German entomologist.
[Figure: Willi Hennig]

Phylogenetics
Phylogenetics is the science of the pattern of evolution
Evolutionary biology versus phylogenetics:
- Evolutionary biology is the study of the processes that generate diversity
- Phylogenetics is the study of the pattern of diversity produced by those processes

Phylogenetics
Who uses phylogenetics? Some examples:
- Evolutionary biologists (e.g. reconstructing tree of life)
- Systematists (e.g. classification of groups)
- Anthropologists (e.g. origin of human populations)
- Forensics (e.g. transmission of HIV to a rape victim)
- Parasitologists (e.g. phylogeny of parasites, co-evolution)
- Epidemiologists (e.g. reconstruction of disease transmission)
- Genomics/Proteomics (e.g. homology comparison of new proteins)

Phylogenetic trees
The central problem of phylogenetics: how do we determine the relationships between taxa?
In phylogenetic studies, the most convenient way of presenting evolutionary relationships among a group of organisms is the phylogenetic tree.


Phylogenetic trees
- Node: a branchpoint in a tree (a presumed ancestral OTU)
- Branch: defines the relationship between the taxa in terms of descent and ancestry
- Topology: the branching patterns of the tree
- Branch length (scaled trees only): represents the number of changes that have occurred in the branch
- Root: the common ancestor of all taxa
- Clade: a group of two or more taxa or DNA sequences that includes both their common ancestor and all their descendants
- Operational Taxonomic Unit (OTU): taxonomic level of sampling selected by the user to be used in a study, such as individuals, populations, species, genera, or bacterial strains
[Figure: a tree with root, branch, clade and node labelled]
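The terminology above can be made concrete with a small sketch. It assumes a rooted tree written as nested Python tuples with taxon names as strings; the function name `clades` and the example topology are illustrative choices, not part of the lecture.

```python
def clades(tree):
    """Return (leaf set of `tree`, list of leaf sets, one per internal node).

    A leaf is a string (an OTU); an internal node is a tuple of subtrees.
    Each internal node defines one clade: the ancestor at that node
    together with all of its descendants.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)  # the clade rooted at this node
    return leaves, found


# Hypothetical rooted topology for five taxa.
tree = ((("A", "B"), "C"), ("D", "E"))
all_taxa, clade_list = clades(tree)

for clade in clade_list:
    print(sorted(clade))

# A group is monophyletic in this tree only if it appears as a clade.
print(frozenset({"A", "B"}) in clade_list)
print(frozenset({"B", "C"}) in clade_list)
```

In this sketch the outermost tuple plays the role of the root, and the topology is simply the nesting pattern; branch lengths are not represented, so the tree is unscaled.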

Phylogenetic trees
There are many ways of drawing a tree.
[Figure: several equivalent drawings of the same tree of taxa A to E, with the annotation "no meaning"]
Bifurcation versus multifurcation (e.g. trifurcation)
[Figure: a bifurcation is not equivalent to a trifurcation]
Multifurcation (also called polytomy): a node in a tree that connects more than three branches. A multifurcation may represent a lack of resolution because too few data are available for inferring the phylogeny (in which case it is said to be a soft multifurcation), or it may represent the hypothesized simultaneous splitting of several lineages (in which case it is said to be a hard multifurcation).

Phylogenetic trees
Trees can be scaled or unscaled (with or without branch lengths)

Phylogenetic trees
Trees can be unrooted or rooted.
[Figure: an unrooted tree of taxa A, B, C and D, and rooted trees obtained by placing the root on different branches]
These trees show five different evolutionary relationships among the taxa!

Phylogenetic trees
Possible evolutionary trees
Number of rooted trees for n taxa: $\frac{(2n-3)!}{2^{n-2}\,(n-2)!}$
Number of unrooted trees for n taxa: $\frac{(2n-5)!}{2^{n-3}\,(n-3)!}$
Taxa (n)   Rooted        Unrooted
2          1             1
3          3             1
4          15            3
5          105           15
6          945           105
7          10,395        945
8          135,135       10,395
9          2,027,025     135,135
10         34,459,425    2,027,025
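The two counting formulas can be checked with a short sketch. The function names are illustrative, and the special case for two taxa in the unrooted count is an assumption made here because the factorial formula is undefined for n = 2.

```python
from math import factorial


def num_rooted_trees(n: int) -> int:
    """Rooted bifurcating trees for n labelled taxa: (2n-3)! / (2^(n-2) (n-2)!)."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def num_unrooted_trees(n: int) -> int:
    """Unrooted bifurcating trees for n labelled taxa: (2n-5)! / (2^(n-3) (n-3)!)."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    if n == 2:
        return 1  # formula undefined here; two taxa can only be joined one way
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


for n in range(2, 11):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
```

The number of rooted trees for n taxa equals the number of unrooted trees for n + 1 taxa, which is why exhaustive search over topologies becomes impractical beyond a handful of taxa.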

Phylogenetic trees
How to root?
- Use information from ancestors. In most cases this is not available.
- Use statistical tools that root trees automatically (e.g. mid-point rooting). This must involve assumptions … BEWARE!
- Use “outgroups”:
  > the outgroup should be a taxon known to be less closely related to the rest of the taxa (the ingroup) than they are to each other
  > it should ideally be as closely related as possible to the rest of the taxa while still satisfying the above condition

Phylogenetic trees
Exercise: rooted/unrooted; scaled/unscaled
[Figure: trees of taxa A to F]

Phylogenetics
What are useful characters? Use homologies, not analogies!
- Homology: common ancestry of two or more character states
- Analogy: similarity of character states not due to shared ancestry
- Homoplasy: a collection of phenomena that leads to similarities in character states for reasons other than inheritance from a common ancestor (e.g. convergence, parallelism, reversal)
Homoplasy is a huge problem in morphological data sets, but in molecular data sets, too!
Example: spines in Cactaceae (cactus spines are modified leaves) and Euphorbiaceae (euphorb spines are modified shoots)

Phylogenetics
Molecular data and homoplasy
- gene sequences represent character data
- characters are positions in the sequence (not all workers agree; some say one gene is one character)
- character states are the nucleotides in the sequence (or amino acids in the case of proteins)
Problems:
- the probability that two nucleotides are the same just by chance mutation is 25%
- what to do with insertions or deletions (which may themselves be characters)
- homoplasy in sequences may cause alignment errors

Phylogenetics
Molecular data and homoplasy: Orthologs vs. Paralogs
When comparing gene sequences, it is important to distinguish between identical and merely similar genes in different organisms
- Orthologs are homologous genes in different species that diverged through speciation; they typically have equivalent functions
- Paralogs are homologous genes that are the result of a gene duplication
- A phylogeny that includes both orthologs and paralogs is likely to be incorrect
- Sometimes phylogenetic analysis is the best way to determine if a new gene is an ortholog or paralog to other known genes

Phylogenetics
What are useful characters? Use derived condition, not ancestral
- Synapomorphy (shared derived character): homologous traits share the same character state because it originated in their immediate common ancestor
- Plesiomorphy (shared ancestral character): homologous traits share the same character state because they are inherited from a common distant ancestor

Phenetics versus cladistics
Within the field of taxonomy there are two different methods and philosophies of building phylogenetic trees: cladistic and phenetic.
- Phenetic methods construct trees (phenograms) by considering the current states of characters without regard to the evolutionary history that brought the species to their current phenotypes; phenograms are based on overall similarity
- Cladistic methods construct trees (cladograms) and rely on assumptions about ancestral relationships as well as on current data; cladograms are based on character evolution (e.g. shared derived characters)
Cladistics is becoming the method of choice; it is considered to be more powerful and to provide more realistic estimates. However, it is slower than phenetic algorithms.

Phenetics vs. cladistics: an example
- Phenetic (overall similarity): [Figure: taxa A, B and C grouped by overall similarity; values 3, 4, 5]
- Cladistic (character evolution; e.g. shared derived characters): [Figure: taxa A, B and C grouped by shared derived characters; values 1, 2, 1]
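The contrast between the two philosophies can be sketched with a toy character matrix. The data below are hypothetical (they are not the values from the slide figures), and the coding 0 = ancestral state, 1 = derived state is an assumption made for the illustration.

```python
from itertools import combinations

# Hypothetical character matrix: 0 = ancestral state, 1 = derived state.
characters = {
    "A": [1, 0, 0, 0, 0],
    "B": [1, 1, 1, 1, 1],
    "C": [0, 0, 0, 0, 0],
}


def overall_similarity(x, y):
    """Phenetic view: count every character whose state matches."""
    return sum(a == b for a, b in zip(x, y))


def shared_derived(x, y):
    """Cladistic view: count only characters where both taxa are derived."""
    return sum(a == 1 and b == 1 for a, b in zip(x, y))


pairs = list(combinations(sorted(characters), 2))
phenetic_pair = max(pairs, key=lambda p: overall_similarity(characters[p[0]], characters[p[1]]))
cladistic_pair = max(pairs, key=lambda p: shared_derived(characters[p[0]], characters[p[1]]))

print("grouped by overall similarity:", phenetic_pair)
print("grouped by shared derived characters:", cladistic_pair)
```

With these made-up data, A and C look most alike overall only because both retain ancestral states, whereas A and B are united by the one shared derived character.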

Model of sequence evolution
The problem
- A basic process in the evolution of a sequence is change in that sequence over time
- Now we are interested in a mathematical model to describe that change
- Such a model is essential for understanding the mechanisms of change and is required to estimate both the rate of evolution and the evolutionary history of sequences

Model of sequence evolution
Nucleotide = base + sugar + phosphate
- Purines (C5H4N4): adenine, guanine
- Pyrimidines (C4H4N2): cytosine, thymine

Models of sequence evolution: examples
- Jukes-Cantor model (1969): all substitutions have an equal probability and base frequencies are equal
- Felsenstein (1981): all substitutions have an equal probability, but base frequencies are unequal
- Kimura 2-parameter model (K2P) (1980): transitions and transversions have different probabilities
- Hasegawa, Kishino & Yano (HKY) (1985): transitions and transversions have different probabilities, and base frequencies are unequal
- General time reversible model (GTR): different probabilities for each substitution, and base frequencies are unequal
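As a rough illustration of what the two simplest models do in practice, the sketch below applies the standard Jukes-Cantor and K2P distance corrections to a pair of aligned sequences. The sequences and function names are hypothetical, and the code assumes gap-free sequences containing only A, C, G and T.

```python
from math import log

PURINES = {"A", "G"}


def transition_transversion_fractions(s1: str, s2: str):
    """Return (P, Q): fractions of sites differing by a transition / a transversion."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a == b:
            continue
        # purine<->purine or pyrimidine<->pyrimidine is a transition
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(s1)
    return transitions / n, transversions / n


def jc69_distance(p: float) -> float:
    """Jukes-Cantor distance from the observed fraction of differing sites p."""
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


def k2p_distance(P: float, Q: float) -> float:
    """Kimura 2-parameter distance from transition (P) and transversion (Q) fractions."""
    return -0.5 * log(1.0 - 2.0 * P - Q) - 0.25 * log(1.0 - 2.0 * Q)


# Hypothetical aligned sequences: one transition (G/A) and one transversion (A/C).
seq1 = "ACGTACGTAC"
seq2 = "ACATACGTCC"
P, Q = transition_transversion_fractions(seq1, seq2)
print("observed difference:", P + Q)
print("JC69 distance:", jc69_distance(P + Q))
print("K2P distance:", k2p_distance(P, Q))
```

Both corrected distances exceed the observed fraction of differences because the models account for multiple substitutions at the same site. The two corrections agree when transversions are exactly twice as frequent as transitions, which is the equal-rates situation Jukes-Cantor assumes.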

Models of sequence evolution
[Figure: relationships among the models (Jukes-Cantor, Felsenstein, K2P, HKY, GTR), shown with the substitution rates between the bases A, C, G and T]

More models of sequence evolution
Currently, there are more than 60 models described
- plus gamma distribution and invariable sites
- accuracy of models rapidly decreases for highly divergent sequences
- problem: more complicated models have more parameters to estimate, so they tend to be less accurate (and slower)
How to pick an appropriate model?
- use a likelihood ratio test
- implemented in Modeltest 3.06 (Posada & Crandall, 1998)
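A likelihood ratio test between two nested models can be sketched as follows. This is not how Modeltest is implemented; the log-likelihood values are hypothetical, the function name is illustrative, and the comparison uses tabulated 5% critical values of the chi-square distribution rather than a computed p-value.

```python
# 5% critical values of the chi-square distribution, keyed by degrees of freedom.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def likelihood_ratio_test(lnL_simple: float, lnL_complex: float, extra_params: int):
    """Compare two nested models.

    Returns (statistic, reject_simple), where statistic = 2 * (lnL_complex - lnL_simple)
    and reject_simple is True when the statistic exceeds the 5% critical value for
    `extra_params` degrees of freedom.
    """
    statistic = 2.0 * (lnL_complex - lnL_simple)
    critical = CHI2_CRITICAL_5PCT[extra_params]
    return statistic, statistic > critical


# Hypothetical log-likelihoods for the same alignment and tree:
# Jukes-Cantor (simpler) versus K2P (one extra parameter).
lnL_jc = -1250.0
lnL_k2p = -1240.0
statistic, prefer_k2p = likelihood_ratio_test(lnL_jc, lnL_k2p, extra_params=1)
print("LRT statistic:", statistic)
print("prefer the more complex model:", prefer_k2p)
```

The test only applies when the simpler model is a special case of the more complex one, as Jukes-Cantor is of K2P; a small improvement in likelihood does not justify the extra parameters.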