Molecular Systematics
Systematics - the science of identifying, naming, and classifying living organisms into groups
  A natural activity of the human brain
Aristotle - Scala Naturae, or “Chain of Life,” which consisted of God, man, mammals, oviparous with perfect eggs (e.g., birds), oviparous with nonperfect eggs (e.g., fish), insects, plants, and non-living matter
  Dominated for ~2000 yrs but made no real attempt at an orderly, consistent classification
Linnaeus and others – downward classification
  Dividing larger groups into smaller ones via dichotomies
  Actually a method of ‘identification’, not ‘classification’
  Highly dependent on the order in which the dichotomies were investigated
Upward classification – grouping organisms with similar characteristics
Still, all of this was heavily influenced by the idea of archetypes: distinct, unchanging types or kinds of organisms
This was an attempt to find a ‘natural’ system. But what is the basis for this ‘natural’ system?
Enter The Origin of Species
  Provided the rationale for a coherent system
  Common descent as the basis for classification
Phylogenetic systematics
  Classifying organisms based on evolutionary relationships
“The time will come, I believe, though I shall not live to see it, when we shall have fairly true genealogical trees of each great kingdom of Nature” - Charles Darwin
Therefore, two goals for phylogenetics:
  (1) reconstruct life's genealogy
  (2) use that genealogy as the basis for classification
What can we do with molecular phylogenies?
Classify organisms according to evolutionary history
  Bring order to the chaos of living things
There is only one "fundamental law" in biology: life evolves. That's it. The messiness comes because life is an emergent property of chemistry and physics. Just as a grain of sand behaves differently from a pile of sand, biology behaves differently from physics and chemistry: the number of interacting forces in biology is orders of magnitude greater than in physics and chemistry.
What can we do with molecular phylogenies?
Determine evolutionary patterns and processes in organisms
  Evolutionary rates among organisms (speciation, extinction, morphological change)
  Identification of key adaptations
  Correlations between traits or characters
  e.g., FOXP2 in bats and other mammals
What can we do with molecular phylogenies?
Determine evolutionary patterns and processes in genomes
What can we do with molecular phylogenies?
Inform conservation efforts
[Figure: labels include Tabasco and Peten]
What can we do with molecular phylogenies?
Inform medical and forensic genetics
What can we do with molecular phylogenies?
Investigate population histories and demography
Tree – a mathematical model of a proposed evolutionary history of organisms or some aspect of organisms
  The ultimate goal of phylogenetics is to recover an accurate tree of life
Operational taxonomic unit (OTU)
Levels of resolution
Polytomies
Cladograms, phylograms, phenograms, etc.
Cladogram – illustrates evolutionary relationships of organisms via relative recency of common ancestry
  Branch lengths are meaningless and arbitrary
Phylogram – illustrates relationships of organisms with branch lengths proportional to time or amount of divergence
  A cladogram with added branch-length information
Phenogram – illustrates relative amounts of similarity or difference
  NOTE: the intent is not necessarily to represent common ancestry (more on this distinction later)
[Figure: example cladogram and phylogram]
Rooted vs. unrooted trees
Rooted trees have a node from which all other nodes have descended
  They are directional
  They allow for the inference of ancestor-descendant relationships
Unrooted trees lack a root, direction, and indications of ancestral relationships
Rooted vs. unrooted trees
Rooted trees have a node from which all other nodes have descended
[Figure: rooted tree diagrams of taxa A, B, C, and D with the root labeled]
Rooted vs. unrooted trees
Two major ways to root a tree
By outgroup: Use taxa (the “outgroup”) that are known to fall outside of the group of interest (the “ingroup”).
  Requires prior knowledge about the relationships among the taxa.
By midpoint: Roots the tree at the midway point between the two most distant taxa in the tree, as determined by branch lengths.
  Assumes that the taxa are evolving in a clock-like manner.
  Example: d(A,D) = 10 + 3 + 5 = 18, so the midpoint lies 18 / 2 = 9 units from A (and from D)
[Figure: unrooted tree of taxa A, B, C, and D with branch lengths 10, 3, 2, 2, and 5; outgroup labeled]
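The midpoint calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the branch lengths 10, 3, and 5 come from the d(A,D) sum on the slide, but the exact placement of taxa B and C (each given length 2) and the internal node names X and Y are guesses made for the example, not taken from the original figure.

```python
from itertools import combinations

# Unrooted tree as a list of (node, node, branch length).
# X and Y are hypothetical internal nodes; the positions of B and C are assumed.
EDGES = [("A", "X", 10), ("C", "X", 2), ("X", "Y", 3), ("B", "Y", 2), ("D", "Y", 5)]

def build_graph(edges):
    """Turn the edge list into an undirected adjacency map."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, {})[v] = length
        graph.setdefault(v, {})[u] = length
    return graph

def path_between(graph, start, end):
    """Return the unique path between two nodes of a tree as a list of nodes."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            return path
        for neighbor in graph[node]:
            if neighbor not in path:
                stack.append((neighbor, path + [neighbor]))
    raise ValueError("nodes are not connected")

def path_length(graph, path):
    """Sum the branch lengths along a path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def midpoint_root(edges):
    """Locate the midpoint between the two most distant tips."""
    graph = build_graph(edges)
    tips = [node for node, nbrs in graph.items() if len(nbrs) == 1]
    longest = max(
        (path_between(graph, a, b) for a, b in combinations(tips, 2)),
        key=lambda p: path_length(graph, p),
    )
    total = path_length(graph, longest)
    half = total / 2
    walked = 0
    # Walk from the first tip until the branch containing the midpoint is reached.
    for u, v in zip(longest, longest[1:]):
        if walked + graph[u][v] >= half:
            return longest[0], longest[-1], total, (u, v), half - walked
        walked += graph[u][v]

tip1, tip2, distance, branch, offset = midpoint_root(EDGES)
print(f"Most distant tips: {tip1} and {tip2}, d = {distance}")
print(f"Root on branch {branch}, {offset} units from {branch[0]}")
```

With these assumed branch lengths the most distant pair is A and D, and the root falls on the branch leading to A, 9 units from the tip. If the rates along the A and D lineages were very unequal, this placement would be misleading, which is why the method depends on clock-like evolution.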
Rooted vs. unrooted trees
Most phylogenetic methods infer unrooted trees
  Thus, choosing a root is an extremely important decision
There are 5 potential roots for this one unrooted tree, and each one has a different interpretation
Rooted vs. unrooted trees
Numbers of possible rooted and unrooted trees

Number of OTUs    Rooted trees    Unrooted trees
       2                     1                 1
       3                     3                 1
       4                    15                 3
       5                   105                15
       6                   945               105
       7                 10395               945
       8                135135             10395
       9               2027025            135135
      10              34459425           2027025

[Figure: log(number of trees) plotted against number of OTUs]
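The counts in this table follow the standard closed forms for strictly bifurcating trees: (2n - 3)!! rooted trees and (2n - 5)!! unrooted trees for n OTUs, where !! is the double factorial. The short sketch below reproduces the table; the function names are made up for this example.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1; returns 1 when k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def num_rooted_trees(n_otus):
    """Number of rooted bifurcating trees: (2n - 3)!!"""
    return double_factorial(2 * n_otus - 3)

def num_unrooted_trees(n_otus):
    """Number of unrooted bifurcating trees: (2n - 5)!!"""
    return double_factorial(2 * n_otus - 5)

for n in range(2, 11):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
```

An unrooted bifurcating tree with n tips has 2n - 3 branches, and a root can be placed on any of them, so the rooted count is the unrooted count multiplied by 2n - 3. For four OTUs that gives the 5 possible root positions per unrooted tree.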
More terminology
Tree -phyly
Monophyletic group – a group on a tree that includes one ancestor and all the terminal taxa that arose from it
Paraphyletic group – a group of terminal taxa and their ancestor(s) that excludes one or more descendants
Polyphyletic group – a completely unnatural grouping of terminal taxa that does not include their most recent common ancestor
More terminology
Tree -phyly
  Monophyletic – Archosauria, Lepidosauria
  Paraphyletic – “reptiles”, “dinosaurs”
  Polyphyletic – “homeotherms”
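A monophyly check is easy to express in code: a set of terminal taxa is monophyletic if it is exactly the set of tips below some node of the rooted tree. The sketch below assumes a deliberately simplified amniote tree written as nested tuples; the tree, its tip names, and the function names are illustrative choices, not part of the slides.

```python
# Simplified rooted tree as nested tuples (illustrative only).
AMNIOTES = ("mammals", (("tuatara", "squamates"), ("crocodilians", "birds")))

def clades(tree):
    """Return (tips below this node, tip sets for every node at or below it)."""
    if isinstance(tree, str):
        tips = frozenset([tree])
        return tips, [tips]
    tips = frozenset()
    found = []
    for child in tree:
        child_tips, child_clades = clades(child)
        tips |= child_tips
        found.extend(child_clades)
    found.append(tips)
    return tips, found

def is_monophyletic(tree, group):
    """True if the group matches the complete tip set of some node."""
    _, all_clades = clades(tree)
    return frozenset(group) in all_clades

print(is_monophyletic(AMNIOTES, {"crocodilians", "birds"}))                 # Archosauria
print(is_monophyletic(AMNIOTES, {"tuatara", "squamates"}))                  # Lepidosauria
print(is_monophyletic(AMNIOTES, {"tuatara", "squamates", "crocodilians"}))  # "reptiles"
print(is_monophyletic(AMNIOTES, {"mammals", "birds"}))                      # "homeotherms"
```

The first two groups pass and the last two fail. This check alone does not separate paraphyletic from polyphyletic groups; that distinction depends on whether the group's common ancestor is considered a member.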
Gene trees vs. species trees
We usually assume that trees inferred from molecular data (sequences) reflect the history of the organisms.
What happens when we assume?
[Figure: tree diagrams for taxa A, B, and C]
Incomplete lineage sorting
We usually assume that trees inferred from molecular data (sequences) reflect the history of the organisms.
What happens when we assume?
(Salem et al. 2003, PNAS)
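To get a feel for how often a gene tree can disagree with the species tree, one can use the standard three-taxon coalescent result, which is not stated on the slides and is brought in here only as an illustration: if the internal branch of the species tree has length T in coalescent units, the probability that a gene tree matches the species tree is 1 - (2/3)e^(-T). The function name and the example values of T are arbitrary.

```python
import math

def prob_gene_tree_matches(t_coalescent_units):
    """Probability that a three-taxon gene tree matches the species tree.

    t_coalescent_units is the internal branch length of the species tree
    in coalescent units (generations divided by 2N for diploids).
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t_coalescent_units)

for t in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(f"T = {t}: P(match) = {prob_gene_tree_matches(t):.3f}")
```

When the internal branch is very short, the three possible gene trees are nearly equally likely, so a single locus says little about the species history; with a long internal branch, discordance from incomplete lineage sorting becomes rare.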
More terminology
Characters and character states
Organisms comprise sets of features
  A particular feature that is heritable is a “character”
    e.g., a nucleotide position, the shape of a bone, presence or absence of a bone
  When taxa differ with respect to a feature (e.g., the presence, absence, or identity of a base at a particular locus), the different conditions are called “character states”
  Character states can be discrete or continuous, reversible or nonreversible, ordered or non-ordered, ancestral or derived (polarity)

Character               Possible states
Nucleotide position     A, T, C, G, gap
TE insertion            Presence, absence
Amino acid              Polar, nonpolar, acid, base, etc.
Mandibular symphysis    Unfused, partially fused, ossified
Homology assignment
All phylogenetic methods (molecular and morphological) assume that you are comparing homologous loci/structures
Homologous – sharing a common ancestor
  Two loci are either homologous or not; there is no such thing as 95% homologous (95% similar, yes; 95% homologous, no)
Homology comes in two flavors
  Paralogy – loci originating from a duplication event recent enough to reveal their common ancestry
  Orthology – loci that share ancestry via lineage divergence
  One must be able to distinguish the two a priori
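The distinction between similarity and homology is easier to see with a number attached. Similarity is a measurement on aligned sequences, as in the sketch below; homology is a yes-or-no statement about ancestry that no percentage can express. The two 20-base sequences and the function name are invented for this example.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences carry the same state."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Hypothetical aligned sequences that differ at a single position.
locus_1 = "ATGGTCCATGGTCCATGGTC"
locus_2 = "ATGATCCATGGTCCATGGTC"

print(percent_identity(locus_1, locus_2))  # 19 of 20 positions match
```

The result describes how alike the sequences are. Whether the two loci are homologous, and whether they are orthologs or paralogs, has to be established separately.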
More terminology
Homoplasy
  Similarity that is not homologous (not due to common ancestry)
  Can be the result of convergence, parallelism, or reversals of state
  Can provide misleading evidence of phylogenetic affinity (if interpreted incorrectly as homology)
  Common in DNA sequence data
Challenges to inferring trees with molecular data
Paralogy
Challenges to inferring trees with molecular data
Gene conversion
Challenges to inferring trees with molecular data
Varying rates of mutation
Challenges to inferring trees with molecular data
Horizontal gene transfer
Identity by Descent/State
[Figure: identity by state; a timeline with a mutation separating Species A (ATGGTCC) from Species B (ATGATCC), and an insertion panel showing Species A, Species A’, and Species B]