Lecture 17: Phylogenetics and Phylogeography October 22, 2012
Announcements. Exam next Wednesday (Oct 31). Review on Monday: bring questions! Covers material from genetic drift (Sept 28) through coalescence (Friday). I will be gone Monday, Oct 29 (after office hours) through Oct 31.
Last Time. Using FST to estimate migration. Direct estimates of migration: parentage analysis. Introduction to phylogenetic analysis.
Today. Phylogeography. Limitations of phylogenetic analysis. Coalescence introduction. Influence of demography on coalescence time.
Read the Lab 9 Introduction! UPGMA Method. Use all pairwise comparisons to make a dendrogram. UPGMA: Unweighted Pair Group Method with Arithmetic Mean. Hierarchically link the most closely related individuals.
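The UPGMA linking step can be sketched in a few lines of Python. This is a minimal illustration rather than the Lab 9 procedure: the function name `upgma`, the dictionary-of-distances input format, and the four-individual distance table are all assumptions made for the example.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster taxa by UPGMA and return the dendrogram as nested tuples.

    names: list of taxon labels (strings)
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    """
    d = dict(dist)                                   # working copy of the distances
    clusters = {name: (name, 1) for name in names}   # label -> (subtree, leaf count)
    while len(clusters) > 1:
        # Link the two clusters separated by the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Distance to every other cluster is the mean over all member pairs,
        # i.e. the size-weighted average of the two old distances.
        for c in clusters:
            d[frozenset((merged, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[merged] = ((tree_a, tree_b), n_a + n_b)
    (tree, _), = clusters.values()
    return tree

# Hypothetical distances among four individuals (not from the lecture).
example = {
    frozenset("AB"): 2, frozenset("AC"): 6, frozenset("BC"): 6,
    frozenset("AD"): 10, frozenset("BD"): 10, frozenset("CD"): 10,
}
print(upgma(["A", "B", "C", "D"], example))
```

A and B are linked first because they have the smallest distance, then C joins that pair, and D joins last. Ties are broken by input order in this sketch.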
Phenetics (distance) vs Cladistics (character state based) Lowe, Harris, and Ashton 2004
Parsimony Methods. Based on underlying genealogical relationships among alleles. Occam’s Razor: the simplest scenario is the most likely. Useful for depicting evolutionary relationships among taxa or populations. Choose the tree that requires the smallest number of steps (mutations) to produce the observed relationships.
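One way to count the "steps" a tree requires is Fitch's algorithm, sketched below for rooted binary trees. The tree encoding (nested 2-tuples), the function names, and the toy alignment are assumptions for illustration, not material from the lecture.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree:   nested 2-tuples with leaf names as strings, e.g. (("A", "B"), "C")
    states: dict mapping leaf name -> observed character state
    """
    def visit(node):
        if isinstance(node, str):                 # leaf: its observed state, no changes
            return {states[node]}, 0
        (left_set, left_cost), (right_set, right_cost) = visit(node[0]), visit(node[1])
        shared = left_set & right_set
        if shared:                                # children agree: no extra change
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1   # one change needed
    return visit(tree)[1]

def tree_length(tree, alignment):
    """Sum the Fitch score over all sites; alignment maps leaf name -> sequence."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )

# Hypothetical three-site alignment for four taxa.
alignment = {"A": "AAG", "B": "AAG", "C": "ATC", "D": "CTC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(tree_length(tree_1, alignment), tree_length(tree_2, alignment))
```

Under the parsimony criterion, the tree with the smaller length would be preferred.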
Choosing Phylogenetic Trees. MANY possible trees can be built for a given set of taxa. Choosing among them is very computationally intensive. Lowe, Harris, and Ashton 2004
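To see how quickly tree space grows, assume strictly bifurcating trees with labeled tips. The number of rooted trees for n taxa is then (2n-3)!! = 3 x 5 x ... x (2n-3), and the number of unrooted trees is (2n-5)!!. The function names below are illustrative.

```python
def n_rooted_trees(n_taxa):
    """Number of rooted, bifurcating, labeled trees: (2n-3)!!"""
    count = 1
    for k in range(3, 2 * n_taxa - 2, 2):   # 3, 5, ..., 2n-3
        count *= k
    return count

def n_unrooted_trees(n_taxa):
    """Number of unrooted, bifurcating, labeled trees: (2n-5)!! (n >= 3)."""
    return n_rooted_trees(n_taxa - 1)

for n in (4, 10, 20):
    print(n, n_rooted_trees(n), n_unrooted_trees(n))
```

Ten taxa already give more than 34 million rooted trees, which is why exhaustive evaluation is only feasible for small data sets.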
Choosing Phylogenetic Trees. Many algorithms exist for searching tree space. Local optima are a problem: need to traverse valleys to get to other peaks. Heuristic search: cut trees up systematically and reassemble. Branch and bound: search for optimal path through tree space. Felsenstein 2004
Choosing Phylogenetic Trees. If multiple trees are equally likely, summarize them with a majority-rule or strict consensus tree. Strict consensus is the most conservative approach. Bootstrap the data matrix (sample with replacement) to determine robustness of nodes. Felsenstein 2004; Lowe, Harris, and Ashton 2004. [Figure: tree of taxa A-F with a node labeled 60]
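The bootstrap step resamples the columns (characters) of the data matrix with replacement. The sketch below shows that resampling, plus a deliberately crude stand-in for node support: the percentage of replicates in which two taxa are the unique closest pair by Hamming distance. A real analysis would rebuild the full tree for each replicate. All names and the toy alignment are assumptions.

```python
import random
from itertools import combinations

def bootstrap_columns(alignment, rng):
    """Resample alignment columns (sites) with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def pair_support(alignment, a, b, n_reps=1000, seed=0):
    """Percent of replicates in which (a, b) is the unique closest pair."""
    rng = random.Random(seed)
    target = frozenset((a, b))
    hits = 0
    for _ in range(n_reps):
        rep = bootstrap_columns(alignment, rng)
        dists = {
            frozenset(pair): hamming(rep[pair[0]], rep[pair[1]])
            for pair in combinations(rep, 2)
        }
        best = min(dists.values())
        winners = [pair for pair, value in dists.items() if value == best]
        hits += winners == [target]          # count only unambiguous wins
    return 100 * hits / n_reps

# Hypothetical alignment: A and B differ at one site, C is more distant.
toy = {"A": "AAGTCA", "B": "AAGTCT", "C": "TTGACG"}
print(pair_support(toy, "A", "B"))
```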
Phylogeography. The study of evolutionary relationships among individuals based on phylogenetic analysis of DNA sequences in geographic context. Can be used to infer evolutionary history of populations: migrations, population subdivisions, bottlenecks/founder effects. Can provide insights on current relationships among populations: connectedness of populations, effects of landscape features on gene flow.
Phylogeography. Topology of tree provides clues about evolutionary and ecological history of a set of populations. Dispersal creates poor correspondence between geography and tree topology. Vicariance (division of populations preventing gene flow among subpopulations) results in neat mapping of geography onto haplotypes.
Example: Pocket gophers (Geomys pinetis). Avise 2004. Fossorial rodent that inhabits a 3-state area in the U.S. RFLP analysis of mtDNA from 87 individuals revealed 23 haplotypes. Parsimony network reveals geographic relationships among haplotypes. Haplotypes generally confined to single populations. Major east-west split in distribution revealed.
Problems with Using Phylogenetics for Inferring Evolution. It’s a black box: starting from the end point, reconstructing the past based on an assumed evolutionary model. Orthologs versus paralogs. Hybridization. Differential evolutionary rates. Assumes coalescence.
Gene Orthology. Phylogenetics requires unambiguous identification of orthologous genes. Paralogous genes are duplicated copies: their divergence traces back to a duplication event rather than a speciation event, so they do not track the history of the taxa. Difficult to determine orthology relationships. Lowe, Harris, Ashton 2004
Gene Trees vs Species Trees. Genes (or loci) evolve at different rates. Why? Topology derived from a single gene may not match topology based on the whole genome or on morphological traits. [Figure: gene tree of taxa A, C, B]
Gene Trees vs Species Trees. Concordant gene tree (taxa a, b, c): b is closer to a than to c.
Gene Trees vs Species Trees. Failure to coalesce within species lineages drives divergence of relationships between gene and species trees. Divergent gene tree (taxa a, b, c): b is closer to c than to a.
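For three species, the chance of a divergent gene tree can be put in numbers. Assume a neutral diploid population of constant effective size N_e along the internal branch of the species tree, with that branch lasting t generations. The two lineages then fail to coalesce within it with probability e^(-t/(2N_e)), and two of the three equally likely deeper coalescences give a discordant topology. The function and parameter names are illustrative.

```python
import math

def prob_discordant(t_generations, n_e):
    """P(gene tree topology differs from the species tree) for three species.

    t_generations: length of the internal species-tree branch, in generations
    n_e:           diploid effective population size along that branch
    """
    p_no_coalescence = math.exp(-t_generations / (2 * n_e))
    return (2 / 3) * p_no_coalescence

# Short internal branches (relative to 2 * N_e) make discordance common.
for t in (100, 2_000, 20_000):
    print(t, round(prob_discordant(t, n_e=1_000), 4))
```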
Coalescence. Retrospective tracing of ancestry of individual alleles. Allows explicit simulation of sequence evolution. Incorporation of factors that cause deviation from neutrality: selection, drift, and gene flow.
9 generations in the history of a population of 14 gene copies. [Figure axes: individual alleles; time, ending at the present] Slide courtesy of Yoav Gilad
How to model this process?
Modeling from Theoretical Ancestors: Forward Evolution. Can model populations in a forward direction, starting with theoretical past. Fisher-Wright model of neutral evolution. Very computationally intensive for large populations.
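A forward simulation of this kind can be sketched as follows, assuming neutrality, constant population size, and non-overlapping generations. Each gene copy in the next generation draws its parent at random from the current one. The defaults mirror the earlier picture of 14 gene copies followed for 9 generations; the function name and the labeling scheme are assumptions.

```python
import random

def wright_fisher(n_copies, n_generations, rng):
    """Simulate neutral forward evolution, tracking founder labels.

    Returns a list of generations; each generation lists, for every gene
    copy, the label of the founding copy it descends from.
    """
    population = list(range(n_copies))           # founders labeled 0..n-1
    history = [population]
    for _ in range(n_generations):
        # Every copy in the next generation picks one parent at random.
        population = [rng.choice(population) for _ in range(n_copies)]
        history.append(population)
    return history

history = wright_fisher(n_copies=14, n_generations=9, rng=random.Random(1))
for generation, labels in enumerate(history):
    print(generation, len(set(labels)), "founder lineages remain")
```

Every copy is simulated in every generation, including lineages that later go extinct, which is where the computational cost comes from in large populations.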
Alternative: Start at the end and work your way back. [Figure: lineages traced back to the most recent common ancestor (MRCA); axes: individual alleles; time, ending at the present] Slide courtesy of Yoav Gilad
The genealogy of a sample of 5 gene copies. [Figure: genealogy traced back to the most recent common ancestor (MRCA); axes: individual alleles; time, ending at the present] Slide courtesy of Yoav Gilad
Examples of coalescent trees for a sample of 6. [Figure axes: individual alleles; time] Slide courtesy of Yoav Gilad
Coalescence Advantages. Don’t have to model dead ends. Only consider lineages that survive to modern day: computationally efficient. Based on actual observations. Can simulate different evolutionary scenarios to see what best fits the observed data.
Coalescent Tree Example. Coalescence: merging of two lineages in their Most Recent Common Ancestor (MRCA). Waiting time: time to coalescence for two lineages. Going back in time, the waiting time increases with each coalescent event because fewer lineages remain.
Probability of Coalescence. For any two lineages, it is a function of population size. With more than two lineages, it is also a function of the number of lineages, k.
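Under the standard coalescent for a diploid population, these per-generation probabilities can be written as below. The symbol N_e (effective population size) and the subscripted notation are assumptions made here, and the second expression is an approximation that holds when k is much smaller than N_e.

```latex
% Probability that two particular lineages coalesce in the previous generation
P_{c}(2) = \frac{1}{2N_e}

% With k lineages there are \binom{k}{2} pairs that could coalesce
P_{c}(k) \approx \binom{k}{2}\,\frac{1}{2N_e} = \frac{k(k-1)}{4N_e}
```

For example, with N_e = 1000 the chance that two given lineages coalesce in one generation is 1/2000, while for k = 5 lineages the chance that some pair coalesces is about 20/4000 = 1/200.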
Probability of Coalescence. Going back in time, the per-generation probability declines because lineages decrease in number. The waiting time can be approximated with a negative exponential distribution whose rate depends on k, the number of lineages.
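The negative exponential approximation can be simulated directly. The sketch assumes the standard diploid coalescent, in which the waiting time while k lineages remain is exponential with rate k(k-1)/(4N_e) per generation, so the expected time to the MRCA of a sample of k is 4N_e(1 - 1/k). Function names and parameter values are illustrative.

```python
import random

def expected_tmrca(k, n_e):
    """Expected generations back to the MRCA of k sampled gene copies."""
    return sum(4 * n_e / (j * (j - 1)) for j in range(2, k + 1))

def simulate_waiting_times(k, n_e, rng):
    """Draw one waiting time for each coalescent event, from k lineages down to 2."""
    times = []
    for j in range(k, 1, -1):
        rate = j * (j - 1) / (4 * n_e)       # per-generation coalescence rate
        times.append(rng.expovariate(rate))  # negative exponential waiting time
    return times

rng = random.Random(2012)
times = simulate_waiting_times(k=5, n_e=1000, rng=rng)
print([round(t) for t in times], "expected TMRCA:", expected_tmrca(5, 1000))
```

On average the final interval, when only two lineages remain, is the longest, which matches the increase in waiting time with each coalescent event.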
Time to Coalescence Affected by Population History Bottleneck
Time to Coalescence Affected by Population History Population Growth
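The effect of population history on coalescence time can be made concrete for a pair of lineages. The sketch assumes a diploid population with discrete generations, in which two lineages coalesce in a generation of size N with probability 1/(2N). Looking backward, the expected coalescence time is the sum over generations of the probability that the pair is still distinct. The function name, the input format, and the sizes used are assumptions.

```python
def expected_pairwise_time(recent_sizes, ancestral_size):
    """Expected generations back to coalescence for two lineages.

    recent_sizes:   population sizes going backward in time, most recent first
    ancestral_size: constant size assumed for all earlier generations
    """
    still_distinct = 1.0      # probability the two lineages have not yet coalesced
    expected = 0.0
    for size in recent_sizes:
        expected += still_distinct
        still_distinct *= 1 - 1 / (2 * size)
    # Geometric tail: the constant ancestral population adds 2N generations on average.
    return expected + still_distinct * 2 * ancestral_size

constant = expected_pairwise_time([1000] * 50, 1000)
bottleneck = expected_pairwise_time([1000] * 10 + [10] * 20, 1000)
growth = expected_pairwise_time([10000] * 100, 1000)
print(round(constant), round(bottleneck), round(growth))
```

A bottleneck pulls coalescence times toward the bottleneck itself, while recent growth from a small ancestral population leaves most coalescence in the small ancestral phase, so times are far shorter than the current large size alone would suggest.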
Time to Coalescence Affected by Population Structure
Applications of the Coalescent Approach. Framework for efficiently testing alternative models for evolution. Inferences about effective population size. Detection of population structure. Signatures of selection (coming attraction).