Some basics
Homology refers to a structure, behavior, or other character of two taxa that is derived from the same or equivalent feature of a common ancestor.
Homology applies to nucleotide sequences:
g t c c c a t
g t c t c a t
A substitution has occurred at position 4.
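As a minimal sketch, the position of the substitution in the two example sequences can be found by comparing them site by site. The function name `differing_positions` is an illustrative assumption, and the comparison assumes the sequences are already aligned and of equal length.

```python
def differing_positions(seq_a, seq_b):
    """Return 1-based positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# The two sequences from the slide, written without spaces.
positions = differing_positions("gtcccat", "gtctcat")
print(positions)  # [4]
```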
Sequence alignment
Rapid sequence divergence, or divergence over many generations, can leave little in common between two sequences and make alignment difficult or impossible.
Indels: it may be impossible to distinguish between an insertion in one sequence and a deletion in another sequence.
Example: mtDNA 12S rRNA in six genera of “stifftail” ducks
CCACCT-GT---TTCAAAA-CTCAGGCCTT
TCACCTAGC---TCCAAA--C-TAGGCCTT
CTGCCT-AC---TTCCC---C-CAGGCCTT
TCGCCT-AC---T-CAA---C-CAGGCTTT
TCGCCT-ACATTTTCCC---C-CAGGCTTT
Many alignment methods exist; all use algorithms that seek to maximize the number of matching nucleotides or amino acids and minimize the number of indels.
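The trade-off between matches and indels can be sketched with a global alignment score computed by dynamic programming, in the style of the Needleman-Wunsch algorithm. This is one possible approach, not necessarily the method used for the duck alignment; the scoring values (match +1, mismatch -1, gap -2) and the function name are illustrative assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b.

    Scoring values are illustrative assumptions: each matching pair adds
    `match`, each mismatching pair adds `mismatch`, each gap adds `gap`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "ACGT"))  # 4
print(global_alignment_score("ACGT", "ACT"))   # 1
```

Raising the gap penalty makes the procedure prefer mismatches over indels, which is one reason different alignment settings can produce different alignments of the same sequences.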
Maximum Parsimony
Occam’s Razor: Entia non sunt multiplicanda praeter necessitatem (entities are not to be multiplied beyond necessity).
William of Occam
The best tree is the one that requires the fewest substitutions.
[Figure: two alternative trees for outgroup, taxon 1, and taxon 2, one labelled "Most parsimonious" and the other "Less parsimonious"]
Assumptions: independence
The method assumes that a change at one site has no effect on other sites.
RNA stem-loop structures are a good example of where this assumption can fail.
[Figure: stem-loop whose paired stem strands read CCCCUU and GGGGAA]
A substitution may result in mismatched bases and decreased stem stability (stem strands CCCCUU and GGGCAA).
A compensatory change may then occur to restore Watson-Crick base pairing (stem strands CCCGUU and GGGCAA).
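The stem-loop example can be checked with a short sketch that tests each stem position for Watson-Crick pairing. The strand strings are the stem bases as they appear on the slide, read so that bases at the same index face each other; that reading, and the function name `mismatched_positions`, are assumptions made for illustration. Only canonical Watson-Crick pairs are treated as matches.

```python
# Canonical Watson-Crick pairs in RNA.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatched_positions(top, bottom):
    """Return 1-based stem positions whose facing bases do not pair."""
    if len(top) != len(bottom):
        raise ValueError("stem strands must have the same length")
    return [
        i
        for i, pair in enumerate(zip(top, bottom), start=1)
        if pair not in WATSON_CRICK
    ]


original = mismatched_positions("CCCCUU", "GGGGAA")
after_substitution = mismatched_positions("CCCCUU", "GGGCAA")
after_compensation = mismatched_positions("CCCGUU", "GGGCAA")

print(original)            # []
print(after_substitution)  # [4]
print(after_compensation)  # []
```

The second substitution is only favourable because of the first one, which is exactly the kind of dependence between sites that the independence assumption ignores.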
Types of substitution
Substitutions that exchange a purine for another purine (A ↔ G) or a pyrimidine for another pyrimidine (C ↔ T) are called transitions.
Substitutions that exchange a purine for a pyrimidine, or vice versa, are called transversions.
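A minimal sketch of this classification for DNA bases follows. The function name `classify_substitution` is an illustrative assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old, new):
    """Classify a DNA base substitution as a transition or a transversion."""
    old, new = old.upper(), new.upper()
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if old == new:
        raise ValueError("identical bases are not a substitution")
    # Same chemical class (both purines or both pyrimidines): transition.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("C", "T"))  # transition
print(classify_substitution("A", "C"))  # transversion
```

Of the twelve possible directed substitutions between the four bases, four are transitions and eight are transversions.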
Differing rates of DNA evolution
Functional constraints (particular features of coding regions, particular features in 5' or 3' untranslated regions).
Variation among different gene regions with different functions (different parts of a protein may evolve at different rates).
Within proteins, variations are observed between
–surface and interior amino acids in proteins (order of magnitude difference in rates in haemoglobins)
–charged and non-charged amino acids
–protein domains with different functions
–regions which are strongly constrained to preserve particular functions and regions which are not
–different types of proteins: those with constrained interaction surfaces and those without
Assumptions: variation in substitution rate across sites
Not all sites are equally likely to undergo a substitution.
[Figure: substitution rates, in substitutions per site per 10^9 years, for 5' flanking region, 5' untranslated region, non-degenerate sites, twofold degenerate sites, fourfold degenerate sites, introns, 3' untranslated region, 3' flanking region, and pseudogenes]
Rates of macromolecular evolution
Homology
Orthologs
–Divergence (sequence change) follows speciation
–Similarity can be used to construct phylogeny
–Multiple orthologs can be present
Paralogs
–Divergence follows duplication
Xenologs
–Horizontal interspecies transfer of genes
Homoplasy (similarity not due to common ancestry)
–Parallelism
–Convergence
–Reversal
Orthologs and paralogs
Homology vs. Homoplasy
Homology: similar traits inherited from a common ancestor.
Homoplasy: similar traits that are not inherited from a common ancestor (for example, through convergent evolution).
Unique and unreversed characters - Hair
Because hair evolved only once and is unreversed, it is homologous and provides unambiguous evidence for the clade Mammalia.
[Figure: tree of Lizard, Frog, Human, and Dog with HAIR mapped as absent or present; a mark on a branch denotes a change or step]
Homoplasy - independent evolution - Tails
Loss of tails evolved independently in humans and frogs; there are two steps on the true tree.
[Figure: tree of Human, Lizard, Frog, and Dog with TAIL mapped as absent or present]
Homoplasy - misleading evidence of phylogeny
If misinterpreted as homology, the absence of tails would be evidence for a wrong tree grouping humans with frogs and lizards with dogs.
[Figure: wrong tree of Human, Frog, Lizard, and Dog with TAIL mapped as absent or present]
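The step counts in the hair and tail examples can be reproduced with a small-parsimony sketch based on Fitch's algorithm, which counts the minimum number of state changes a character needs on a given rooted binary tree. The tree topologies are written out here as assumptions: the true tree is taken to be (Frog, (Lizard, (Human, Dog))) and the wrong tree ((Human, Frog), (Lizard, Dog)). The function name `fitch_steps` is also illustrative.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one character on a rooted tree.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees;
    `states` maps each leaf name to its observed character state.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        shared = left_set & right_set
        if shared:
            # Children agree on at least one state: no extra change needed.
            return shared, left_cost + right_cost
        # Children disagree: one change is required on this part of the tree.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Assumed topologies (not spelled out in the text of the slides).
true_tree = ("Frog", ("Lizard", ("Human", "Dog")))
wrong_tree = (("Human", "Frog"), ("Lizard", "Dog"))

hair = {"Frog": "absent", "Lizard": "absent", "Human": "present", "Dog": "present"}
tail = {"Frog": "absent", "Lizard": "present", "Human": "absent", "Dog": "present"}

print(fitch_steps(true_tree, hair))   # 1
print(fitch_steps(true_tree, tail))   # 2
print(fitch_steps(wrong_tree, tail))  # 1
print(fitch_steps(wrong_tree, hair))  # 2
```

Taken alone, the tail character needs fewer steps on the wrong tree, which is how homoplasy misleads; hair, a unique and unreversed character, needs fewer steps on the true tree.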
Homology: limb structure
Homoplasy: wings
Homoplasy in molecular data
Incongruence, and therefore homoplasy, can be common in molecular sequence data.
One reason is that characters have a limited number of alternative character states (e.g. A, G, C and T).
In addition, a given state is chemically identical however it arose, so homologous and homoplastic similarities look the same and cannot be distinguished through detailed study of structure or development.
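Types of substitution
Each case starts from ancestral state A in two lineages; the last column gives the states observed in the two descendant sequences.
Single: A→C in one lineage; 1 change, 1 difference; A / C
Multiple: A→C then C→T in one lineage; 2 changes, 1 difference; A / T
Coincidental: A→C in one lineage, A→G in the other; 2 changes, 1 difference; G / C
Parallel: A→C in both lineages; 2 changes, no difference; C / C
Convergent: A→C then C→T in one lineage, A→T in the other; 3 changes, no difference; T / T
Back: A→C then C→A in one lineage; 2 changes, no difference; A / A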
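The gap between the number of changes that occurred and the number of differences that can be observed is easy to tabulate. In this sketch each lineage is written as the list of states it passed through, starting from ancestral state A; the representation and the function name `summarize` are illustrative assumptions.

```python
def summarize(lineage_1, lineage_2):
    """Return (changes, differences) for one site in two lineages.

    Each lineage lists the states it passed through, ancestor first.
    """
    changes = (len(lineage_1) - 1) + (len(lineage_2) - 1)
    # Only the final states are visible in present-day sequences.
    differences = int(lineage_1[-1] != lineage_2[-1])
    return changes, differences


scenarios = {
    "single":       (["A", "C"], ["A"]),
    "multiple":     (["A", "C", "T"], ["A"]),
    "coincidental": (["A", "C"], ["A", "G"]),
    "parallel":     (["A", "C"], ["A", "C"]),
    "convergent":   (["A", "C", "T"], ["A", "T"]),
    "back":         (["A", "C", "A"], ["A"]),
}

results = {name: summarize(*paths) for name, paths in scenarios.items()}
for name, (changes, differences) in results.items():
    print(f"{name}: {changes} change(s), {differences} difference(s)")
```

Only the single substitution leaves as many observable differences as there were changes; in every other case a count of differences between the two sequences underestimates the amount of change.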
Homoplasy - reversal
Reversals are evolutionary changes back to an ancestral condition.
As with any homoplasy, reversals can provide misleading evidence of relationships.
[Figure: a true tree and a wrong tree]