1
Phylogenetics
2
Review: multiple sequence alignment (ClustalW)
Steps
1. Pairwise alignments
2. UPGMA or Neighbor-Joining tree based on pairwise scores (guide tree)
3. Multiple alignment informed by guide tree
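As a rough illustration of step 2, the sketch below builds a UPGMA guide tree from a table of pairwise distances. It is not ClustalW's own code; the function name `upgma`, the taxon labels, and the distances are made up for the example.

```python
from itertools import combinations

def upgma(names, dist):
    """Build a UPGMA tree as nested tuples.

    names: list of taxon labels
    dist:  dict mapping frozenset({a, b}) -> distance between taxa a and b
    """
    sizes = {name: 1 for name in names}  # cluster -> number of taxa in it
    d = dict(dist)                       # working copy, extended as clusters merge
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))

# Hypothetical pairwise distances for four taxa.
example_dist = {
    frozenset("AB"): 2, frozenset("AC"): 6, frozenset("AD"): 10,
    frozenset("BC"): 6, frozenset("BD"): 10, frozenset("CD"): 10,
}
guide_tree = upgma(["A", "B", "C", "D"], example_dist)
print(guide_tree)
```

With these toy distances A and B are joined first, then C, then D. The left/right order inside each tuple carries no meaning.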
3
Multiple sequence alignment (Phylip format)
ThNM012b   TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThlanugQH0 TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTNGTCC CNNGGCTCGG TGTGCCCCCG GGGCCCGTGC
ThNM       TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM       CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCGCGTGC
ThNM       CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCGCGTGC
Talthermo  CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCGCGTGC
ThNM       CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCCCGTGC
ThNM       CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCCCGTGC
AfumHQ6310 GGCCGCCGGG GAGGC-CTTG CGC CCCC- GGGCCCGCGC
ThNM0026A  -GCCGCCGGG GAGGC-CTTG CGC CCCC- GGGCCCGCGC
ThNM025a   -GCCGCCGGG GAGGC-CTTG CGC CCCC- GGGCCCGCGC
Aspni      GCCGCCGGG GGGGCGCCTC TGC CCCCC GGGCCCGTGC
ThNM       CCGCCGGG GGGCGTGTCC CGC CCCC- GGGCCCGCGC
ThaurantT8 --CCGCCGGG GGGCGTGTCC CGC CCCC- GGGCCCGCGC
5
Newick tree format
( Physella_anatina/gb|AY651175.:0.00993, Physa_heterostropha/gb|AY6511: , Physa_acuta/gb|AY |: ) : ) : , Physella_virgata/gb|AY : , Lymnaea_stagnalis/gb|EF : , Lymnaea_neotropica/emb|AM49400: ) : , Biomphalaria_glabrata/gi|34538: ) : );
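To make the bracket notation concrete, here is a minimal sketch of a Newick reader. It assumes a well-formed string ending in `;`, with plain labels and optional `:length` values, and does not handle quoted labels or comments. The function names and the three-taxon example string are hypothetical, not taken from the slide.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    text = "".join(text.split())  # drop all whitespace
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
        # Optional label: everything up to the next structural character.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ":".
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    return parse_node()

def leaf_names(node):
    """List the tip labels of a parsed tree, left to right."""
    name, _length, children = node
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaf_names(child)]

# Hypothetical three-taxon tree with branch lengths.
tree = parse_newick("((A:0.1,B:0.2):0.05,C:0.3);")
print(leaf_names(tree))
```

Each pair of parentheses becomes one internal node, and the number after a colon is the length of the branch leading to that node or tip.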
10
Branch length may or may not reflect distance or time
11
Three possible rooted trees with three taxa: ((A,B),C), ((A,C),B), ((B,C),A)
(with only 3 taxa there is a single unrooted tree, so an unrooted tree carries no information)
12
Three possible fully resolved unrooted trees with four taxa (A B | C D, A C | B D, A D | C B)
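These small counts are instances of a general formula: for n taxa there are (2n - 5)!! fully resolved unrooted trees and (2n - 3)!! fully resolved rooted trees, where !! is the double factorial. That gives three rooted trees for three taxa and three unrooted trees for four taxa. The short sketch below evaluates both formulas; the function names are chosen here for illustration.

```python
def double_factorial(n):
    """Product n * (n - 2) * (n - 4) * ... down to 1 (returns 1 for n <= 1)."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def num_unrooted_trees(n_taxa):
    """Number of fully resolved unrooted trees, valid for n_taxa >= 3."""
    return double_factorial(2 * n_taxa - 5)

def num_rooted_trees(n_taxa):
    """Number of fully resolved rooted trees, valid for n_taxa >= 2."""
    return double_factorial(2 * n_taxa - 3)

for n in (3, 4, 5, 10):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

The numbers grow so quickly that exhaustive comparison of all trees is only practical for a handful of taxa.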
13
Characters and character states
Important terms
 - Characters and character states
 - Ancestral versus derived (not primitive versus advanced)
14
Some examples of ancestral and derived characters
Tree tips: Invertebrate animals, Fish, Humans, Dogs, Birds
Characters mapped on the tree: upright posture; loss of body hair; feathers and feathered wings; bony limbs; bony skeleton; nervous system
Note: ancestral and derived are relative terms. In this tree, a character is ancestral with respect to nodes higher in the tree but derived with respect to nodes lower in the tree.
15
The problem of taxonomy not reflecting phylogeny
16
Polyphyletic groups contain taxa that are not derived from a single common ancestor
Old groups “algae” and “fungi” were polyphyletic
Tree tips: Brown Algae, Oomycete Fungi, Green Plants, Green Algae, True Fungi, Animals
17
Paraphyletic groups (treated here as a subset of polyphyletic) have a taxonomic group nested within another group of equal rank, so the enclosing group leaves out some descendants of its common ancestor
18
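Old “Reptilian” class is paraphyletic
Tree tips: Crocodiles, Birds, Lizards. Crocodiles share a more recent common ancestor with birds than with lizards, yet birds were placed in a separate class of equal rank.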
19
Estimating phylogeny (phylogenetics)
Distance methods (not “phylogenetic”)
 - Examples: UPGMA, Fitch-Margoliash, Neighbor Joining
 - Begin with a single measure of similarity or distance for every pair of taxa
Phylogenetic methods (“phylogenetic”)
 - Examples: parsimony, maximum likelihood, Bayesian (MrBayes)
 - Look at multiple discrete characters and use differences among character states to infer phylogeny
20
Distance (sometimes called numerical) methods rely on a single numerical value that expresses the difference or similarity for any given pairwise comparison. With DNA data, a similarity value is usually obtained by dividing the number of matching nucleotide positions by the average length of the two sequences compared.
Species 1:     CTGATCCGAGGTCAACCTTGGGTT-GTGAAGGTCGTTTTACGGCTGGAAC
               |||||||||||||||||||||| | | |||||||||||||||||||||||
Species 2: 562 CTGATCCGAGGTCAACCTTGGGGTCGCGAAGGTCGTTTTACGGCTGGAAC 513
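The calculation described on this slide can be sketched in a few lines using the two aligned sequences shown. This is an illustrative reading of the slide's wording, not a standard library routine: gap characters never count as matches, sequence length is measured without gaps, and taking the distance as one minus the similarity is an assumption made here.

```python
def pairwise_similarity(aligned_a, aligned_b, gap="-"):
    """Return (matches, similarity) for two aligned sequences.

    similarity = matching positions / average ungapped sequence length
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    len_a = len(aligned_a.replace(gap, ""))
    len_b = len(aligned_b.replace(gap, ""))
    similarity = matches / ((len_a + len_b) / 2)
    return matches, similarity

species_1 = "CTGATCCGAGGTCAACCTTGGGTT-GTGAAGGTCGTTTTACGGCTGGAAC"
species_2 = "CTGATCCGAGGTCAACCTTGGGGTCGCGAAGGTCGTTTTACGGCTGGAAC"

matches, similarity = pairwise_similarity(species_1, species_2)
distance = 1 - similarity  # assumed conversion from similarity to distance
print(matches, round(similarity, 4), round(distance, 4))
```

The match count corresponds to the 47 vertical bars in the alignment; the two mismatches and the one gap position are the columns without a bar.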
24
Character #2 (A = -, B = +, C = -, D = +)
Tree #1: A(-) B(+) | C(-) D(+)
Tree #2: A(-) C(-) | B(+) D(+)
Tree #3: A(-) D(+) | B(+) C(-)
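One way to see how a character discriminates among the three trees is to count the minimum number of state changes each tree requires, which is what parsimony scores. The sketch below uses Fitch's counting procedure and assumes the slide's symbols read A = -, B = +, C = -, D = +. Each unrooted tree is written as a rooted pair of pairs, since the root position does not affect the count. The function name is illustrative.

```python
def fitch_score(tree, states):
    """Return (possible_states, changes) for a tree of nested 2-tuples.

    tree:   a taxon name (str) or a pair (left_subtree, right_subtree)
    states: dict mapping taxon name -> observed character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_score(tree[0], states)
    right_set, right_changes = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra change needed.
        return common, left_changes + right_changes
    # No shared state: one change is required at this node.
    return left_set | right_set, left_changes + right_changes + 1

# Character states as read from the slide (an assumption of this sketch).
character_2 = {"A": "-", "B": "+", "C": "-", "D": "+"}

trees = {
    "Tree #1": (("A", "B"), ("C", "D")),
    "Tree #2": (("A", "C"), ("B", "D")),
    "Tree #3": (("A", "D"), ("B", "C")),
}
scores = {name: fitch_score(t, character_2)[1] for name, t in trees.items()}
print(scores)
```

Tree #2 groups the taxa that share a state, so it needs a single change, while the other two trees need two changes each. Under parsimony this character therefore favors Tree #2.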
25
Some potential pitfalls with molecular data
26
Paralogs versus Orthologs
Orthologs - homologous genes that reflect speciation
Paralogs - homologous genes that reflect gene duplication = members of a gene family in a single organism (examples: alpha versus beta hemoglobin; red versus green visual pigment proteins)
Important to distinguish between these when doing comparative analyses (it’s sometimes hard to tell)
27
The Problem of Multiple Hits
29
Among other problems, this causes “long-branch attraction”
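A common textbook correction for multiple hits, not shown on the slides, is the Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below only illustrates how observed differences increasingly underestimate the number of substitutions; it assumes equal base frequencies and equal rates for all substitution types.

```python
import math

def jukes_cantor_distance(p):
    """Estimated substitutions per site from an observed difference proportion p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75 under Jukes-Cantor")
    return -0.75 * math.log(1 - 4 * p / 3)

for p in (0.05, 0.25, 0.50, 0.70):
    print(p, round(jukes_cantor_distance(p), 3))
```

For small p the correction is negligible, but as p approaches 0.75 the estimated distance grows without bound. Long branches sit in that saturated range, where chance similarities can mislead tree building.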
30
Scoring in phylogenetic methods is model dependent
31
General idea applies to protein amino-acid sequences as well
32
Can convert to scoring matrix based on log probabilities
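As a sketch of what such a conversion might look like, the example below turns a simple substitution model into a matrix of log probabilities and scores an alignment by summing them. The three probabilities (identical base, transition, transversion) are invented for illustration and do not come from the slides.

```python
import math

PURINES = {"A", "G"}

def substitution_probability(x, y, p_same=0.90, p_transition=0.06,
                             p_transversion=0.02):
    """Hypothetical model: each row sums to 0.90 + 0.06 + 2 * 0.02 = 1."""
    if x == y:
        return p_same
    same_class = (x in PURINES) == (y in PURINES)
    return p_transition if same_class else p_transversion

def log_score_matrix(bases="ACGT"):
    """Map each ordered base pair to the log of its substitution probability."""
    return {
        (x, y): math.log(substitution_probability(x, y))
        for x in bases
        for y in bases
    }

def score_alignment(seq_a, seq_b, scores):
    """Sum per-site log scores; equals the log of the product of probabilities."""
    return sum(scores[(x, y)] for x, y in zip(seq_a, seq_b))

scores = log_score_matrix()
print(round(score_alignment("ACGT", "ACGT", scores), 3))
print(round(score_alignment("ACGT", "GCGT", scores), 3))  # one transition
print(round(score_alignment("ACGT", "CCGT", scores), 3))  # one transversion
```

Working with logs turns a product of per-site probabilities into a sum, and changing the model's probabilities changes which alignments or trees score best.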
33
Physa_hete AA CATTATATTT
Physa_acut AC CATTATATTT
Physella_a AA CATTATATTT
Physella_v AA CATTATATTT
Lymnaea_st TTTAT
Lymnaea_ne GATATTGGTA CTTTATATAT
Biomphalar TTGCGTTGAC TCTTTTCAAC AAACCATAAA GATATTGGTA CTTTGTACAT

AATTTTTGGG ATCTGGTGTG GATTGGTCGG TACAGGTTTA AGCTTGTTAA
AATTTTTGGT GTTTGATGCG GTTTAGTGGG AACAGGTTTA TCCTTATTAA
AATCTTTGGA ATCTGATGCG GGTTAGTAGG GACTGGATTG TCTTTATTAA
AATTTTTGGA ATTTGGTGTG GTCTAGTTGG TACTGGATTA TCATTATTGA
etc.
34
Bootstrap Analysis
35
Physa_hete AA CATTATATTT
Physa_acut AC CATTATATTT
Physella_a AA CATTATATTT
Physella_v AA CATTATATTT
Lymnaea_st TTTAT
Lymnaea_ne GATATTGGTA CTTTATATAT
Biomphalar TTGCGTTGAC TCTTTTCAAC AAACCATAAA GATATTGGTA CTTTGTACAT

AATTTTTGGG ATCTGGTGTG GATTGGTCGG TACAGGTTTA AGCTTGTTAA
AATTTTTGGT GTTTGATGCG GTTTAGTGGG AACAGGTTTA TCCTTATTAA
AATCTTTGGA ATCTGATGCG GGTTAGTAGG GACTGGATTG TCTTTATTAA
AATTTTTGGA ATTTGGTGTG GTCTAGTTGG TACTGGATTA TCATTATTGA
etc.
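Bootstrap analysis of an alignment like this one resamples its columns (sites) with replacement to build pseudo-alignments of the same size. A tree is then inferred from each replicate, and the fraction of replicates containing a given grouping is reported as its support. The sketch below shows only the resampling step, on a small made-up alignment; the function name and toy sequences are hypothetical.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (all equal length)
    rng:       a random.Random instance, so results can be reproduced
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices; some columns repeat, others are left out.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }

# Toy alignment (hypothetical data, not the snail sequences above).
toy_alignment = {
    "taxon_1": "ACGTAC",
    "taxon_2": "ACGTTC",
    "taxon_3": "AGGTTC",
}

rng = random.Random(0)  # fixed seed for repeatability
for _ in range(3):
    print(bootstrap_replicate(toy_alignment, rng))
```

The same column indices are applied to every sequence, so each resampled site keeps its character states for all taxa together.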