Published by Melvyn Willis. Modified over 9 years ago.
1
CSCE555 Bioinformatics
Lecture 12: Phylogenetics I
Meeting: MW 4:00PM-5:15PM, SWGN2A21
Instructor: Dr. Jianjun Hu
Course page: http://www.scigen.org/csce555
University of South Carolina, Department of Computer Science and Engineering, 2008
www.cse.sc.edu
HAPPY CHINESE NEW YEAR
2
Outline
◦ Introduction to evolution
◦ What is phylogeny and phylogenetics
◦ Applications of phylogenetics
◦ Algorithms for phylogenetic inference
3
How did life evolve on earth? Courtesy of the Tree of Life project An international effort to understand how life evolved on earth Biomedical applications: drug design, protein structure and function prediction, biodiversity.
4
Evolution Evolution of new organisms is driven by Mutations ◦ The DNA sequence can be changed due to single base changes, deletion/insertion of DNA segments, etc. Selection bias
5
Theory of Evolution Basic idea ◦ speciation events lead to creation of different species. ◦ Speciation caused by physical separation into groups where different genetic variants become dominant Any two species share a (possibly distant) common ancestor
6
Primate evolution
A phylogeny is a tree that describes the sequence of speciation events that led to the formation of a set of present-day species; it is also called a phylogenetic tree.
7
DNA Sequence Evolution
[Figure: a tree of DNA sequences on a time axis running from -3 mil yrs through -2 mil yrs and -1 mil yrs to today. Ancestral sequence: AAGACTT. Intermediate sequences: TGGACTT, AAGGCCT, AGGGCAT, TAGCCCT, AGCACTT. Present-day sequences: AGCGCTT, AGCACAA, TAGACTT, TAGCCCA, AGGGCAT.]
8
Morphological vs. Molecular
Classical phylogenetic analysis uses morphological features: number of legs, lengths of legs, etc.
Modern biological methods allow the use of molecular features:
◦ Gene sequences
◦ Protein sequences
◦ Whole genome sequences, e.g. rearrangements
9
Morphological topology
Groups shown: Archonta, Glires, Ungulata, Carnivora, Insectivora, Xenarthra
(Based on McKenna and Bell, 1997)
10
From sequences to a phylogenetic tree
Rat      QEPGGLVVPPTDA
Rabbit   QEPGGMVVPPTDA
Gorilla  QEPGGLVVPPTDA
Cat      REPGGLVVPPTEG
There are many possible types of sequences to use (e.g. mitochondrial vs. nuclear proteins).
11
Mitochondrial topology
Groups shown: Perissodactyla, Carnivora, Cetartiodactyla, Rodentia 1, Hedgehogs, Rodentia 2, Primates, Chiroptera, Moles+Shrews, Afrotheria, Xenarthra, Lagomorpha + Scandentia
(Based on Pupko et al.)
12
Phylogenetic trees
◦ Leaves: current-day species (or taxa, the plural of taxon)
◦ Internal vertices: hypothetical common ancestors
◦ Edge lengths: “time” from one speciation to the next
Example leaves: Aardvark, Bison, Chimp, Dog, Elephant
13
Types of Trees A natural model to consider is that of rooted trees Common Ancestor
14
Types of trees Unrooted tree represents the same phylogeny without the root node Depending on the model, data from current day species does not distinguish between different placements of the root.
15
Rooted versus unrooted trees
[Figure: an unrooted tree with three candidate root positions labeled a, b, and c. It represents the three rooted trees Tree a, Tree b, and Tree c.]
16
What is phylogenetics? Phylogenetics is the study of evolutionary relationships among and within species. ◦ Inference of trees from data ◦ Interpreting the evolutionary tree ◦ Application of evolutionary trees crocodiles birds lizards snakes rodents primates marsupials
17
What is phylogenetics? crocodiles birds lizards snakes rodents primates marsupials This is an example of a phylogenetic tree.
18
Applications of phylogenetics
◦ Forensics: Did a patient’s HIV infection result from an invasive dental procedure performed by an HIV+ dentist?
◦ Conservation: How much gene flow is there among local populations of island foxes off the coast of California?
◦ Medicine: What are the evolutionary relationships among the various prion-related diseases?
19
Applications of phylogenetics 1. Forensics Did a patient’s HIV infection result from an invasive dental procedure performed by an HIV+ dentist?
20
Phylogenetic analysis
21
So what do the results mean? 2 of 3 patients closer to dentist than to local controls. Statistical significance? More powerful analyses? Do we have enough data to be confident in our conclusions? What additional data would help? If we determine that the dentist’s virus is linked to those of patients E and G, what are possible interpretations of this pattern? How could we test between them?
22
Applications of phylogenetics 2. Conservation How much gene flow is there among local populations of island foxes off the coast of California?
23
http://bioquest.org/bedrock/
Wayne, R. K. and Morin, P. A. 2004. Conservation Genetics in the New Molecular Age. Frontiers in Ecology and the Environment 2: 89-97. (ESA publication)
24
Applications of phylogenetics 3. Medicine What are the evolutionary relationships among the various prion-related diseases?
25
Inferring Phylogenies
Trees can be inferred from:
◦ Morphology of the organisms
◦ Sequence comparison
Example:
Orc:     ACAGTGACGCCCCAAACGT
Elf:     ACAGTGACGCTACAAACGT
Dwarf:   CCTGTGACGTAACAAACGA
Hobbit:  CCTGTGACGTAGCAAACGA
Human:   CCTGTGACGTAGCAAACGA
26
How Many Trees? (assuming bifurcation only)
Columns: # sequences | # pairwise distances | Unrooted trees: # trees, # branches/tree | Rooted trees: # trees, # branches/tree
Rows (# sequences): 3, 4, 5, 6, 10, 30, N
27
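How Many Trees?

                                   Unrooted trees                  Rooted trees
# sequences  # pairwise distances  # trees        # branches/tree  # trees        # branches/tree
3            3                     1              3                3              4
4            6                     3              5                15             6
5            10                    15             7                105            8
6            15                    105            9                945            10
10           45                    2,027,025      17               34,459,425     18
30           435                   8.69 x 10^36   57               4.95 x 10^38   58

For N sequences:
# pairwise distances: $\frac{N(N-1)}{2}$
Unrooted trees: $\frac{(2N-5)!}{2^{N-3}\,(N-3)!}$ trees, $2N-3$ branches per tree
Rooted trees: $\frac{(2N-3)!}{2^{N-2}\,(N-2)!}$ trees, $2N-2$ branches per tree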
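The counting formulas on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes bifurcating trees only, as the slide does; the function names are illustrative and do not come from the lecture.

```python
from math import factorial


def num_pairwise_distances(n: int) -> int:
    """Number of distinct pairs among n sequences: n(n-1)/2."""
    return n * (n - 1) // 2


def num_unrooted_trees(n: int) -> int:
    """Bifurcating unrooted trees on n leaves: (2n-5)! / (2^(n-3) (n-3)!)."""
    if n < 3:
        raise ValueError("formula applies for n >= 3")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def num_rooted_trees(n: int) -> int:
    """Bifurcating rooted trees on n leaves: (2n-3)! / (2^(n-2) (n-2)!)."""
    if n < 2:
        raise ValueError("formula applies for n >= 2")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


if __name__ == "__main__":
    # Print one row per sequence count used on the slide.
    for n in (3, 4, 5, 6, 10, 30):
        print(n, num_pairwise_distances(n),
              num_unrooted_trees(n), 2 * n - 3,
              num_rooted_trees(n), 2 * n - 2)
```

The rapid growth of these counts is why exhaustive search over all trees quickly becomes impractical as the number of sequences grows.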
28
Phylogenetic Methods
Many different procedures exist. Three of the most popular:
◦ Neighbor-joining: minimizes distance between nearest neighbors
◦ Maximum parsimony: minimizes total evolutionary change
◦ Maximum likelihood: maximizes likelihood of observed data
29
Comparison of Methods
Neighbor-joining
◦ Very fast
◦ Easily trapped in local optima
◦ Good for generating a tentative tree, or choosing among multiple trees
Maximum parsimony
◦ Slow
◦ Assumptions fail when evolution is rapid
◦ Best option when tractable (<30 taxa, strong conservation)
Maximum likelihood
◦ Very slow
◦ Highly dependent on the assumed evolution model
◦ Good for very small data sets and for testing trees built using other methods
30
Distance-based tree construction
Goal: a weighted tree that realizes the distances between the objects.
Given a set of species (leaves in a supposed tree) and the distances between them, construct a phylogeny that best “fits” the distances.
31
Distance Matrix
Given n species, we can compute the n x n distance matrix D_ij.
D_ij may be defined as the edit distance between a gene in species i and the same gene in species j, where the gene of interest is sequenced for all n species.
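As an illustration of this definition, the sketch below builds D_ij for the five example sequences from the "Inferring Phylogenies" slide. It assumes unit-cost edit (Levenshtein) distance, where insertions, deletions, and substitutions each cost 1; the slide does not specify a cost model, and the function names are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def distance_matrix(seqs: dict) -> list:
    """n x n matrix D where D[i][j] is the edit distance between sequences i and j."""
    names = list(seqs)
    return [[edit_distance(seqs[x], seqs[y]) for y in names] for x in names]


# Example sequences copied from the "Inferring Phylogenies" slide.
SEQS = {
    "Orc": "ACAGTGACGCCCCAAACGT",
    "Elf": "ACAGTGACGCTACAAACGT",
    "Dwarf": "CCTGTGACGTAACAAACGA",
    "Hobbit": "CCTGTGACGTAGCAAACGA",
    "Human": "CCTGTGACGTAGCAAACGA",
}

D = distance_matrix(SEQS)

if __name__ == "__main__":
    for name, row in zip(SEQS, D):
        print(f"{name:7s}", row)
```

The matrix is symmetric with a zero diagonal, so only the N(N-1)/2 entries above the diagonal carry information.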
32
Distances in Trees
Edges may have weights reflecting:
◦ Number of mutations on the evolutionary path from one species to another
◦ Time estimate for evolution of one species into another
In a tree T, we often compute d_ij(T), the length of the path between leaves i and j.
33
Distance in Trees: an Example
d_{1,4} = 12 + 13 + 14 + 17 + 12 = 68
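The sum above is a path length in a weighted tree. The sketch below computes d_ij(T) by walking the unique path between two leaves. The slide's figure is not available, so the tree here is hypothetical: only the five edge weights on the path from leaf 1 to leaf 4 (12, 13, 14, 17, 12) come from the slide, while the node names and side branches are assumptions added for illustration.

```python
def build_tree(edges):
    """Adjacency map {node: {neighbor: weight}} for an undirected weighted tree."""
    tree = {}
    for u, v, w in edges:
        tree.setdefault(u, {})[v] = w
        tree.setdefault(v, {})[u] = w
    return tree


def path_length(tree, start, goal):
    """Length of the unique path between two nodes of a tree (depth-first walk)."""
    stack = [(start, None, 0)]  # (node, node we came from, distance so far)
    while stack:
        node, parent, dist = stack.pop()
        if node == goal:
            return dist
        for neighbor, weight in tree[node].items():
            if neighbor != parent:  # never walk back along the same edge
                stack.append((neighbor, node, dist + weight))
    raise ValueError(f"{goal!r} is not reachable from {start!r}")


# Hypothetical topology: internal nodes a-d and the side branches are made up;
# the weights along leaf1 -> a -> b -> c -> d -> leaf4 are the slide's values.
TREE = build_tree([
    ("leaf1", "a", 12), ("a", "b", 13), ("b", "c", 14),
    ("c", "d", 17), ("d", "leaf4", 12),
    ("a", "leaf2", 5), ("b", "leaf5", 9),   # assumed side branches
    ("c", "leaf3", 7), ("d", "leaf6", 4),   # assumed side branches
])

if __name__ == "__main__":
    print(path_length(TREE, "leaf1", "leaf4"))  # 12 + 13 + 14 + 17 + 12
```

Because a tree has exactly one path between any two nodes, the first time the walk reaches the goal its accumulated distance is d_ij(T).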
34
Fitting Distance Matrix
Given n species, we can compute the n x n distance matrix D_ij.
Evolution of these genes is described by a tree that we don’t know.
We need an algorithm to construct a tree that best fits the distance matrix D_ij.
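One way to make "best fits" concrete is to score a candidate tree by how far its path lengths d_ij(T) are from the observed D_ij. The sketch below uses a sum of squared differences over all leaf pairs. That criterion is an assumption chosen for illustration, since the slide does not say which measure of fit is meant, and the matrices are made-up toy values.

```python
def fit_error(observed, tree_dist):
    """Sum of squared differences between D_ij and d_ij(T) over all pairs i < j."""
    n = len(observed)
    return sum((observed[i][j] - tree_dist[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


# Toy observed distance matrix for three species (hypothetical values).
D_OBSERVED = [
    [0, 5, 7],
    [5, 0, 7],
    [7, 7, 0],
]

# Path lengths in a hypothetical star tree with branch lengths 2, 3 and 4:
# d_01 = 2 + 3, d_02 = 2 + 4, d_12 = 3 + 4.
D_TREE = [
    [0, 5, 6],
    [5, 0, 7],
    [6, 7, 0],
]

if __name__ == "__main__":
    print(fit_error(D_OBSERVED, D_TREE))
```

A score of 0 would mean the tree realizes the distance matrix exactly; under this criterion, a tree-building algorithm would prefer trees with a smaller score.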
35
Summary
◦ Evolution and phylogeny
◦ Concepts of phylogenetics
◦ Applications of phylogenetics
◦ Categories of phylogenetic inference algorithms
Next lecture: Detailed algorithms for phylogenetic inference
36
Acknowledgement Anonymous authors