Methods of molecular phylogeny

Viral Evolution and Recombination Peter Norberg

Methods of molecular phylogeny Peter Norberg (Peter.norberg@gu.se)

Content: Introduction to evolution and taxonomy; Phylogenetic analysis; Algorithmics; Applied phylogenetics; Computer software; Practical session

Evolution: Charles Darwin, "Tree of life". Phylogenetic tree. Root = ancestor of all species.

Rooted or unrooted trees? Trees show evolutionary relationships. The root shows the direction of evolution.

Different representations (figure: trees of taxa A, B, C and D drawn in several different layouts)

Trees can be based on: outer appearance (for example, the shape of bills); functionality; complexity; a combination of ...; DNA, RNA, amino acids (AA), gene order, ...

Phylogenetic trees based on DNA (figure: tree whose nodes are labelled with the sequences AATTGGCC, AATAGGCC, AATAGGCA, AGTTGGCG, AATAGGAC, TATTGGCG and AATTGGCG)

Phylogenetic trees based on DNA (figure: the same sequences arranged in a tree: AATTGGCC, AATAGGCC, AATAGGAC, AATTGGCG, AGTTGGCG, TATTGGCG, AATAGGCA)

Genomic region: Same genomic region for all taxa! Not too similar. Not too diverged. Insertions/deletions.

Sequence alignment
Not aligned:
(1) AATGGCAACCGCATTCAGGATTTAA
(2) AATGGTAACCGCAAGGATTTAA
(3) ATGGTAACCGCATTGAGGATTTAA
(4) AATGGTAACCGCATTCAGGAATTA
(5) TGGTAACCGCATTCAGGATTTAA
Aligned:
(1) AATGGCAACCGCATTCAGGATTTAA
(2) AATGGTAACCGCAA---GGATTTAA
(3) -ATGGTAACCGCATTGAGGATTTAA
(4) AATGGTAACCGCATTCAGGAATTA-
(5) --TGGTAACCGCATTCAGGATTTAA

Sequence alignment, our example: AATTGGCC, AATAGGCC, AATTGGCG, AATAGGAC, AGTTGGCG, TATTGGCG, AATAGGCA (the aligned and unaligned columns are identical)

Phylogenetic principles: Similar DNA sequences = closely related. Inherited mutations. Simplest "route"! Homoplasy is assumed to be unlikely (not always true).

Homology vs. homoplasy: Homology = similarity due to a common ancestor. Homoplasy = similarity due to convergent evolution, that is, independent origins.

Algorithms for constructing phylogenetic trees What is an algorithm? Several different phylogenetic algorithms exist. How do they work?

Algorithms for constructing phylogenetic trees: Distance matrices (Neighbour Joining, UPGMA); Maximum Parsimony; Maximum Likelihood; Bayesian inference

Distance matrices: Based on the genetic distance. Genetic distance is based on nucleotide substitutions, typically # of differences / total # of nt.
Sequences:
1  AATTCCGG
2  AATACCGG
3  AATTAATG
Number of differences:
     1      2      3
1    0
2    1      0
3    3      4      0
Differences / total # of nt:
     1      2      3
1    0
2    0.125  0
3    0.375  0.5    0
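The distance calculation on this slide can be sketched in a few lines of Python. This is only an illustration of "# of differences / total # of nt" for the three example sequences; the function names `p_distance` and `distance_matrix` are made up for this sketch, and the sequences are assumed to be aligned already.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences / len(a)


def distance_matrix(seqs: dict) -> dict:
    """Pairwise p-distances for every unordered pair of sequences."""
    return {(i, j): p_distance(seqs[i], seqs[j]) for i, j in combinations(seqs, 2)}


# The three sequences from the slide.
sequences = {"1": "AATTCCGG", "2": "AATACCGG", "3": "AATTAATG"}
matrix = distance_matrix(sequences)
for pair, d in matrix.items():
    print(pair, d)
```

The printed values match the lower-triangle matrix on the slide (0.125, 0.375 and 0.5).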

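Neighbour Joining: Cluster in pairs. Shortest distance first => similar sequences are located close together in the tree. Fast algorithm!
     1      2      3
1    0
2    0.125  0
3    0.375  0.5    0
(figure: resulting tree with sequences 2 and 1 joined first and 3 added afterwards; a second tree with taxa A, B, C, D)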
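The "cluster in pairs, shortest distance first" idea can be sketched as below. Note the assumption: this is a simplified, UPGMA-style average-linkage clustering, not full Neighbour Joining, which additionally corrects each distance for the average divergence of the two taxa before choosing a pair. The function name `cluster_pairs` and the nested-tuple tree representation are choices made for this sketch.

```python
def cluster_pairs(names, dist):
    """Greedy pairwise clustering: repeatedly merge the two closest clusters.

    `dist` maps (a, b) -> distance for each pair of leaf names.
    Returns the tree as nested tuples. Distances to a merged cluster are
    size-weighted averages (UPGMA-style, a simplification of NJ).
    """
    clusters = list(names)
    size = {name: 1 for name in names}
    d = {}
    for (a, b), value in dist.items():
        d[a, b] = d[b, a] = value          # store both orders for easy lookup

    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest distance.
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: d[pair],
        )
        merged = (a, b)
        size[merged] = size[a] + size[b]
        rest = [c for c in clusters if c != a and c != b]
        for c in rest:
            avg = (size[a] * d[a, c] + size[b] * d[b, c]) / size[merged]
            d[merged, c] = d[c, merged] = avg
        clusters = rest + [merged]
    return clusters[0]


# Distances from the slide's matrix.
distances = {("1", "2"): 0.125, ("1", "3"): 0.375, ("2", "3"): 0.5}
tree = cluster_pairs(["1", "2", "3"], distances)
print(tree)
```

Sequences 1 and 2 are joined first because they have the shortest distance, and 3 is attached afterwards.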

Maximum Parsimony: Utilizes so-called informative sites. Simplest path (fewest mutations). Build all possible trees. Choose the tree that requires the fewest mutations. Relatively fast.

Maximum Parsimony, example (figure: the sequences AATTCC, AAGTCC and AAGTCT, and the alternative trees for taxa 1, 2, 3 and 4 with the mutation "a" marked on the branches where it is required)
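A minimal sketch of how the mutations on each candidate tree could be counted, using Fitch's small-parsimony algorithm. Assumptions: sequences 1 to 3 are the ones shown on the slide, sequence 4 ("ACGTCT") is hypothetical and added only so that there are four taxa, and the trees are written as nested tuples with an arbitrary root (the Fitch score does not depend on where the root is placed).

```python
from collections import Counter


def informative_sites(seqs):
    """Columns where at least two states each occur in two or more taxa."""
    length = len(next(iter(seqs.values())))
    sites = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs.values())
        if sum(1 for c in counts.values() if c >= 2) >= 2:
            sites.append(i)
    return sites


def fitch_site(tree, states):
    """Return (possible ancestral states, number of changes) for one column."""
    if isinstance(tree, str):                      # leaf
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:                                     # no change needed here
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1          # one extra mutation


def parsimony_score(tree, seqs):
    """Total number of mutations the tree requires over all columns."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(length)
    )


# "4" is a hypothetical fourth sequence, not taken from the slide.
seqs = {"1": "AATTCC", "2": "AAGTCC", "3": "AAGTCT", "4": "ACGTCT"}
topologies = [
    (("1", "2"), ("3", "4")),
    (("1", "3"), ("2", "4")),
    (("1", "4"), ("2", "3")),
]
scores = {t: parsimony_score(t, seqs) for t in topologies}
best = min(scores, key=scores.get)
print(informative_sites(seqs), scores, best)
```

Only the last column is informative here, and it is the column that separates the three topologies: grouping 1 with 2 and 3 with 4 needs three mutations, the other two trees need four.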

Maximum Likelihood and Bayesian inference: Statistical methods that include an evolutionary model. Combine the likelihoods of all alignment columns (sum the log-likelihoods). Calculate the likelihood for all possible trees. Good but slow! Bayesian inference is often faster.
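To make "an evolutionary model" and "likelihood for all columns" concrete, here is a small sketch for the simplest possible case: two aligned sequences under the Jukes-Cantor (JC69) model, which is an assumption of this example and is not named in the slides. The log-likelihood is summed column by column, and for this model the distance that maximises it has a closed form.

```python
import math


def jc69_log_likelihood(a, b, d):
    """Log-likelihood of two aligned sequences at distance d (substitutions/site)."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e      # probability a site shows the same nucleotide
    p_diff = 0.25 - 0.25 * e      # probability of one specific different nucleotide
    total = 0.0
    for x, y in zip(a, b):
        # 0.25 is the equilibrium frequency of the nucleotide in sequence a.
        total += math.log(0.25 * (p_same if x == y else p_diff))
    return total


def jc69_distance(a, b):
    """Distance that maximises the JC69 likelihood (valid when p < 0.75)."""
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


a, b = "AATTCCGG", "AATACCGG"
d_hat = jc69_distance(a, b)
print(d_hat, jc69_log_likelihood(a, b, d_hat))
```

A full maximum likelihood analysis repeats this kind of calculation over every branch of every candidate tree, which is why the method is slow.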

To test all possible trees: Is it possible? => Takes too much time! Analyzing 20 taxa gives ~10^22 different possible trees (10,000,000,000,000,000,000,000). What to do? => Use sophisticated algorithms to limit the search space. These usually produce good results, but not necessarily the best.
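The size of the search space can be checked with the standard counting formulas for bifurcating trees with n labelled taxa: (2n-3)!! rooted trees and (2n-5)!! unrooted trees. The sketch below uses these formulas; the function names are chosen for this example.

```python
def odd_double_factorial(k):
    """k!! for odd k: k * (k - 2) * ... * 3 * 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_trees(n):
    """Number of rooted bifurcating trees with n labelled taxa: (2n-3)!!"""
    return odd_double_factorial(2 * n - 3) if n >= 2 else 1


def unrooted_trees(n):
    """Number of unrooted bifurcating trees with n labelled taxa: (2n-5)!!"""
    return odd_double_factorial(2 * n - 5) if n >= 3 else 1


for n in (4, 5, 10, 20):
    print(n, unrooted_trees(n), rooted_trees(n))
```

For 20 taxa the rooted count is about 8 x 10^21, which is the order of magnitude quoted on the slide; the unrooted count is about 2 x 10^20.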

To root an unrooted tree: Include an "outgroup". Outgroup = more distantly related (but not too distantly). Place the root where the outgroup connects to the tree.

Rooting a tree (figure: trees of taxa A, B, C, D, E, F and G with the outgroup marked)

Significance: Is the tree reliable? Is it the only probable tree? Bootstrap, jackknife etc.

Bootstrap: Construct several new sequence sets (for example 1000). A new sequence set is generated by randomly picking columns, with replacement, from the original set. Apply the phylogenetic algorithm to all sets. Make one consensus tree from all trees.

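Bootstrapping
Original set:
A: AACTTAACCACGCTATCGATGCAATTATATA
B: AATTTGACTGCGGTACCGATCCAATTATATA
C: AATTTGACTGGGCTACCGATCCAATTATATA
D: AACTTAACCGCGCTACTGATCGAATTATATA
Randomly picked columns (start of one new set):
A: CACC
B: TGCT
C: TGCT
D: CAGC
(figure: the three possible groupings with their bootstrap values: A D / B C = 96, A C / B D = 1, A B / C D = 3)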
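The resampling step can be sketched as follows, using the four sequences from the slide. Columns are drawn at random with replacement, so some columns appear several times in a new set and others not at all. The function names and the fixed random seed are choices made for this sketch; building a tree from each set and forming the consensus is left out.

```python
import random


def bootstrap_alignment(seqs, rng):
    """One bootstrap replicate: sample alignment columns with replacement."""
    length = len(next(iter(seqs.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(s[i] for i in columns) for name, s in seqs.items()}


def bootstrap_sets(seqs, n_sets=1000, seed=0):
    """Generate n_sets resampled alignments (seed fixed for reproducibility)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(seqs, rng) for _ in range(n_sets)]


alignment = {
    "A": "AACTTAACCACGCTATCGATGCAATTATATA",
    "B": "AATTTGACTGCGGTACCGATCCAATTATATA",
    "C": "AATTTGACTGGGCTACCGATCCAATTATATA",
    "D": "AACTTAACCGCGCTACTGATCGAATTATATA",
}
replicates = bootstrap_sets(alignment)
for name, seq in replicates[0].items():
    print(name, seq)
```

Each replicate keeps the same taxa and the same alignment length as the original, so the same tree-building algorithm can be applied to every set.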

Pitfalls? Homoplasy (convergent evolution): selection pressure, hypervariable regions, random events. Gene duplication. Recombination: different regions have different ancestries.

Recombination (figure: sequences A and B undergo recombination and give rise to recombinants)

Detection of recombinants (figure: tree with taxa A, B, C, D, E, F, G, H, I and X)

Detection of recombinants (figure: taxa A to I shown twice, with H and X marked)

Phylogenetic networks (figure: trees and networks relating taxa A, B, C and D and the recombinant R)

Applied phylogenetics: Reconstruct evolutionary history (animals, plants, bacteria, viruses, plasmids, ...). Establish evolutionary mechanisms. Functional studies. Trace pandemic diseases. Forensic medicine.

Examples

Practical session

Phylip: Software package for phylogenetic analysis. Several small (command-line) applications. Many different algorithms. Widely used by the scientific community.
seqboot -> Constructs bootstrap sets
dnapars -> Constructs a maximum parsimony tree
consense -> Constructs a consensus tree
drawtree -> Draws the tree

Herpes Simplex Virus Type 1 & 2: Usually asymptomatic. Cause oral and genital lesions, encephalitis, meningitis and keratitis. Transferred via direct contact. Lifelong infection in the sensory ganglia. HSV-1: 70-80%, HSV-2: 20-30%. ~100 nm in diameter. Capsid surrounded by an envelope. Different glycoproteins in the envelope. Photo by Linda M. Stannard, University of Cape Town.

HSV-1 US7 (Glycoprotein I)

Clinical samples