Current Approaches to Whole Genome Phylogenetic Analysis Hongli Li.


Content
- Background
- Genome Evolution
- Phylogenetic Analysis
- Performing Statistical Tests
- Phylogenetic Networks
- Conclusion

Phylogenetic Analysis Background
- Early attempts: based on morphological characters
  - Directly comparing genes makes more sense
- Modern attempts: using sequences from individual homologous genes
  - A gene’s evolutionary history might not be the same as the evolutionary history of its organism
  - Some genes that are sufficiently conserved across all species of interest might not be identified

Genome Evolution
- Prokaryotes
  - Relatively simple
  - Prokaryote evolutionary history cannot properly be represented by a tree
- Eukaryotes
  - More complicated
  - Frequent inversions of small segments, gene duplication and loss, and polyploidy events
- Organellar genomes
  - Eukaryotic cells contain a smaller and simpler mitochondrial genome
  - Plant species also have a chloroplast genome

Genome Evolution (cont.)
- Model of genome evolution
  - Nadeau-Taylor model

Phylogenetic Analysis – Binary Character Encoding
- Binary character encoding
  - Encoding the presence or absence of particular genes or protein families is straightforward, whereas encoding gene order is not
  - Many different approaches exist
- Natural restrictions
  - A gene cannot be adjacent to more than two others
  - An evolutionary event will create two adjacencies and break two
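The encoding idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the encoding used by any particular published method: genomes are assumed to be linear lists of signed integers (sign = strand), each gene is assumed to occur at most once per genome, and the function names are hypothetical.

```python
def encode_gene_content(genomes):
    """Presence/absence characters: one column per gene seen in any genome."""
    all_genes = sorted({abs(g) for order in genomes.values() for g in order})
    matrix = {}
    for name, order in genomes.items():
        present = {abs(g) for g in order}
        matrix[name] = [int(gene in present) for gene in all_genes]
    return all_genes, matrix


def extremities(gene):
    """Return the (entry, exit) ends of a signed gene, read left to right."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(order):
    """Unordered pairs of gene ends that touch in a linear signed gene order."""
    return {
        frozenset((extremities(left)[1], extremities(right)[0]))
        for left, right in zip(order, order[1:])
    }


def encode_adjacencies(genomes):
    """Gene-order characters: one column per adjacency seen in any genome."""
    adj_sets = {name: adjacencies(order) for name, order in genomes.items()}
    all_adj = sorted(set().union(*adj_sets.values()), key=sorted)
    matrix = {name: [int(a in s) for a in all_adj] for name, s in adj_sets.items()}
    return all_adj, matrix


if __name__ == "__main__":
    # Genome B differs from A by an inversion of gene 2.
    example = {"A": [1, 2, 3], "B": [1, -2, 3]}
    columns, rows = encode_adjacencies(example)
    for name, row in rows.items():
        print(name, row)
```

In this toy example the single inversion removes two adjacency characters and introduces two new ones, which is the restriction noted on the slide.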

Phylogenetic Analysis – Distance Methods
- Distance methods
  - Smallest number of evolutionary events between two genomes
  - Breakpoint distance
- Defining the distance between two genomes with unequal gene content is a problem
- Several software packages are available for distance analysis
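As a rough illustration of the breakpoint distance, the sketch below counts the adjacencies of one genome that are missing from the other. The assumptions are mine rather than the slides': genomes are linear lists of signed integers, each gene occurs exactly once, the chromosome ends are not counted as adjacencies (some definitions add end markers), and unequal gene content is simply rejected rather than handled. The function names are hypothetical.

```python
def extremities(gene):
    """Return the (entry, exit) ends of a signed gene, read left to right."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(order):
    """Unordered pairs of gene ends that touch in a linear signed gene order."""
    return {
        frozenset((extremities(left)[1], extremities(right)[0]))
        for left, right in zip(order, order[1:])
    }


def breakpoint_distance(genome_a, genome_b):
    """Number of adjacencies in genome_a that do not occur in genome_b."""
    if sorted(abs(g) for g in genome_a) != sorted(abs(g) for g in genome_b):
        # Unequal gene content needs a different definition; not covered here.
        raise ValueError("genomes must contain the same genes")
    return len(adjacencies(genome_a) - adjacencies(genome_b))


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # segment 2..3 inverted
    print(breakpoint_distance(reference, inverted))
```

Because adjacencies are stored as unordered pairs of gene ends, reading a genome from the opposite strand gives distance zero, which is the behaviour one would want from a gene-order distance.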

Phylogenetic Analysis – Maximum Parsimony
- Finding the minimum tree is NP-hard
- Several attempts
  - Find the “breakpoint phylogeny”: an easier route to a maximum parsimony tree, but still NP-hard
  - Try to find the true maximum parsimony tree with improved algorithms and computing power
- Parsimony methods have advantages over distance methods
- But it is difficult to measure the accuracy of solutions
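To make the distinction between scoring a tree and searching for the best tree concrete, here is a minimal sketch of Fitch's algorithm, which counts the minimum number of state changes for one fixed, rooted binary tree. It is applied here to binary characters (for example gene presence/absence), which is an assumption for illustration; the slides do not say which character type or algorithm the surveyed methods use. The tree representation (nested 2-tuples of leaf names) and function names are hypothetical.

```python
def fitch_character(tree, states):
    """Fitch pass for one character.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping leaf name to that leaf's state
    Returns (candidate state set at this node, changes counted below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_character(left, states)
    right_set, right_changes = fitch_character(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_changes + right_changes
    # No state in common: one change is needed on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, matrix):
    """Sum of Fitch change counts over all character columns in matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_character(tree, {leaf: row[i] for leaf, row in matrix.items()})[1]
        for i in range(n_chars)
    )


if __name__ == "__main__":
    characters = {"A": [1, 0], "B": [1, 0], "C": [0, 1], "D": [0, 1]}
    for candidate in ((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))):
        print(candidate, parsimony_length(candidate, characters))
```

Scoring a single tree like this is cheap; the NP-hardness noted on the slide comes from having to search over the space of possible trees for the one with the smallest score.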

Phylogenetic Analysis – Other Methods
- Maximum likelihood
  - Computationally prohibitive
- Method of invariants
  - Relies on having good estimates for the invariant function, which requires a large dataset
- Bayesian analysis
  - The probability distributions involved can become extremely complicated

Performing Statistical Tests
- Performing statistical tests for phylogenetic features is not straightforward in any situation
- Re-sampling methods should preserve the gene order and should be used with caution, since new errors might be introduced

Phylogenetic Networks
- When dealing with whole genomes, and in particular prokaryotic genomes, we need phylogenetic networks
- Split graphs
- Reticulograms
- Can express uncertainty in a tree or a lack of faith in the tree model of evolution
- Not suitable for representing phenomena such as horizontal transfer or allopolyploid events

Conclusion
- Comparisons of gene content are becoming commonplace, but comparisons of gene order present a wider range of problems
- It is important to focus on the data we already have or will have
- Methods for whole genome phylogenetic analysis need to be robust against missing or inaccurate information