Some sticky issues Short branches and ortholog/paralog inference


Some sticky issues: short branches and ortholog/paralog inference; short branches and rearrangement; species tree; multiple roots.
Let’s first review how unrooted trees lead to uncertainty.

Very short branches imply non-binary nodes
[Figure: gene tree with leaves FA, HA1, MA1, HA2, CA2, CA1, shown twice]
Sometimes the sequence data do not provide enough information to determine the true binary branching pattern at some node, so the reconstruction program produces a non-binary node there to represent the uncertainty.

Alternate Hypotheses
[Figure: alternative binary resolutions of the gene tree with leaves FA, HA1, MA1, HA2, CA2, CA1]
Here is what I mean by the true binary branching pattern. With this tree, the sequence data do not provide enough information on exactly how the chicken, mouse, and human A1 genes are related. The non-binary node that joins these three actually represents three different binary trees. (Click.) In the first case, the chicken and mouse A1 genes are more closely related to each other than either is to the human A1 gene. Can anyone guess another possibility? (Yes/no. Click.) It is also possible that the chicken and human A1 genes are more closely related, or that the human and mouse A1 genes are more closely related. Thus, a non-binary node indicates uncertainty about which of the possible series of events occurred. This example showed a trifurcation (a non-binary node with 3 children) and had only 3 distinct possibilities. But this node could have had 4 or even 5 children. How many different hypotheses would that non-binary node then represent?
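One way to check the closing question is to enumerate the binary resolutions directly. The sketch below is not from the talk: the function names are made up for illustration, and it assumes each child of the non-binary node can be treated as a single labeled unit. It builds every rooted binary tree over the children by inserting one child at a time, and compares the count with the standard closed form (2n-3)!!.

```python
def insert_everywhere(tree, leaf):
    """Yield every tree obtained by attaching `leaf` on one branch of `tree`."""
    yield (tree, leaf)                      # attach on the branch above this node
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_everywhere(left, leaf):
            yield (new_left, right)
        for new_right in insert_everywhere(right, leaf):
            yield (left, new_right)


def binary_resolutions(children):
    """All rooted binary trees (nested tuples) over the given child labels."""
    trees = [children[0]]
    for leaf in children[1:]:
        trees = [t for tree in trees for t in insert_everywhere(tree, leaf)]
    return trees


def count_binary_resolutions(n):
    """Closed form (2n-3)!! = 3 * 5 * ... * (2n-3) for n >= 2 children."""
    count = 1
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count


for tree in binary_resolutions(["CA1", "MA1", "HA1"]):
    print(tree)                             # the three hypotheses from the slide

for n in (3, 4, 5):
    print(n, "children:", count_binary_resolutions(n), "binary resolutions")
```

Under these assumptions a node with 3 children has 3 resolutions, one with 4 children has 15, and one with 5 children has 105, so the uncertainty grows quickly with the degree of the node.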

Each alternate hypothesis yields a different set of orthologs
[Figure: three resolved versions of the gene tree with leaves FA, HA1, MA1, HA2, CA2, CA1; some nodes are labeled D]
Orthologs: (MA1,CA1). Paralogs: (HA1,CA1), (HA1,MA1).
Orthologs: (HA1,CA1). Paralogs: (MA1,CA1), (HA1,MA1).
All three are orthologs.
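The ortholog and paralog sets listed on this slide can be reproduced with a small reconciliation sketch. It is an illustration only, not the tool used in the talk. It assumes a species tree in which chicken is the outgroup to mouse and human (the slide does not state one), takes the first letter of each gene name as its species, and calls a gene-tree node a duplication when it maps to the same species-tree clade as one of its children.

```python
# Assumed species tree: chicken (C) is the outgroup to mouse (M) and human (H).
SPECIES_TREE = ("C", ("M", "H"))


def leaves(tree):
    """Set of leaf labels below a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def lca_clade(species_tree, species):
    """Leaf set of the smallest species-tree clade containing `species`."""
    if not isinstance(species_tree, str):
        for child in species_tree:
            if species <= leaves(child):
                return lca_clade(child, species)
    return leaves(species_tree)


def species_of(gene_tree):
    """Species present below a gene-tree node (first letter of each gene)."""
    return frozenset(gene[0] for gene in leaves(gene_tree))


def classify_pairs(gene_tree, result=None):
    """Map each gene pair to 'orthologs' or 'paralogs'."""
    if result is None:
        result = {}
    if isinstance(gene_tree, str):
        return result
    left, right = gene_tree
    here = lca_clade(SPECIES_TREE, species_of(gene_tree))
    below = [lca_clade(SPECIES_TREE, species_of(c)) for c in (left, right)]
    # Duplication: the node maps to the same clade as one of its children.
    kind = "paralogs" if here in below else "orthologs"
    for a in leaves(left):
        for b in leaves(right):
            result[frozenset((a, b))] = kind
    classify_pairs(left, result)
    classify_pairs(right, result)
    return result


hypotheses = {
    "chicken+mouse": (("CA1", "MA1"), "HA1"),
    "chicken+human": (("CA1", "HA1"), "MA1"),
    "mouse+human": ("CA1", ("MA1", "HA1")),
}
for name, tree in hypotheses.items():
    pairs = classify_pairs(tree)
    print(name, {tuple(sorted(p)): kind for p, kind in pairs.items()})
```

With the assumed species tree, only the resolution that groups mouse and human makes all three A1 genes orthologs; the other two require a duplication at the base of the A1 group.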


Rearrangement will prefer hypothesis 1, but is that a better choice?
[Figure: Hypothesis 1 and Hypothesis 2, two trees with leaves FA, HA1, HA2, CA2, CA1; some nodes are labeled D]
In this case, the rooted gene tree would be this. Notice that this rooting dramatically changes the interpretation of the data; however, the rooted tree still has the same relationships between the genes as the unrooted tree. For example, the A1 subfamily still groups together in both trees. So how many different rooted trees are there? (Get answer from audience.)
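For the question of how many rooted trees there are, a root can be placed on any branch of an unrooted tree, so an unrooted binary tree with n leaves has 2n-3 possible rootings. The sketch below enumerates them for a hypothetical five-gene topology; the internal node names x, y, z and the exact branching are assumptions made for illustration, not taken from the slide.

```python
# Hypothetical unrooted tree: x joins the A1 genes, y joins the A2 genes,
# and z connects FA to both groups.
EDGES = [
    ("x", "HA1"), ("x", "CA1"),
    ("y", "HA2"), ("y", "CA2"),
    ("z", "FA"), ("z", "x"), ("z", "y"),
]


def build_adjacency(edges):
    """Undirected adjacency lists, in the order the edges are listed."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def subtree(adjacency, node, parent):
    """Rooted subtree (nested tuples) hanging below `node`, away from `parent`."""
    children = [n for n in adjacency[node] if n != parent]
    if not children:
        return node                          # a leaf: just its gene name
    return tuple(subtree(adjacency, child, node) for child in children)


def all_rootings(edges):
    """One rooted tree per branch, with the root placed on that branch."""
    adjacency = build_adjacency(edges)
    return [
        (subtree(adjacency, u, v), subtree(adjacency, v, u))
        for u, v in edges
    ]


rooted_trees = all_rootings(EDGES)
print(len(rooted_trees))                     # 7 branches, so 7 rootings
for tree in rooted_trees:
    print(tree)
```

Every rooting keeps HA1 and CA1 separated from HA2 and CA2 by the same branch as in the unrooted tree; what changes is the implied order of events.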


Baldauf, Science 2003

General agreement on unrooted Eukaryotic tree
[Figure: unrooted eukaryotic tree with Dictyostelium, Animals, Fungi, Plasmodium, Plants]
Here again is the gene tree created from the multiple alignment of those 6 genes: one each from fish and mouse, and two each from chicken and human. If we knew what really happened, we’d know where the root is and the real order of events. There are many possible locations for the root in this gene tree. One hypothesis is that the root is located somewhere along this edge, leading to the fish A gene. If so, this would be the rooted gene tree. However, another hypothesis may be that the root is really located along this edge, leading to the chicken A2 gene.

Cavalier-Smith’s root

Arisue’s root


Beta Crystallin family
Notung: All bold branches have the same score
[Figure: gene tree with leaf labels CrybB2, CrybA2, Crybb2, CrybA4, CrybA3, CrybB1, CrybA1]
Ruvinsky and Silver, Genomics, 1997

Beta Crystallin family
Ruvinsky and Silver: Root provided by biochemical information
[Figure: rooted tree split into acidic and basic groups, with leaves A2, A3, A1, B3, B1, A4, B2]
This example suggests that we can devise a mathematical cost function for scoring possible rootings based on this notion of duplication and loss.
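A rough sketch of such a cost function follows. It is deliberately simplified and is not Notung's scoring: it counts duplications only and ignores losses. It uses a hypothetical six-gene family (FA, CA1, MA1, HA1, CA2, HA2) rather than the beta crystallin data, and it assumes both the unrooted topology and a species tree in which fish is the outgroup to chicken, mouse, and human. The first letter of each gene name is taken as its species.

```python
# Assumed species tree: fish, then chicken, then mouse + human.
SPECIES_TREE = ("F", ("C", ("M", "H")))

# Hypothetical unrooted gene tree; n1..n4 are internal nodes.
EDGES = [
    ("n1", "FA"), ("n1", "n2"), ("n1", "n3"),
    ("n2", "CA1"), ("n2", "n4"),
    ("n4", "MA1"), ("n4", "HA1"),
    ("n3", "HA2"), ("n3", "CA2"),
]


def leaves(tree):
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


def lca_clade(species_tree, species):
    """Leaf set of the smallest species-tree clade containing `species`."""
    if not isinstance(species_tree, str):
        for child in species_tree:
            if species <= leaves(child):
                return lca_clade(child, species)
    return leaves(species_tree)


def duplications(gene_tree):
    """Return (species below this node, number of duplication nodes)."""
    if isinstance(gene_tree, str):
        return frozenset([gene_tree[0]]), 0
    (ls, ld), (rs, rd) = duplications(gene_tree[0]), duplications(gene_tree[1])
    here = lca_clade(SPECIES_TREE, ls | rs)
    # A node is a duplication if it maps to the same clade as a child.
    is_dup = here in (lca_clade(SPECIES_TREE, ls), lca_clade(SPECIES_TREE, rs))
    return ls | rs, ld + rd + int(is_dup)


def subtree(adjacency, node, parent):
    children = [n for n in adjacency[node] if n != parent]
    if not children:
        return node
    return tuple(subtree(adjacency, child, node) for child in children)


def score_rootings(edges):
    """Duplication count for the rooting on each branch."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    scores = {}
    for u, v in edges:
        rooted = (subtree(adjacency, u, v), subtree(adjacency, v, u))
        scores[(u, v)] = duplications(rooted)[1]
    return scores


scores = score_rootings(EDGES)
best = min(scores.values())
for edge, cost in scores.items():
    print(edge, cost, "<- best" if cost == best else "")
```

On this toy family, several branches tie for the lowest duplication count, which is the same kind of ambiguity as in the slide where all bold branches have the same score. Adding a loss term, or outside information such as the biochemical evidence mentioned above, is one way to break such ties.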