What is of interest to calculate? For open problems: Semple and Steel.


What is of interest to calculate?
For open problems and for summer projects: Semple and Steel (2003), Phylogenetics, Oxford University Press.
Topics: The number of trees; Operations on trees; Metrics on trees; Averages/consensus of trees; Counting other genealogical structures; Trees and supertrees.

Trees – graphical & biological. A graph is a set of vertices (nodes) $\{v_1, \ldots, v_k\}$ and a set of edges $\{e_1 = (v_{i_1}, v_{j_1}), \ldots, e_n = (v_{i_n}, v_{j_n})\}$. Edges can be directed, in which case $(v_i, v_j)$ is viewed as different (opposite direction) from $(v_j, v_i)$, or undirected. Nodes can be labelled or unlabelled. In phylogenies the leaves are labelled and the rest unlabelled.

[Figure: a graph on the nodes $v_1, v_2, v_3, v_4$, showing a directed edge $(v_1 \to v_2)$ and an undirected edge written $(v_2, v_4)$ or $(v_4, v_2)$.]

The degree of a node is the number of edges it is part of. A leaf has degree 1. A graph is connected if any two nodes have a path connecting them. A tree is a connected graph without any cycles, i.e. with exactly one path between any two nodes.
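
The definitions above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function names `degrees` and `is_tree` are hypothetical, and the graph is assumed to be undirected and given as a node list plus an edge list. It uses the fact that a connected graph on k nodes has no cycles exactly when it has k-1 edges.

```python
from collections import deque


def degrees(nodes, edges):
    """Return {node: degree}, the number of edges each node is part of."""
    deg = {v: 0 for v in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_tree(nodes, edges):
    """True if the undirected graph is connected and has no cycles."""
    nodes = list(nodes)
    # A tree on k nodes has exactly k-1 edges.
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    adj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Breadth-first search from an arbitrary node to test connectivity.
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)
```

For example, the star with centre 2 and leaves 1, 3, 4 is a tree, while a triangle on 1, 2, 3 together with an isolated node 4 has the right number of edges but is not connected.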

Trees & phylogenies. A tree with $k$ nodes has $k-1$ edges (easy to show by induction). A root is a special node with degree 2 that is interpreted as the point furthest back in time. The leaves are interpreted as being contemporary. A root introduces a time direction in a tree. A rooted tree is said to be bifurcating if every node other than the leaves and the root has degree 3, corresponding to 1 ancestor and 2 children. An unrooted tree with this property is said to have valency 3. Edges can be labelled with a positive real number interpreted as time duration or amount of evolution. If the length of the path from the root to every leaf is the same, the tree obeys a molecular clock. Tree topology: the discrete structure, i.e. a phylogeny without branch lengths.

[Figure: an unrooted tree and a rooted tree, with leaves, internal nodes and the root marked.]

Enumerating Trees: Unrooted, leaf-labelled & valency 3. Recursion: $T_n = (2n-5)\,T_{n-1}$. Initialisation: $T_1 = T_2 = T_3 = 1$.
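
The recursion can be evaluated directly. A minimal sketch follows; the function name `unrooted_binary_trees` is hypothetical and not from the slides. The factor $2n-5$ is the number of edges of an unrooted valency-3 tree with $n-1$ leaves, each of which is a possible attachment point for leaf $n$.

```python
def unrooted_binary_trees(n):
    """Number of unrooted, leaf-labelled trees of valency 3 with n leaves.

    Implements T_n = (2n-5) * T_{n-1} with T_1 = T_2 = T_3 = 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 1
    for m in range(4, n + 1):
        count *= 2 * m - 5
    return count


if __name__ == "__main__":
    for n in range(3, 11):
        print(n, unrooted_binary_trees(n))
```

The counts grow as the double factorial $(2n-5)!!$, which is why exhaustive search over tree space becomes infeasible beyond a modest number of leaves.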

Number of leaf-labelled phylogenies with arbitrary valencies. Recursion: $R_{n,k} = (n+k-3)\,R_{n-1,k-1} + k\,R_{n-1,k}$. Initialisation: $R_{n,1} = 1$, $R_{n,n-2} = T_n$, where $n$ is the number of leaves and $k$ the number of internal nodes.

[Figure: table of $R_{n,k}$ indexed by $n$ and $k$, bounded by the lines $k=1$ and $k=n-2$.]

Felsenstein, 1979; Artemisa Labi (2007, summer project).
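
A minimal sketch of this recursion, assuming unrooted trees with $n \ge 3$ leaves and treating $R_{n,k}$ as zero outside $1 \le k \le n-2$ (that boundary convention and the helper name `total_trees` are assumptions, not from the slides). The two terms correspond to attaching leaf $n$ to one of the $n+k-3$ edges of a tree with $k-1$ internal nodes, which creates a new internal node, or to one of the $k$ existing internal nodes.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def R(n, k):
    """Leaf-labelled unrooted trees with n leaves and k internal nodes."""
    if n < 3 or k < 1 or k > n - 2:
        return 0          # assumed boundary convention
    if k == 1:
        return 1          # the star tree
    return (n + k - 3) * R(n - 1, k - 1) + k * R(n - 1, k)


def total_trees(n):
    """Sum of R(n, k) over all possible numbers of internal nodes."""
    return sum(R(n, k) for k in range(1, n - 1))


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, [R(n, k) for k in range(1, n - 1)], total_trees(n))
```

The last entry in each row, `R(n, n-2)`, is the fully resolved case and agrees with $(2n-5)\,R(n-1, n-3)$.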

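Number of Coalescent Topologies. The time ranking of internal nodes is recorded. $S_1 = S_2 = 1$. Bifurcating: [formula not preserved]. Multifurcating: [formula not preserved].

[Figure: a coalescent history on five lineages, alternating waiting and coalescing steps, shown as the sequence of partitions {1}{2}{3}{4}{5}, {1}{2}{3}{4,5}, {1}{2}{3,4,5}, {1,2}{3,4,5}, {1,2,3,4,5}; the coalescence events are labelled 4--5, 1--2 and (1,2)--(3,(4,5)).]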
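
The bifurcating formula is not legible in this transcript. The sketch below assumes the standard recursion for ranked bifurcating topologies, $S_n = \binom{n}{2} S_{n-1}$, in which any of the $\binom{n}{2}$ pairs of lineages may coalesce first; this matches $S_1 = S_2 = 1$ but is an assumption about what the slide showed. The function names are hypothetical.

```python
from math import comb, factorial


def ranked_bifurcating(n):
    """Ranked (time-ordered) bifurcating coalescent topologies on n leaves.

    Assumed recursion: S_n = C(n, 2) * S_{n-1}, with S_1 = 1.
    """
    count = 1
    for m in range(2, n + 1):
        count *= comb(m, 2)
    return count


def ranked_bifurcating_closed_form(n):
    """Closed form of the same product: n! (n-1)! / 2^(n-1)."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, ranked_bifurcating(n))
```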

Unlabelled counting: Sketch of method. Rooted trees, ordered subtrees of arbitrary degree.

[Figure: a tree $T$ whose root carries the ordered subtrees $T_1, T_2, \ldots, T_k$.]

Let $g_n$ be the size of a class indexed by $n$, for instance the number of trees with $n$ nodes. The function $G(z) = \sum_n g_n z^n$ is called the generating function and is central in counting trees and much more. For certain recursive structures, the counting problem can be rephrased as functional equations in $G$. If any combinatorial object $a_k$ from $A_n$ can be written as a pair $(b_i, c_j)$ [$b_i$ from $B$ and $c_j$ from $C$], then $G_A = G_B \cdot G_C$.

[Figure: grid of the pairs $b_i c_j$.]

Equivalent to sets of nested parentheses, whose number is given by the Catalan numbers.
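
To make the product rule concrete: multiplying generating functions corresponds to convolving their coefficient lists. The sketch below assumes trees are counted by number of nodes and that a rooted tree with ordered subtrees is a root followed by a sequence of such trees, which gives $G = z/(1-G)$, i.e. $G = z + G^2$. That functional equation and the function names are assumptions for illustration, not taken from the slide.

```python
from math import comb


def convolve(b, c, size):
    """First `size` coefficients of the product G_B(z) * G_C(z)."""
    return [
        sum(b[i] * c[m - i] for i in range(m + 1)
            if i < len(b) and m - i < len(c))
        for m in range(size)
    ]


def plane_trees(max_nodes):
    """g[n] = number of rooted trees with ordered subtrees on n nodes.

    Reads the coefficients off G = z + G^2:
    g_1 = 1 and g_n = sum_{i=1}^{n-1} g_i * g_{n-i} for n >= 2.
    """
    g = [0] * (max_nodes + 1)
    if max_nodes >= 1:
        g[1] = 1
    for n in range(2, max_nodes + 1):
        g[n] = sum(g[i] * g[n - i] for i in range(1, n))
    return g


def catalan(m):
    """m-th Catalan number, C(2m, m) / (m + 1)."""
    return comb(2 * m, m) // (m + 1)


if __name__ == "__main__":
    g = plane_trees(8)
    print(g[1:])
    print([catalan(n - 1) for n in range(1, 9)])
```

Under these assumptions the two printed lists agree, which is the Catalan correspondence mentioned above.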

Sketch of the problems: Multifurcating rooted trees, unordered subtrees.

[Figure: a tree $T$ whose root carries the unordered subtrees $T_1, T_2, \ldots, T_k$.]

Since a tree class can occur with multiplicity, counting must be done accordingly. This generalises the simple situation in the bifurcating case where the left and right subtrees have the same size. How many ways can $n$ be partitioned? (The example above: $3^i 2^j 1^k$ with $3i + 2j + k = n$.) How many integers occur with multiplicity? Within a multiplicity, how many ways can you choose unordered tuples? Functional equation: [formula not preserved]. Asymptotically: [formula not preserved].

Sketch of the problems: De-rooting. If the root is removed, trees that are different when the root is known can become identical.

[Figure: de-rooting a tree with subtrees $T_1, T_2, \ldots, T_k$; an example tree with its set of dissimilar nodes (4), its set of dissimilar edges (3) and its symmetry edge marked.]

For any unlabelled, unrooted tree: (number of node classes) minus (number of edge classes, ignoring the symmetry edge) = 1.

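Counting rooted unordered bifurcating trees. Recursive argument: a tree counted by $T_n$ splits at the root into two subtrees counted by $T_{n-k}$ and $T_k$. Each choice of $Q$ from the trees counted by $T_k$ and $P$ from those counted by $T_{n-k}$ will create a new $T$, except when $k = n-k$. $T_1 = 1$. The recursion has separate cases for $n$ odd and for $n$ even, where the term $k = n/2$ needs special treatment: [formulas not preserved]. Functional equation: [formula not preserved]. Asymptotically: [formula not preserved].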
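
The recursion itself is not legible in this transcript. The sketch below assumes the standard argument with $n$ the number of leaves: for $k < n-k$ every pair of subtrees gives a new tree, and for $k = n/2$ unordered pairs drawn from the same class are counted, giving $T_{n/2}(T_{n/2}+1)/2$. The function name and the choice to count by leaves are assumptions, not taken from the slide.

```python
def rooted_unordered_bifurcating(max_leaves):
    """T[n] = number of rooted, unordered, unlabelled bifurcating trees
    with n leaves, for n = 1..max_leaves (T[0] is unused and set to 0)."""
    T = [0] * (max_leaves + 1)
    if max_leaves >= 1:
        T[1] = 1
    for n in range(2, max_leaves + 1):
        # Subtree sizes k < n - k: every pair of shapes gives a distinct tree.
        total = sum(T[k] * T[n - k] for k in range(1, (n - 1) // 2 + 1))
        if n % 2 == 0:
            # Equal sizes: unordered pairs (with repetition) from one class.
            half = T[n // 2]
            total += half * (half + 1) // 2
        T[n] = total
    return T


if __name__ == "__main__":
    print(rooted_unordered_bifurcating(12)[1:])
```

These values are known as the Wedderburn-Etherington numbers.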

Numbers of tree shapes, from Felsenstein, 2003.

[Table: number of tree shapes by number of leaves, with columns Rooted bifurcating, Rooted multifurcating, Unrooted bifurcating, Unrooted multifurcating; values not preserved.]

Pruefer Code: Number of spanning trees on labelled nodes.

References: Aigner & Ziegler, “Proofs from the Book”, chapter “Cayley’s formula for the number of trees”, Springer; van Lint & Wilson (1992), “A Course in Combinatorics”, chapter 2 “Trees”.

The number of trees on $k$ labelled nodes is $k^{k-2}$. Proof by bijection to $(k-2)$-tuples of $[1, \ldots, k]$ (Pruefer 1918).

From tree to tuple: remove the leaf with the lowest index, $b_i$, and register the node $a_i$ to which it was attached. Repeat until two nodes remain.

From tuple to tree (from van Lint and Wilson, with $n$ nodes): given $a_1, \ldots, a_{n-2}$, set $a_{n-1} = n$. Let $b_i$ be the smallest number not in $\{a_i, a_{i+1}, \ldots, a_{n-1}\} \cup \{b_1, b_2, \ldots, b_{i-1}\}$. Then $\{\{b_i, a_i\} : i = 1, \ldots, n-1\}$ is the edge set of the spanning tree.
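
Both directions of the bijection can be sketched in a few lines. This is a minimal illustration, assuming nodes are labelled 1..n and a tree is given as a list of edges; the function names are hypothetical. The decoder follows the rule "smallest number not among the remaining $a$'s and the $b$'s already used".

```python
from itertools import product


def prufer_encode(n, edges):
    """Tree on nodes 1..n (edge list) -> Pruefer tuple of length n-2."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    code = []
    for _ in range(n - 2):
        leaf = min(v for v in adj if len(adj[v]) == 1)  # lowest-index leaf b_i
        neighbour = next(iter(adj[leaf]))                # its attachment a_i
        code.append(neighbour)
        adj[neighbour].remove(leaf)
        del adj[leaf]
    return code


def prufer_decode(code):
    """Pruefer tuple of length n-2 -> edge list of a tree on nodes 1..n."""
    n = len(code) + 2
    a = list(code) + [n]          # set a_{n-1} = n
    used = set()                  # the b's chosen so far
    edges = []
    for i in range(n - 1):
        forbidden = set(a[i:]) | used
        b = min(v for v in range(1, n + 1) if v not in forbidden)
        used.add(b)
        edges.append((b, a[i]))
    return edges


def count_trees_via_codes(n):
    """Decode every tuple in [1..n]^(n-2) and count distinct edge sets."""
    trees = set()
    for code in product(range(1, n + 1), repeat=n - 2):
        trees.add(frozenset(frozenset(e) for e in prufer_decode(code)))
    return len(trees)


if __name__ == "__main__":
    print(prufer_decode([1]))
    print(count_trees_via_codes(5), 5 ** 3)
```

Enumerating all tuples is only feasible for small n, but it shows that distinct tuples yield distinct trees, as the bijection requires.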

Heuristic Searches in Tree Space: Nearest Neighbour Interchange; Subtree regrafting; Subtree rerooting and regrafting.

[Figure: the three rearrangements illustrated on a tree with subtrees $T_1, T_2, T_3, T_4$ and leaves s1 to s6.]

Counting Pedigrees: 1 extant individual, discrete generations; are ancestors sex-labelled?

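Counting Sex-Labelled Pedigrees (Tong Chen & Rune Lyngsø). $A_k(i,j)$ is the number of pedigrees $k$ generations back with $i$ females and $j$ males.

[Figure: generations 0, 1, ..., $k-1$, $k$, with $(i, j)$ individuals at generation $k-1$ and $(i', j')$ at generation $k$.]

Recursion: [formula not preserved]. $S(n,m)$: Stirling numbers of the second kind, the number of ways to partition $n$ labelled objects into $m$ unlabelled groups.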
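
The pedigree recursion is not legible in this transcript, so only its stated ingredient is sketched here. The Stirling numbers of the second kind satisfy the standard recurrence $S(n,m) = m\,S(n-1,m) + S(n-1,m-1)$: object $n$ either joins one of the $m$ existing groups or forms a new group on its own. The function name `stirling2` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, m):
    """Ways to partition n labelled objects into m unlabelled, non-empty groups."""
    if n == m:
        return 1          # includes S(0, 0) = 1
    if m == 0 or m > n:
        return 0
    return m * stirling2(n - 1, m) + stirling2(n - 1, m - 1)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, [stirling2(n, m) for m in range(1, n + 1)])
```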

This and next 2 lectures: The number of trees; Operations on trees; Metrics on trees; Averages/consensus of trees; Counting other genealogical structures; Trees and supertrees.
October 28th: Principles of Phylogeny Reconstruction
October 29th: Results from Phylogenetic Analysis
November 4th: The Ancestral Recombination Graph and Pedigrees