Published by Maurice Fox. Modified over 8 years ago.
1
Recombination and Pedigrees
Genealogies and Recombination: The ARG
Recombination Parsimony
The ARG and Data
Pedigrees: Models and Data
Pedigrees & ARGs
Challenges
Empirical Investigations: Open Questions
2
Recombination Histories & Global Pedigrees
Acknowledgements: Yun Song, Rune Lyngsø, Mike Steel
Finding Minimal Recombination Histories
Global Pedigrees: Finding Common Ancestors
[Figure: example recombination history on sequences 1 to 4, and a global pedigree traced back from NOW]
3
Hudson & Kaplan's R_M
If you equate R_M with the expected number of recombinations, it could be used as an estimator. Unfortunately, R_M is a gross underestimate of the real number of recombinations.
[Figure: binary (0/1) sequence data matrix]
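As a rough illustration of how R_M is obtained, the sketch below applies the four-gamete test to every pair of columns and then counts the largest set of non-overlapping incompatible intervals. The function names and the toy haplotypes are hypothetical and not taken from the slides; the input is assumed to be 0/1 strings of equal length.

```python
from itertools import combinations

def four_gametes_present(haplotypes, i, j):
    """True if columns i and j show all four combinations 00, 01, 10, 11."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4  # assumes strictly binary (0/1) data

def hudson_kaplan_rm(haplotypes):
    """Sketch of R_M: maximum number of disjoint incompatible intervals."""
    n_sites = len(haplotypes[0])
    intervals = [(i, j) for i, j in combinations(range(n_sites), 2)
                 if four_gametes_present(haplotypes, i, j)]
    # Greedy interval scheduling: sort by right end point, keep an interval
    # whenever it starts at or after the end of the last one kept.
    intervals.sort(key=lambda iv: iv[1])
    count, last_end = 0, -1
    for start, end in intervals:
        if start >= last_end:
            count += 1
            last_end = end
    return count

# Hypothetical toy data: sites (0,1) and (1,2) are incompatible, (0,2) is not.
toy = ["000", "010", "100", "110", "001", "011"]
print(hudson_kaplan_rm(toy))
```

Two intervals that share only an end point are treated as disjoint here, because the recombination is placed between sites rather than at a site.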
4
Local Inference of Recombinations
Recoding: nucleotide columns (T...G, T...C, A...G, A...C) are recoded as 0/1 columns, with 0 the ancestral state and 1 the derived state. At most 1 mutation per column.
Incompatibility: two columns are incompatible when all four combinations 00, 10, 01, 11 occur, for example the columns 0000111 and 0001101.
Myers-Griffiths (2003): the number of recombinations in a sample, N_R, the number of types, N_T, and the number of mutations, N_M, obey $N_R \geq N_T - N_M - 1$.
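A minimal sketch of the type-counting bound, assuming it is the haplotype bound N_R >= N_T - N_M - 1 from Myers & Griffiths: with N_M mutations and no recombination there can be at most N_M + 1 distinct types, and each recombination adds at most one more. The function name and the toy data are hypothetical.

```python
def haplotype_bound(haplotypes):
    """Lower bound on recombinations: N_T - N_M - 1, floored at zero."""
    n_types = len(set(haplotypes))                     # N_T: distinct sequences
    n_sites = len(haplotypes[0])
    n_mutations = sum(                                 # N_M: segregating columns
        1 for c in range(n_sites)
        if len({h[c] for h in haplotypes}) > 1
    )
    return max(0, n_types - n_mutations - 1)

# Hypothetical toy sample: 4 types, 2 segregating sites.
print(haplotype_bound(["00", "01", "10", "11"]))
```

The count of segregating columns stands in for the number of mutations, which is only valid under the slide's assumption of at most one mutation per column.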
5
Finding Minimal Recombination Histories
1964 Bodmer & Edwards: parsimony defined as reconstruction principle
1985 Hudson & Kaplan: use minimal recombination histories as observed recombinations
Attempts to find minimal histories of sequences
Definition of recombination as Subtree Prune Regraft operations
1. Hein, J.J. (1990) Reconstructing the history of sequences subject to Gene Conversion and Recombination. Mathematical Biosciences, 98:185-200.
2. Hein, J.J. (1993) A Heuristic Method to Reconstruct the History of Sequences Subject to Recombination. J. Mol. Evol., 20:402-411.
3. Hein, J.J., Jiang, T., Wang, L. & Zhang, K. (1996) On the complexity of comparing evolutionary trees. Discrete Applied Mathematics, 71:153-169.
4. Song, Y.S. (2003) On the combinatorics of rooted binary phylogenetic trees. Annals of Combinatorics, 7:365-379.
5. Song, Y.S. & Hein, J. (2005) Constructing Minimal Ancestral Recombination Graphs. J. Comp. Biol., 12:147-169.
6. Song, Y.S. & Hein, J. (2004) On the minimum number of recombination events in the evolutionary history of DNA sequences. J. Math. Biol., 48:160-186.
7. Song, Y.S. & Hein, J. (2003) Parsimonious reconstruction of sequence evolution and haplotype blocks: finding the minimum number of recombination events. Lecture Notes in Bioinformatics, Proceedings of WABI'03, 2812:287-302.
8. Lyngsø, R., Song, Y.S. & Hein, J. (2005) Minimal Recombination Histories by Branch and Bound. WABI.
6
Minimal Number of Recombinations
Last Local Tree Algorithm
[Figure: data columns 1, 2, ..., i-1, i, ..., L against local trees 1, 2, ..., n]
How many local trees? Unrooted; Coalescent
How many neighbors?
Bi-partitions
The Kreitman data (1983): 11 sequences, 3200 bp, 43 (28) recoded, 9 different
7
Two Adjacent Columns
1. RecDist[T_1, T_2] is hard for large leaf number, but can be automatically calculated by adding Diam_Rec trivial columns and only considering 1-recombination neighbors.
2. Infinite Site Assumption: local trees must contain the local bipartition.
[Figure: two adjacent columns, 110011110 (column 1) and 000011111 (column 2), with Diam_Rec and RecDist marked]
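The requirement that a local tree contain the bipartition of its column can be checked mechanically. The sketch below is only an illustration under assumptions not in the slides: trees are written as nested tuples of leaf labels, columns as 0/1 strings, and the tree is treated as unrooted, so either side of the split may appear as a subtree.

```python
def clades(tree):
    """Return (leaf set of tree, list of leaf sets of every subtree)."""
    if not isinstance(tree, tuple):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found

def tree_contains_column(tree, column, taxa):
    """True if the 0/1 column's bipartition is a split of the tree."""
    derived = frozenset(t for t, state in zip(taxa, column) if state == "1")
    all_leaves, found = clades(tree)
    ancestral = all_leaves - derived
    # Unrooted view: it suffices that one side of the split is a subtree.
    return derived in found or ancestral in found

# Hypothetical example tree on five sequences.
tree = ((1, 2), (3, (4, 5)))
taxa = [1, 2, 3, 4, 5]
print(tree_contains_column(tree, "11000", taxa))  # split {1,2} | {3,4,5}
print(tree_contains_column(tree, "10100", taxa))  # split {1,3} | {2,4,5}
```

A column in which every sequence carries the same state induces a trivial bipartition and is accepted by any tree.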
8
Metrics on Trees based on subtree transfers
Tree spaces, from most to least detailed:
Trees including branch lengths
Tree topologies with age ordered internal nodes
Rooted tree topologies
Unrooted tree topologies
Pretending the easy problem (unrooted) is the real problem (age ordered) causes violation of the triangle inequality.
9
Tree Combinatorics and Neighborhoods (due to Yun Song)
Allen & Steel (2001); Song (2003+)
Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees.
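To make the growth comparison concrete, the sketch below tabulates tree counts against unit-neighbourhood sizes. The closed forms are quoted from memory and should be treated as assumptions to check against the cited papers: (2n-3)!! rooted and (2n-5)!! unrooted binary topologies on n leaves, a rooted subtree-prune-regraft neighbourhood of 3n^2 - 13n + 14 (Song, 2003), and an unrooted one of 2(n-3)(2n-7) (Allen & Steel, 2001). All function names are hypothetical.

```python
def double_factorial(m):
    """m!! = m * (m-2) * (m-4) * ... down to 1 or 2; equals 1 for m <= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result

def rooted_tree_count(n):
    return double_factorial(2 * n - 3)

def unrooted_tree_count(n):
    return double_factorial(2 * n - 5)

def rooted_spr_neighbourhood(n):
    # Assumed formula, attributed here to Song (2003).
    return 3 * n * n - 13 * n + 14

def unrooted_spr_neighbourhood(n):
    # Assumed formula, attributed here to Allen & Steel (2001).
    return 2 * (n - 3) * (2 * n - 7)

for n in (4, 6, 9, 11):
    print(n, rooted_tree_count(n), rooted_spr_neighbourhood(n),
          unrooted_tree_count(n), unrooted_spr_neighbourhood(n))
```

The neighbourhood sizes are quadratic in n while the number of topologies grows super-exponentially, which is the contrast the slide points to.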
10
[Figure only: labels 1 to 7]
11
The Minimal Recombination History for the Kreitman Data
Method: # of rec events obtained
Hudson & Kaplan (1985): 5
Myers & Griffiths (2003): 6
Song & Hein (2004), set theory based approach: 7
Song & Hein (2003), tree scanning using DP; Lyngsø, Song & Hein (2006), massive acceleration using Branch and Bound Algorithm: 7
Lyngsø, Song & Hein (2006), minimal number of gene conversions (in prep.): 5-2/6-1
12
The Griffiths-Ethier-Tavaré Recursions
No recombination: infinite site assumption; ancestral state known.
History graph: recursions exist; no cycles.
Possible histories without recombination for simple data example.
[Figure: history graph with levels 0 to 8, labelled 1, 1, 4, 2, 5, 3, 1, 5, 5]
Without recombination: 27 ACs. With recombination: 3*10^8 ACs.
13
Ancestral configurations to 2 sequences with 2 segregating sites
Mid-point heuristic
[Figure: 1st and 2nd sequences]
14
Counting Recursion
Summary statistic lumping configurations: k_1, k_2.
$(k_2 + 1) \cdot k_1 + 1$ possible ancestral columns.
[Figure: ancestral columns for k sequences, padded with "-"]
15
Counting + Branch and Bound Algorithm
[Figure: exact length between lower bound and upper bound]
k: k-recombination neighborhood
0: 3
1: 91
2: 1314
3: 8618
4: 30436
5: 62794
6: 78970
7: 63049
8: 32451
9: 10467
10: 1727
Total: 289920
16
Time versus Spatial: Coalescent-Recombination
Temporal process (Griffiths, 1981; Hudson, 1983) versus spatial process (Wiuf & Hein, 1999)
i. The process is non-Markovian
ii. The trees cannot be reduced to topologies
17
Time versus Spatial 2: Pedigrees
Elston-Stewart (1971), temporal peeling algorithm: condition on parental states. Recombination and mutation are Markovian.
Lander-Green (1987), genotype scanning algorithm: condition on paternal/maternal inheritance. Recombination and mutation are Markovian.
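The Markov property along the genome can be illustrated with a deliberately reduced case: a single meiosis, where the hidden state at each locus is whether the paternal or maternal copy was transmitted. This is a sketch in the spirit of the genotype scanning idea, not the Lander-Green algorithm itself; the function name, the emission inputs and the uniform prior of 0.5 per state are assumptions made for the example.

```python
def meiosis_likelihood(emissions, thetas):
    """Forward algorithm over loci for one meiosis.

    emissions[t][s]: P(data at locus t | inheritance state s), s in {0, 1}
    thetas[t]: recombination fraction between locus t and locus t + 1
    """
    # Either parental copy is equally likely at the first locus.
    forward = [0.5 * emissions[0][0], 0.5 * emissions[0][1]]
    for t in range(1, len(emissions)):
        switch = thetas[t - 1]       # a recombination flips the state
        stay = 1.0 - switch
        new0 = (forward[0] * stay + forward[1] * switch) * emissions[t][0]
        new1 = (forward[0] * switch + forward[1] * stay) * emissions[t][1]
        forward = [new0, new1]
    return forward[0] + forward[1]

# Hypothetical data: locus 1 is only consistent with state 0, locus 2 only
# with state 1, so exactly one recombination (fraction 0.1) is required.
print(meiosis_likelihood([[1.0, 0.0], [0.0, 1.0]], [0.1]))
```

The cost is linear in the number of loci because each step only needs the forward values from the previous locus.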
18
Time versus Spatial 3: Phylogenetic Alignment
Optimisation algorithms:
indels of length 1 (David Sankoff, 1973): spatial
indels of length k (Bjarne Knudsen, 2003): temporal
Statistical alignment: spatial; temporal
19
minARGs: Recombination Events & Local Trees
[Figure: true ARG versus reconstructed (minimal) ARG on sequences 1 to 5, with local trees ((1,2),(1,2,3)) and ((1,3),(1,2,3))]
[Figure: recombination events along 0 to 4 Mb found by Hudson-Kaplan, Myers-Griffiths and Song-Hein; panels n=7, =10, =75; n=8, =40; n=8, =15]
Mutation information on only one side versus mutation information on both sides
20
Likelihood Calculations on the -ARG
Example: 010, 101, 110
22
Reconstructing global pedigrees: Superpedigrees (Steel and Hein, 2005)
The gender-labeled pedigrees for all pairs define the global pedigree.
Gender-unlabeled pedigrees don't!
23
Reconstructing global pedigrees: Links and lassos (Steel and Hein, 2005)
Gender-labeled links and lassos determine the global pedigree.
24
Benevolent Mutation and Recombination Process
Genomes with recombination rate and mutation rate --> infinity.
All embedded phylogenies are observable. Do they determine the pedigree?
Counter example: [Figure: pedigrees and their embedded phylogenies]
25
Summary and Future
Summary: Minimal Recombination Histories; Likelihood Calculations; Global Pedigrees & Inferring Pedigrees from Genomes
Future, recombination: remove infinite site assumption; investigate MCMC algorithms
Future, pedigrees: data analysis; algorithms
Presentation can be found at: http://mathgen.stats.ox.ac.uk/bioinformatics/
26
To Do
Hudson slide
Neighbor trees
Literature and history