Genetics, Evolution, and the Coalescent Theory
Administrative Details
Course Overview
Coalescent Theory?
7/15/2015, Comp 790 – Introduction & Coalescence

Course Overview
Synopsis
 – Graduate-level project course
 – Guided reading, discussions, and project with write-up
Website
 – To appear at:
Course Grading
 – Class Participation: 10%
 – 2 In-class Presentations: 40%
 – Final Project, Presentation, & Write-up: 50%

Syllabus
⅓ Guided reading/discussion of text
⅓ Student presentations of recent papers
⅓ Project Proposals

Coalescent Theory
Ancestral properties can be inferred from extant populations
Alternatives to Correctness
 – Most-Likely
 – Most Parsimonious
 – Other optimality criteria
Background
 – Biology (genetics)
 – Statistics
 – Computational Modeling

Non-Classical Genetics
Coalescence differs from classical genetics
 – Analysis rather than synthesis
 – Depends on models, which attempt to explain observations
 – Less emphasis on Darwin’s natural selection
Considers population dynamics
 – Isolation
 – Bottlenecks

Historical Human Migrations

Population Dynamics
It is helpful to view evolutionary trees in the contexts of geography and population structure
These factors affect the prevalence and distribution of genes
Genetic diversity largely depends on population isolation and population bottlenecks, as well as
 – Constant population size (resource limited)
 – Sudden increases in population (explosions)
 – Patterns of growth (exponential, uniform, etc.)

It’s About Genes
Genetics is most clearly understood by considering its subject to be genes rather than organisms
Organisms are merely vessels for assuring the survival of genes
Successful genes live on long after their host organism
An objective of a gene is to replicate itself
“[Genes] that survived were the ones that built survival machines for themselves to live in. But making a living got steadily harder as new rivals arose with better and more effective survival machines. Survival machines got bigger and more elaborate, and the process was cumulative and progressive…” -- Dawkins, The Selfish Gene

Why Computer Science
Classically, genetics, both generative (classical) and coalescent (population), has focused on mathematical/statistical models
As model complexity increases, it becomes harder to find closed-form solutions
The field relies more and more on computational modeling to ascertain structure
Also, complicated models often lead to common models… today’s subject

Wright-Fisher Model
One of the first and simplest models of population genealogies was introduced by Wright (1931) and Fisher (1930)
The model emphasizes transmission of genes from one generation to the next
For simplicity we’ll first focus on a population of fixed size in which each individual starts with a distinct gene variant

Simple Haploid Model
Rules
 – Antecedent genes are chosen randomly, with replacement, from their parental generation
 – No selection
 – Fixed population size
G0: ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
G1: ['J', 'A', 'H', 'B', 'I', 'E', 'D', 'G', 'A', 'B']
G2: ['A', 'J', 'E', 'G', 'D', 'E', 'B', 'I', 'A', 'A']
G3: ['A', 'A', 'E', 'J', 'I', 'A', 'I', 'A', 'J', 'B']
G4: ['E', 'A', 'B', 'B', 'A', 'E', 'A', 'A', 'A', 'A']
G5: ['A', 'A', 'B', 'A', 'A', 'E', 'A', 'A', 'A', 'B']
G6: ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'A', 'A']
G7: ['B', 'A', 'A', 'B', 'A', 'A', 'A', 'A', 'A', 'A']
What will this population eventually look like?
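The rules above can be sketched in a few lines of Python. This is a minimal illustration, not necessarily the code that produced the listing on the slide; the function names `next_generation` and `simulate_haploid` and the seed value are assumptions made for the example.

```python
import random

def next_generation(parents, rng):
    """Each child copies the gene of one parent drawn uniformly at random,
    with replacement, so a parent may leave several copies or none."""
    return [rng.choice(parents) for _ in parents]

def simulate_haploid(population, generations, rng):
    """Return the list of generations G0..G<generations> (fixed population size, no selection)."""
    history = [list(population)]
    for _ in range(generations):
        history.append(next_generation(history[-1], rng))
    return history

rng = random.Random(1)  # arbitrary seed (assumption) so a run can be repeated
history = simulate_haploid("ABCDEFGHIJ", 7, rng)
for g, pop in enumerate(history):
    print(f"G{g}: {pop}")
```

Because sampling is with replacement, every generation can only keep or lose variants, never gain one, which is what drives the loss of diversity seen in G0 through G7.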

Assumptions of Wright/Fisher
Discrete and non-overlapping generations
Haploid individuals
Population size is constant
All individuals are equally fit
No population or social structure
Genes segregate independently

Some Graphical Abstractions
Replace letters with colors
Draw lineages
Sort topologically

Repeats
Every population results in just one gene

Onset of Uniformity
trials
Mode = 11 (616)
Mean = 17.5
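The onset of uniformity can be estimated by repeating the haploid simulation many times and recording the generation at which only one variant remains. The sketch below assumes a population of 10 and 10,000 trials; the trial count behind the slide's figures is not given, so that number, the seed, and the function names are illustrative choices only.

```python
import random
from collections import Counter

def generations_to_fixation(n, rng):
    """Run one haploid Wright-Fisher population of size n, starting with n
    distinct variants, and count generations until a single variant remains."""
    pop = list(range(n))
    t = 0
    while len(set(pop)) > 1:
        pop = [rng.choice(pop) for _ in range(n)]
        t += 1
    return t

def fixation_stats(n, trials, seed=0):
    """Return (mode, count of trials at the mode, mean) of the fixation time."""
    rng = random.Random(seed)
    times = [generations_to_fixation(n, rng) for _ in range(trials)]
    mode, count = Counter(times).most_common(1)[0]
    return mode, count, sum(times) / trials

# trials=10000 is an assumption; the slide does not state its trial count
mode, count, mean = fixation_stats(n=10, trials=10000)
print(f"Mode = {mode} ({count})  Mean = {mean:.1f}")
```

The distribution is right-skewed, so the mean should sit well above the mode, as it does in the slide's figures.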

Diploid Model
Our model is obviously too simple; let’s add more realism
Organisms are diploid (have 2, perhaps different, copies of each gene)
Sexual reproduction
Half female, half male
['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['CF', 'AG', 'BI', 'CH', 'CH', 'EG', 'EG', 'EH', 'EI', 'DG']
['CE', 'IE', 'CG', 'HE', 'HE', 'FG', 'IG', 'BE', 'BG', 'HE']
['CB', 'EI', 'HE', 'HI', 'HE', 'CG', 'EF', 'EI', 'HF', 'GI']
['EE', 'HF', 'BC', 'IG', 'BH', 'HI', 'BI', 'EF', 'HG', 'HC']
['CH', 'BH', 'BH', 'EI', 'BH', 'BE', 'BI', 'HC', 'EH', 'IH']
['BE', 'BE', 'IB', 'BE', 'BH', 'BH', 'BH', 'HH', 'CB', 'EH']
['BH', 'EC', 'BB', 'BB', 'IC', 'BH', 'EH', 'EH', 'BE', 'EB']
['CE', 'BE', 'BE', 'BE', 'CH', 'BE', 'CB', 'HE', 'IB', 'CB']
['EH', 'BB', 'BB', 'BH', 'EC', 'EE', 'EC', 'CE', 'CI', 'CE']
['CI', 'BE', 'EC', 'BE', 'BC', 'BC', 'BE', 'BE', 'BE', 'BC']
['EB', 'EB', 'BB', 'CB', 'EB', 'CC', 'IB', 'BC', 'IE', 'CB']
['BB', 'EC', 'BE', 'BC', 'CC', 'CC', 'EI', 'BI', 'BC', 'BC']
['EB', 'CB', 'CC', 'BC', 'BB', 'CC', 'BI', 'CC', 'BC', 'CC']
['BC', 'BC', 'CC', 'BC', 'BC', 'EC', 'BB', 'BC', 'BC', 'BB']
['BB', 'BC', 'CB', 'CC', 'CB', 'CC', 'CB', 'CB', 'CE', 'CB']
['CC', 'CB', 'CC', 'CB', 'BB', 'CC', 'CB', 'CC', 'CC', 'BC']
['CC', 'CC', 'CC', 'BB', 'CC', 'BC', 'CC', 'CC', 'CC', 'CB']
['CC', 'BC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CB', 'CC', 'CC']
['CC', 'BC', 'BC', 'CC', 'CC', 'CC', 'CC', 'CC', 'BC', 'CC']
['CC', 'CC', 'CC', 'BC', 'CC', 'CC', 'CC', 'CC', 'BC', 'CB']
['CC', 'BC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC']
['CC', 'CC', 'CC', 'CC', 'CC', 'BC', 'CC', 'CC', 'CC', 'CC']
['CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC', 'CC']
Column labels: Males, Females
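A diploid, two-sex version can be sketched as follows. The slide does not spell out the mating rule, so this example assumes that the first half of each generation is male and the second half female, and that each child receives one randomly chosen gene copy from a random father and one from a random mother. That reading is consistent with the first generations of the listing but remains an assumption, as do the function names and seed.

```python
import random

def next_generation(pop, rng):
    """pop is a list of two-letter genotypes; the first half are treated as
    males and the second half as females (assumption). Each child gets one
    random gene copy from a random father and one from a random mother."""
    half = len(pop) // 2
    males, females = pop[:half], pop[half:]
    children = []
    for _ in pop:
        father = rng.choice(males)
        mother = rng.choice(females)
        children.append(rng.choice(father) + rng.choice(mother))
    return children

def simulate_diploid(pop, rng, max_generations=10_000):
    """Iterate until every gene copy in the population is the same variant
    (or until the safety cap max_generations is reached)."""
    history = [list(pop)]
    while len(set("".join(history[-1]))) > 1 and len(history) <= max_generations:
        history.append(next_generation(history[-1], rng))
    return history

start = [2 * c for c in "ABCDEFGHIJ"]  # ['AA', 'BB', ..., 'JJ']
history = simulate_diploid(start, random.Random(2))  # arbitrary seed (assumption)
for pop in history:
    print(pop)
print("generations to uniformity:", len(history) - 1)
```

Ten diploid individuals carry twenty gene copies, so this population takes longer to become uniform than the ten-gene haploid one.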

Same Result
['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['AG', 'BF', 'AI', 'EJ', 'DH', 'AH', 'CH', 'AI', 'DI', 'AH']
['BI', 'AI', 'FA', 'AC', 'GH', 'AH', 'BI', 'DH', 'FI', 'AH']
['HH', 'AI', 'BH', 'II', 'AI', 'AI', 'GI', 'AD', 'IA', 'AA']
['HG', 'IG', 'ID', 'IA', 'ID', 'HG', 'HI', 'AA', 'AI', 'IA']
['IA', 'DG', 'DH', 'HA', 'GA', 'DI', 'GH', 'GG', 'II', 'IH']
['HI', 'GI', 'GH', 'DH', 'GG', 'HH', 'II', 'HI', 'ID', 'GH']
['HH', 'GI', 'HI', 'HI', 'HH', 'IH', 'HH', 'HI', 'GG', 'HH']
['GH', 'HH', 'HH', 'HH', 'IH', 'II', 'HH', 'HI', 'HG', 'HH']
['HI', 'HG', 'HH', 'HH', 'IH', 'HH', 'GH', 'HH', 'HI', 'HH']
['HH', 'HH', 'HH', 'HI', 'HG', 'HH', 'HH', 'HH', 'HH', 'IH']
['HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'IH', 'GH', 'HH', 'GH']
['HI', 'HH', 'HG', 'HG', 'HH', 'HI', 'HG', 'HG', 'HH', 'HH']
['GG', 'GH', 'HI', 'HH', 'IG', 'GH', 'IG', 'HG', 'HH', 'HH']
['IH', 'HH', 'HG', 'HI', 'GH', 'HH', 'GI', 'IG', 'HH', 'GH']
['IH', 'HG', 'HG', 'IH', 'HH', 'HG', 'II', 'HH', 'HG', 'IH']
['II', 'HI', 'HI', 'HH', 'HH', 'HH', 'IH', 'HH', 'GI', 'HH']
['IH', 'II', 'HH', 'HH', 'II', 'HH', 'HH', 'IH', 'HH', 'HH']
['HH', 'HH', 'HH', 'II', 'IH', 'IH', 'IH', 'HH', 'IH', 'IH']
['IH', 'HH', 'IH', 'II', 'IH', 'HH', 'HH', 'II', 'IH', 'IH']
['HI', 'II', 'IH', 'IH', 'IH', 'HH', 'HI', 'IH', 'HI', 'HH']
['IH', 'HH', 'HI', 'HI', 'HH', 'HI', 'HH', 'II', 'IH', 'HH']
['II', 'HH', 'HH', 'HH', 'HH', 'HH', 'HI', 'HH', 'HH', 'II']
['IH', 'HI', 'HI', 'HH', 'HI', 'HH', 'HH', 'HH', 'IH', 'HI']
['IH', 'HH', 'HI', 'IH', 'HH', 'HH', 'HI', 'HI', 'HI', 'HH']
['HI', 'HH', 'HH', 'HH', 'HH', 'HH', 'HI', 'HI', 'HH', 'HH']
['HH', 'HH', 'IH', 'HH', 'HI', 'HH', 'HH', 'HH', 'HH', 'HH']
['HH', 'HH', 'HH', 'IH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH']
['HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH', 'HH']

['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['EF', 'EI', 'DI', 'BF', 'EJ', 'BI', 'AI', 'CI', 'EJ', 'EI']
['FB', 'EE', 'EI', 'JB', 'FE', 'BJ', 'JE', 'EI', 'EE', 'DI']
['FJ', 'IE', 'FE', 'JE', 'EJ', 'FI', 'FE', 'BJ', 'EB', 'FJ']
['FF', 'II', 'EB', 'FI', 'EI', 'EJ', 'EB', 'IB', 'EI', 'EF']
['FE', 'IE', 'BE', 'BB', 'BE', 'EB', 'EB', 'IE', 'BI', 'FJ']
['BB', 'FE', 'FE', 'EE', 'EE', 'BJ', 'II', 'EE', 'BE', 'FE']
['BB', 'EB', 'EB', 'EE', 'EI', 'EE', 'FI', 'EE', 'EE', 'EE']
['EF', 'BE', 'BE', 'EE', 'BE', 'IE', 'BI', 'EE', 'EE', 'EE']
['EE', 'BE', 'FE', 'EB', 'EE', 'BE', 'BE', 'BE', 'FE', 'EE']
['EF', 'EE', 'EE', 'EE', 'EE', 'EE', 'FE', 'EB', 'BE', 'EE']
['EE', 'FB', 'EE', 'EE', 'EE', 'EB', 'EE', 'EE', 'EE', 'EE']
['EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'BE', 'EE', 'EE']
['EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE', 'EE']

['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ']
['AF', 'AJ', 'AF', 'CG', 'BG', 'BH', 'AH', 'EG', 'DG', 'DJ']
['AA', 'AH', 'CA', 'AJ', 'BJ', 'GB', 'AB', 'GH', 'BB', 'AA']
['JH', 'AB', 'AA', 'AG', 'AB', 'JB', 'AH', 'AA', 'AA', 'AG']
['GA', 'AA', 'AG', 'BH', 'AB', 'BA', 'GA', 'AA', 'AA', 'AA']
['AA', 'AA', 'AA', 'AA', 'AG', 'AA', 'AA', 'AA', 'AB', 'AA']
['AA', 'AA', 'AB', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA']
['BA', 'AA', 'AA', 'AA', 'AA', 'BA', 'AA', 'AA', 'AA', 'AA']
['AB', 'AA', 'AA', 'AB', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA']
['AA', 'AA', 'BA', 'AA', 'BA', 'AA', 'AA', 'AA', 'AA', 'AA']
['AA', 'AA', 'AA', 'BA', 'BA', 'AA', 'AA', 'AA', 'AA', 'AA']
['AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'BA', 'AA', 'BA', 'AA']
['AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AB', 'AB', 'AA']
['AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA', 'AA']
Column labels (each run): Males, Females

Similar Distributions
trials
Mode = 24 (291)
Mean = 35.5
These statistics are almost exactly 2x the haploid (Mode = 11, Mean = 17.5)
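One way to see why a factor of about 2 is plausible is the standard coalescent approximation, which these slides have not derived yet. This is a sketch under that approximation, not the slides' own argument. Here N is the number of gene copies in the population, n the number of lineages being traced, and T_k the waiting time (in generations) while k lineages remain; all three symbols are introduced only for this illustration. The time forward until uniformity has the same distribution as the time back to the most recent common ancestor of the whole population, so n = N.

```latex
\begin{align*}
E[T_k] &= \frac{N}{\binom{k}{2}} = \frac{2N}{k(k-1)}, \\
E[T_{\mathrm{MRCA}}] &= \sum_{k=2}^{n} E[T_k]
  = 2N \sum_{k=2}^{n} \left( \frac{1}{k-1} - \frac{1}{k} \right)
  = 2N \left( 1 - \frac{1}{n} \right).
\end{align*}
```

With n = N = 10 (haploid) this gives 18 generations, and with n = N = 20 (ten diploid individuals) it gives 38, close to the simulated means of 17.5 and 35.5. The approximation assumes at most one pair of lineages merges per generation and ignores the two-sex structure, so exact agreement is not expected; the point is only that the expected time scales almost linearly with the number of gene copies.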

Punnett Squares

AA x BB
      A   A
  B   AB  AB
  B   AB  AB

AB x AB
      A   B
  A   AA  AB
  B   AB  BB

AB x BB
      A   B
  B   AB  BB
  B   AB  BB

What if we introduce inbreeding, by choosing mates from common litters, in successive generations?

Different Problem?
Parents                  Offspring
('A', 'A') ('B', 'B')    [('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'A')    [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'A') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('B', 'B')    [('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('B', 'B')    [('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'A') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'B') ('A', 'A')    [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'A') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'A'), ('A', 'B')]
('A', 'A') ('A', 'A')    [('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A')]

('A', 'A') ('B', 'B')    [('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('A', 'A')    [('A', 'A'), ('A', 'A'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'B') ('B', 'B')    [('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B')]
('B', 'B') ('B', 'B')    [('B', 'B'), ('B', 'B'), ('B', 'B'), ('B', 'B')]

('A', 'A') ('B', 'B')    [('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B')]
('A', 'B') ('A', 'B')    [('A', 'A'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
('A', 'A') ('A', 'A')    [('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A')]
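The parent/offspring listing above can be reproduced in spirit with a small sibling-mating simulation. This sketch assumes that the four offspring are the cells of the Punnett square and that each of the next two parents is drawn independently from those cells; the listing is consistent with that reading (two ('A', 'A') parents follow an AB x AB cross), but the original code is not shown. The names `punnett` and `sib_mating` and the seed are illustrative.

```python
import random

def punnett(p1, p2):
    """The four equally likely offspring genotypes of a cross, each stored
    as a sorted tuple so ('B', 'A') and ('A', 'B') compare equal."""
    return [tuple(sorted((a, b))) for a in p1 for b in p2]

def sib_mating(p1, p2, rng):
    """Mate siblings generation after generation. Stops once both parents
    are homozygous for the same allele; returns (parent, parent, offspring) rows."""
    rows = []
    while True:
        offspring = punnett(p1, p2)
        rows.append((p1, p2, offspring))
        if len(set(p1 + p2)) == 1:  # only one allele left in the pair
            return rows
        # assumption: each new parent is an independent draw from the square
        p1, p2 = rng.choice(offspring), rng.choice(offspring)

rng = random.Random(4)  # arbitrary seed (assumption)
for p1, p2, offspring in sib_mating(('A', 'A'), ('B', 'B'), rng):
    print(p1, p2, offspring)
```

Each run ends with one allele lost, so inbreeding within a single family shows the same drift toward uniformity as the population-level models, only on a much shorter time scale.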

Distributions

Next time
Commonly occurring distributions
 – Geometric
 – Exponential