RECOMB Satellite Workshop, 2007 Algorithms for Association Mapping of Complex Diseases With Ancestral Recombination Graphs Yufeng Wu UC Davis.


1 RECOMB Satellite Workshop, 2007 Algorithms for Association Mapping of Complex Diseases With Ancestral Recombination Graphs Yufeng Wu UC Davis

2 Association (or LD) Mapping
Given a subset of SNPs from unrelated individuals, find unobserved genetic variations that strongly discriminate individuals with the trait (cases) from those without the trait (controls).
Complex diseases: difficult to map

3 Illustration (Zollner and Pritchard, Genetics, 2005)
[Figure: SNP markers for 12 sequences, grouped into cases and controls]
1: 001101
2: 110000
3: 001110
4: 001000
5: 000010
6: 111101
7: 100011
8: 110001
9: 110010
10: 100011
11: 010000
12: 101101

4 Some Challenges in Association Mapping

5 The Genealogy Approach
“..the best information that we could possibly get about association is to know the full coalescent genealogy…” – Zollner and Pritchard
Goal: infer genealogy from marker data with recombination
–Approximation (e.g. in Zollner and Pritchard)

6 Ancestral Recombination Graph (ARG)
[Figure: two histories. With mutations only: S1 = 00, S2 = 01, S3 = 10, S4 = 10. With mutations and a recombination: S1 = 00, S2 = 01, S3 = 10, S4 = 11.]
Assumption: at most one mutation per site
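The role of recombination on this slide can be checked with the standard four-gamete test: under the at-most-one-mutation-per-site assumption, a pair of sites that shows all four combinations 00, 01, 10, 11 cannot be explained by mutations on a tree alone. The following is a minimal sketch, not code from the talk; the function name `four_gamete_violations` is hypothetical.

```python
from itertools import combinations

def four_gamete_violations(seqs):
    """Return the pairs of sites (i, j) at which all four gametes appear.

    seqs: list of equal-length binary strings (hypothetical input format).
    """
    n_sites = len(seqs[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(s[i], s[j]) for s in seqs}
        if len(gametes) == 4:  # 00, 01, 10 and 11 all present
            violations.append((i, j))
    return violations

# Sequences from the slide: S4 = 10 needs no recombination, S4 = 11 does.
print(four_gamete_violations(["00", "01", "10", "10"]))  # []
print(four_gamete_violations(["00", "01", "10", "11"]))  # [(0, 1)]
```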

7 Full-ARG Approaches
First full ARG mapping method (Minichiello and Durbin)
–Use full plausible ARG, but heuristic
–Less complex disease model
Our results (Wu, 2007)
–Sampling full ARGs with provable property, and work on more complex disease model
–Focus on parsimonious history
  minARGs: ARGs that use the minimum number of recombinations
  Near minimum ARGs
–Uniform sampling of minARGs

8 Special Case: ARG with Only Input Sequences
Self-derivability (SD) Problem: construct an ARG with only the input sequences
In fact, such an ARG, if it exists, must be a minARG
Runs in O(2^n) time
Heuristics to extend to non-self-derivable data

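9 Counting Self-derived ARGs
[Figure: input sequences 00000 01000 01100 01101 11000 00010 11011 00011, with two options labeled 1 and 2 for the last row to derive]
Last row 11011 (other rows 00000 01000 01100 01101 11000 00010 00011): N1 = 164
Last row 01101 (other rows 00000 01000 01100 11000 00010 11011 00011): N2 = 76
N = 164*1 + 76*2 = 316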

10 [Figure: the same matrices as on the previous slide, with counts 164 and 76 and total 316]
Select 11011 with prob = 164/316 = 0.52, and 01101 with prob = 76*2/316 = 0.48
1. Random value Rnd = 0.3 < 0.52
2. Pick seq = 11011 as last row to derive
3. Move to reduced matrix
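The selection step above can be sketched as a weighted choice. Picking each candidate last row with probability proportional to the number of ARGs it leads to is what makes the overall sample uniform. This is an illustrative sketch only: the weights 164*1 and 76*2 are copied from the slide, and the helper name `pick_last_row` is hypothetical.

```python
def pick_last_row(weights, rnd):
    """Pick a sequence with probability proportional to its weight.

    weights: list of (sequence, weight) pairs; rnd: a value in [0, 1).
    """
    total = sum(w for _, w in weights)
    cumulative = 0.0
    for seq, w in weights:
        cumulative += w / total
        if rnd < cumulative:
            return seq
    return weights[-1][0]  # guard against floating-point round-off

# Weights from the slide: 164*1 for 11011 and 76*2 for 01101 (total 316).
weights = [("11011", 164 * 1), ("01101", 76 * 2)]
print(pick_last_row(weights, 0.3))  # 11011, since 0.3 < 164/316
```

In practice `rnd` would come from a random number generator such as `random.random()`; it is passed in explicitly here so the slide's Rnd = 0.3 case can be reproduced.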

11 An ARG Represents a Set of Marginal Trees
Clear separation of cases/controls: NOT expected for complex diseases!

12 Disease Model (Zollner & Pritchard)
Disease mutations: Poisson process
Two alleles: wild-type and mutant
[Figure: tree with values 0.05, 0.1, 0.05]

13 Disease Penetrance (Zollner & Pritchard)
P_{A,1}: probability that a mutant sequence becomes a case
P_{C,1} = 1.0 - P_{A,1}
P_{A,0}: probability that a wild-type sequence becomes a case
P_{C,0} = 1.0 - P_{A,0}
[Figure: tree with values 0.05, 0.1, 0.05 and leaves marked Case or Control]

14 Phenotype Likelihood (Zollner and Pritchard)
Given a tree T_x at position x and the case/control phenotype Φ of its leaves, what is the probability Pr(Φ | T_x) of observing Φ on T_x?
–Sum over all subsets of mutated edges
Adopted in this work
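The sum over subsets of mutated edges can be written out directly for small trees. The sketch below is a brute-force illustration under stated assumptions, not the method used in the talk: each edge is assumed to carry a disease mutation independently with a given probability, a leaf is assumed to be mutant if any edge on its path to the root is mutated, and the function and parameter names are hypothetical.

```python
from itertools import product

def phenotype_likelihood(parent, edge_mut_prob, phenotype, p_a1, p_a0):
    """Brute-force probability of the leaf phenotypes given a tree.

    parent: node -> parent node (the root does not appear as a key)
    edge_mut_prob: node -> assumed probability that the edge above it is mutated
    phenotype: leaf -> "case" or "control"
    p_a1, p_a0: penetrance of mutant and wild-type sequences
    """
    edges = list(parent)
    total = 0.0
    # Enumerate every subset of mutated edges (exponential in the tree size).
    for pattern in product([False, True], repeat=len(edges)):
        mutated = dict(zip(edges, pattern))
        prob = 1.0
        for e in edges:
            prob *= edge_mut_prob[e] if mutated[e] else 1.0 - edge_mut_prob[e]
        for leaf, label in phenotype.items():
            # Walk to the root; the leaf is mutant if any edge on the way is mutated.
            node, is_mutant = leaf, False
            while node in parent:
                is_mutant = is_mutant or mutated[node]
                node = parent[node]
            p_case = p_a1 if is_mutant else p_a0
            prob *= p_case if label == "case" else 1.0 - p_case
        total += prob
    return total

# Two leaves under one root, with fully penetrant mutations.
parent = {"a": "r", "b": "r"}
probs = {"a": 0.1, "b": 0.1}
print(phenotype_likelihood(parent, probs, {"a": "case", "b": "control"}, 1.0, 0.0))
```

With penetrances 1.0 and 0.0, the only contributing subset is the one where the edge above `a` is mutated and the edge above `b` is not, so the result is 0.1 * 0.9.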

15 Expected Phenotype Likelihood
Needed for assessing statistical significance.
Null model: randomly permute case/control labels.
Our result: O(n^3) algorithm for computing the expected value of the phenotype likelihood.
–Exact, fully deterministic method.
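For intuition about the null model, the expectation can be computed on tiny inputs by enumerating every placement of the case labels. This is an exponential-time reference sketch, not the O(n^3) algorithm from the talk; the function name and the `likelihood` callback are hypothetical, and any phenotype likelihood function can be plugged in.

```python
from itertools import combinations

def expected_under_permutation(leaves, n_cases, likelihood):
    """Average likelihood(labels) over all ways to place n_cases case labels.

    Randomly permuting the labels makes every such placement equally likely.
    """
    total, count = 0.0, 0
    for cases in combinations(leaves, n_cases):
        case_set = set(cases)
        labels = {leaf: ("case" if leaf in case_set else "control") for leaf in leaves}
        total += likelihood(labels)
        count += 1
    return total / count

# Toy likelihood: 1 if leaf "a" is a case, else 0.
toy = lambda labels: 1.0 if labels["a"] == "case" else 0.0
print(expected_under_permutation(["a", "b", "c", "d"], 2, toy))  # 0.5
```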

16 Diploid Penetrance
Diploid: two sequences per individual
Diploid penetrance: P_{A,00}: probability that an individual with two wild-type sequences becomes a case; P_{A,01}: …, P_{A,11}: …
Efficient computation of phenotype likelihood: stated but unresolved in Zollner and Pritchard
Our result (Wu, 2007): computing phenotype likelihood with diploid penetrance is NP-hard

17 Simulation Results
Comparison: TMARG (uniform), TMARG (pathway), LATAG, MARGARITA

18 Acknowledgement
Software available at: http://wwwcsif.cs.ucdavis.edu/~wuyu
I want to thank
–Dan Gusfield
–Dan Brown
–Chuck Langley
–Yun S. Song

