1
Tutorial #6 by Ma’ayan Fishelson Based on notes by Terry Speed
2
Background – Lander & Green’s HMM
Complexity:
– Linear in the number of loci and in the number of founders.
– Exponential in the number of non-founders.
Recombinations across successive intervals are independent, which enables sequential computation across loci using the forward-backward algorithm. The algorithm computing the probability of the data given an inheritance vector is linear in the number of founders. We need to sum over all possible inheritance vectors (exponential in the number of non-founders).
3
Goal Compute Pr[m_l | v_l] at locus l, where m_l is the marker data at this locus (the evidence) and v_l is a certain inheritance vector.
4
References The algorithm presented herein was introduced by Sobel and Lange [1], and Kruglyak et al. [2].
1. E. Sobel and K. Lange. Descent graphs in pedigree analysis: applications to haplotyping, location score, and marker-sharing statistics. Am. J. Hum. Genet., 58:1323-1337, 1996.
2. L. Kruglyak, M.J. Daly, M.P. Reeve-Daly, and E.S. Lander. Parametric and nonparametric linkage analysis: a unified multipoint approach. Am. J. Hum. Genet., 58:1347-1363, 1996.
5
Main Idea Let a = (a_1, …, a_2f) be a vector of alleles assigned to the founder genes of the pedigree (f is the number of founders). We want to represent by a graph the restrictions that the observed marker genotypes impose on the vectors a that can be assigned to the founder genes. The algorithm extracts from the graph only the vectors a compatible with the marker data. Pr[m|v] is obtained via a sum over all compatible vectors a.
6
Example – marker data on a pedigree [Figure: pedigree diagram; individual labels include 11, 12, 13, 14, 21, 22, 23, 24; the observed genotypes are a/b, a/b, b/d, and a/c.]
7
Descent Graph Corresponds to a specific inheritance vector. Vertices: the individuals’ genes (2 genes for each individual in the pedigree). Edges: represent the gene flow specified by the inheritance vector. A child’s gene is connected by an edge to the parent’s gene from which it flowed.
8
Example – Descent Graph (vertices) Assume that the descent graph vertices below represent the pedigree on the left. [Figure: the pedigree next to the descent graph vertices; founder genes are labeled 1–8 and the genotyped individuals carry the labels (a,b), (a,c), (b,d), (a,b).]
9
Example – Descent Graph (cont.) [Figure: descent graph with founder genes labeled 1–8 and genotype labels (a,b), (a,c), (b,d), (a,b); the edges show the gene flow.]
1. Assume that paternally inherited genes are on the left.
2. Assume that non-founders are placed in increasing order.
3. A ‘1’ (‘0’) is used to denote a paternally (maternally) originated gene.
The gene flow above corresponds to the inheritance vector: v = (1,1; 0,0; 1,1; 1,1; 1,1; 0,0)
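The conventions above can be sketched in code. The following is a minimal illustration, not the tutorial's implementation: the small pedigree (founders A, B, E and non-founders C, D, G), the function name `trace_founder_genes`, and the reading of each bit as "the parent passed on its paternal (1) or maternal (0) gene" are assumptions made for the example.

```python
def trace_founder_genes(founders, children, v):
    """Return {individual: (paternal_gene, maternal_gene)} as founder-gene labels.

    founders: founder names; founder k (0-based) carries genes 2k+1 and 2k+2.
    children: (name, father, mother) triples, parents listed before children.
    v: one (paternal_bit, maternal_bit) pair per non-founder, in the same order.
       Assumed reading: bit 1 = the parent passed on its paternal gene,
       bit 0 = the parent passed on its maternal gene.
    """
    genes = {}
    for k, name in enumerate(founders):
        genes[name] = (2 * k + 1, 2 * k + 2)
    for (name, father, mother), (p_bit, m_bit) in zip(children, v):
        from_father = genes[father][0] if p_bit == 1 else genes[father][1]
        from_mother = genes[mother][0] if m_bit == 1 else genes[mother][1]
        genes[name] = (from_father, from_mother)
    return genes


# Hypothetical three-founder pedigree, not the one drawn on the slide.
founders = ["A", "B", "E"]
children = [("C", "A", "B"), ("D", "A", "B"), ("G", "C", "E")]
v = [(1, 0), (0, 0), (0, 1)]
genes = trace_founder_genes(founders, children, v)
print(genes["C"], genes["D"], genes["G"])
```

Following the edges of the descent graph from a non-founder gene up to a founder gene is exactly what the lookup into `genes[father]` and `genes[mother]` does here, one generation at a time.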
10
Founder Graph Vertices: the founder genes. Edges: connect the genes appearing together in a genotyped individual for the gene flow specified by the inheritance vector v. Note: the edges are labeled with the genotype of the corresponding individuals.
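As a rough sketch of this construction, each genotyped individual contributes one edge between the two founder genes it carries under v, labeled with its genotype. The names `founder_graph_edges`, `gene_sources`, and the sample data are hypothetical and only serve the illustration.

```python
def founder_graph_edges(gene_sources, genotypes):
    """Build one labeled edge per genotyped individual.

    gene_sources: {individual: (founder_gene, founder_gene)} for a fixed v.
    genotypes: {individual: (allele, allele)}, genotyped individuals only.
    Returns a list of (gene_u, gene_w, genotype) triples.
    """
    edges = []
    for person, genotype in genotypes.items():
        u, w = gene_sources[person]
        edges.append((u, w, tuple(sorted(genotype))))
    return edges


# Hypothetical data: three individuals, two of them genotyped.
gene_sources = {"C": (1, 4), "D": (2, 4), "G": (4, 5)}
genotypes = {"C": ("a", "b"), "G": ("c", "b")}
edges = founder_graph_edges(gene_sources, genotypes)
print(edges)
```

An individual who inherited the same founder gene twice would produce a self-loop, and two individuals sharing both founder genes would produce parallel edges; the list-of-triples representation allows both.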
11
Example – Founder Graph [Figure: founder graph on vertices 1–8 with edges labeled (b,d), (a,b), (a,c), (a,b), shown next to the descent graph from which it is derived.]
12
Founder Graph The founder graph consists of m connected components, C_1, …, C_m. By construction, founder genes assigned to different components appear in different genotyped individuals. Under random mating and Hardy-Weinberg equilibrium, the vectors of alleles assigned to different components are independent, so each component can be processed individually.
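A minimal sketch of splitting a founder graph into its connected components with a depth-first search. The edge list `EDGES` is an assumed topology, chosen so that the components agree with those tabulated later in the tutorial ((2), (1,3,5), (4,6,7,8)); the slide's exact figure is not reproduced here.

```python
from collections import defaultdict

# Assumed founder graph on genes 1..8: (gene_u, gene_w, genotype) per edge.
EDGES = [
    (1, 3, ("a", "b")),
    (3, 5, ("a", "b")),
    (4, 6, ("a", "b")),
    (4, 7, ("a", "c")),
    (6, 8, ("b", "d")),
]


def connected_components(vertices, edges):
    """Return the connected components as sorted lists of vertices."""
    adjacency = defaultdict(set)
    for u, w, _genotype in edges:
        adjacency[u].add(w)
        adjacency[w].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


components = connected_components(range(1, 9), EDGES)
print(components)
```

Vertices with no incident edge, such as gene 2 here, come out as one-element components.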
13
Singleton Components The vertices corresponding to genes that never passed through genotyped individuals form singleton components. Any allele type can be assigned to singleton components. [Figure: the example founder graph with its singleton component highlighted.]
14
Singleton Components (cont.) [Figure: the example descent graph, with founder genes labeled 1–8 and genotype labels (a,b), (a,c), (b,d), (a,b).]
15
Find compatible allelic assignments for non-singleton components 1. Identify the set of compatible alleles for each vertex. This is the intersection of the genotypes attached to the edges incident to the vertex. Examples from the founder graph: {a,b} ∩ {a,b} = {a,b}; {a,b} ∩ {b,d} = {b}.
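Step 1 amounts to a set intersection per vertex, as in this short sketch. The edge list `EDGES` is an assumed topology (not read off the slide's figure), and `compatible_alleles` is a name introduced here for illustration.

```python
# Assumed founder graph: (gene_u, gene_w, genotype) per genotyped individual.
EDGES = [
    (1, 3, ("a", "b")),
    (3, 5, ("a", "b")),
    (4, 6, ("a", "b")),
    (4, 7, ("a", "c")),
    (6, 8, ("b", "d")),
]


def compatible_alleles(vertex, edges, all_alleles):
    """Intersect the genotypes on every edge incident to `vertex`.

    A vertex with no incident edge (a singleton) keeps every allele.
    """
    allowed = set(all_alleles)
    for u, w, genotype in edges:
        if vertex in (u, w):
            allowed &= set(genotype)
    return allowed


for gene in range(1, 9):
    print(gene, sorted(compatible_alleles(gene, EDGES, "abcd")))
```

With this assumed graph, vertex 3 keeps {a,b} and vertex 6 is reduced to {b}, matching the two intersections quoted above.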
16
Find compatible allelic assignments for non-singleton components (cont.) 2. Utilize the whole structure of the component to find allelic assignments compatible with the observed genotypes for the component.
I. Pick an arbitrary vertex in the component.
II. If the set of compatible alleles for that vertex contains one element, select that allele type. Otherwise, repeat step III for each of the 2 allele types.
III. Traverse the graph and record the alleles assigned to each vertex to obtain a compatible allelic assignment (once one allele type is selected, the allele types of the adjacent vertices are determined).
IV. If an incompatibility is encountered at some point, there is no compatible assignment for the allele type we started from.
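Steps I to IV can be sketched as a graph traversal that propagates forced alleles. This is one possible reading of the procedure, not the authors' code; the function name, the edge list `EDGES`, and the choice of the first vertex as the starting point are assumptions.

```python
# Assumed founder graph: (gene_u, gene_w, genotype) per genotyped individual.
EDGES = [
    (1, 3, ("a", "b")),
    (3, 5, ("a", "b")),
    (4, 6, ("a", "b")),
    (4, 7, ("a", "c")),
    (6, 8, ("b", "d")),
]


def component_assignments(component, edges):
    """List the allelic assignments compatible with a non-singleton component."""
    incident = {gene: [] for gene in component}
    for u, w, genotype in edges:
        if u in incident:
            incident[u].append((w, genotype))
            incident[w].append((u, genotype))
    start = component[0]                      # step I: arbitrary vertex
    if not incident[start]:
        raise ValueError("singleton components are handled separately")
    # Step II: at most 2 candidate alleles for the starting vertex.
    candidates = set.intersection(*(set(g) for _, g in incident[start]))
    results = []
    for candidate in sorted(candidates):
        assignment = {start: candidate}
        stack, ok = [start], True
        while stack and ok:                   # step III: traverse and record
            node = stack.pop()
            for neighbor, (x, y) in incident[node]:
                if assignment[node] == x:
                    forced = y
                elif assignment[node] == y:
                    forced = x
                else:
                    ok = False                # step IV: incompatibility
                    break
                if neighbor not in assignment:
                    assignment[neighbor] = forced
                    stack.append(neighbor)
                elif assignment[neighbor] != forced:
                    ok = False                # step IV: incompatibility
                    break
        if ok:
            results.append(tuple(assignment[gene] for gene in component))
    return results


print(component_assignments([1, 3, 5], EDGES))
print(component_assignments([4, 6, 7, 8], EDGES))
```

Because the allele at one vertex fixes the allele at each neighbor, every starting allele yields at most one assignment, which is why a component never has more than two.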
17
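Possible Allelic Assignments (example) [Figure: founder graph with edges labeled (b,d), (a,b), (a,c), (a,b); the vertices are annotated with their compatible allele sets {a,b}, {a,c}, {a}, {b}, {b,d}, {a,b,c,d}.]
Graph Component    Allelic Assignments
(2)                (a), (b), (c), (d)
(1,3,5)            (a,b,a), (b,a,b)
(4,6,7,8)          (a,b,c,d)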
18
Compatible Allelic Assignments Denote by A_1, …, A_m the sets of compatible allelic assignments obtained for the connected components C_1, …, C_m at the end of the algorithm. Except for singleton components, each A_i contains 0, 1, or 2 assignments. If A_i is empty for some i, then Pr[m|v] = 0. The compatible assignments are those in the Cartesian product A_1 × … × A_m.
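The Cartesian product can be enumerated directly, as in the sketch below. The per-component sets are taken from the example table of assignments; the function name `full_assignments` is introduced here only for illustration, and singleton components are left out.

```python
from itertools import product


def full_assignments(per_component):
    """Enumerate A_1 x ... x A_m; the result is empty if any A_i is empty."""
    return list(product(*per_component))


A_135 = [("a", "b", "a"), ("b", "a", "b")]   # component (1,3,5)
A_4678 = [("a", "b", "c", "d")]              # component (4,6,7,8)

combined = full_assignments([A_135, A_4678])
for assignment in combined:
    print(assignment)

# An empty A_i leaves no compatible assignment at all, i.e. Pr[m|v] = 0.
print(full_assignments([A_135, []]))
```

In practice the product never needs to be materialized, since the components are independent and can be summed over one at a time.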
19
Computing Pr[m|v] The probability of singleton components is 1, so we can ignore them. Let a_hi be an element of A_i (a vector of alleles assigned to the vertices of component C_i).
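One way to write the resulting quantity, consistent with the independence of components under Hardy-Weinberg equilibrium, is as a product over the non-singleton components of a sum over their compatible assignments. The symbol q(x) for the population frequency of allele x and the indexing a_hi[j] for the allele placed on vertex j are notational assumptions made here.

```latex
\Pr[m \mid v] \;=\; \prod_{i=1}^{m} \; \sum_{a_{hi} \in A_i} \Pr[a_{hi}],
\qquad
\Pr[a_{hi}] \;=\; \prod_{j \in C_i} q\bigl(a_{hi}[j]\bigr)
```

For a singleton component the inner sum runs over every allele and the frequencies add up to 1, which is why such components drop out.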
20
Computing Pr[m|v] – Complexity The summation for each component contains at most 2 terms. The product is over 2f elements (the founder genes). The maximum number of operations is 4f. The computation scales linearly with the number of founders.
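The linear bound can be seen in a direct computation: each compatible assignment costs one multiplication per founder gene in its component, and each component has at most two assignments. The sketch below assumes Hardy-Weinberg equilibrium; the allele frequencies in `freq` are made-up values, and the function name is hypothetical.

```python
from math import prod


def prob_marker_given_v(per_component, freq):
    """Multiply, over components, the summed probability of their assignments.

    per_component: list of A_i, each a list of allele tuples (singletons omitted).
    freq: {allele: population frequency}.
    """
    total = 1.0
    for assignments in per_component:
        # An empty A_i gives a sum of 0, so the whole product becomes 0.
        total *= sum(prod(freq[allele] for allele in a) for a in assignments)
    return total


# Made-up allele frequencies, for illustration only.
freq = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
A_135 = [("a", "b", "a"), ("b", "a", "b")]   # component (1,3,5)
A_4678 = [("a", "b", "c", "d")]              # component (4,6,7,8)

p = prob_marker_given_v([A_135, A_4678], freq)
print(p)
```

With these frequencies, component (1,3,5) contributes 0.048 + 0.036 = 0.084 and component (4,6,7,8) contributes 0.0024, so the result is about 2.0e-4.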