Grouping loci Criteria Maximum two-point recombination fraction –Example -r ij ≤ 0.40 Minimum LOD score - Z ij –For n loci, there are n(n-1)/2 possible.


1 Grouping loci
Criteria:
Maximum two-point recombination fraction
–Example: r_ij ≤ 0.40
Minimum LOD score, Z_ij
–For n loci, there are n(n-1)/2 possible pairwise combinations that will be tested
–Expect a nonzero probability of false positives across these tests
Significant probability value, p_ij
–Example: p_ij ≤ 0.00001
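
The grouping criteria above can be sketched as a small single-linkage clustering: two loci join the same group when their pair passes both thresholds. This is only an illustration. The pairwise values are hypothetical, the function names are made up for the example, and the LOD threshold of 3.0 is an assumption (the slide gives no value for it).

```python
def n_pairs(n_loci):
    """Number of two-point tests among n loci: n(n-1)/2."""
    return n_loci * (n_loci - 1) // 2


def group_loci(n_loci, pairs, max_r=0.40, min_lod=3.0):
    """Group loci by linking every pair with r_ij <= max_r and Z_ij >= min_lod.

    pairs maps (i, j) -> (r_ij, lod_ij); pairs that are not listed are
    treated as unlinked. min_lod=3.0 is an assumed default.
    """
    parent = list(range(n_loci))

    def find(x):
        # Follow parent pointers to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), (r, lod) in pairs.items():
        if r <= max_r and lod >= min_lod:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for locus in range(n_loci):
        groups.setdefault(find(locus), []).append(locus)
    return sorted(groups.values())


# Hypothetical two-point results for five loci (indices 0..4).
pairs = {
    (0, 1): (0.10, 8.0),
    (1, 2): (0.25, 4.2),
    (0, 2): (0.32, 2.1),   # fails the LOD threshold on its own
    (2, 3): (0.48, 0.1),   # fails both thresholds
    (3, 4): (0.05, 12.0),
}
linkage_groups = group_loci(5, pairs)
print(n_pairs(5), linkage_groups)
```

Loci 0 and 2 end up in the same group through locus 1 even though their own pair fails the LOD threshold, which is the usual behaviour of this kind of transitive grouping.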

2 Locus ordering
Ideally, we would compute the likelihood of every possible order and choose the most probable one by comparing log likelihoods.
That is computationally impractical when there are more than ~10 loci.
Several methods have been proposed for producing a preliminary order.

3 Locus ordering
No. of loci (k)   Possible orders   No. of triplets
 2                1                 0
 3                3                 1
 5                60                10
10                1,814,400         120
20                1.22 × 10^18      1,140
40                4.08 × 10^47      9,880
Number of orders among k loci: k!/2
Number of triplets among k loci: k(k-1)(k-2)/6
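
The counts in this table follow from k!/2 (each order and its mirror image are the same map) and from the binomial coefficient "k choose 3". A short sketch that recomputes them, with function names chosen only for this example:

```python
from math import comb, factorial


def n_orders(k):
    """Distinct locus orders among k loci; an order and its mirror count once."""
    return factorial(k) // 2


def n_triplets(k):
    """Number of three-locus subsets among k loci."""
    return comb(k, 3)


for k in (2, 3, 5, 10, 20, 40):
    print(f"k={k:>2}  orders={n_orders(k):.3g}  triplets={n_triplets(k)}")
```

The number of orders grows factorially while the number of triplets grows only cubically, which is why three-point analyses remain tractable when exhaustive ordering does not.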

4 Three-point Analysis
Number of unique orders among k loci, for three loci (k = 3):
Order   Mirror order
ABC     CBA
ACB     BCA
BAC     CAB

5 Three-point analysis

6

7 Non-Additivity of recombination frequencies
[Figure: loci A, B and C in that order, with the recombination frequencies r_AB, r_BC and r_AC marked over the intervals]
The recombination frequency over the interval A–C (r_AC) is less than the sum of r_AB and r_BC: r_AC < r_AB + r_BC.
This is because (rare) double recombination events (a recombination in both A–B and B–C) do not contribute to recombination between A and C.

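8 Non-Additivity of recombination frequencies
[Figure: four A-B-C chromosomes, one for each recombination outcome]
P_00 = (1 - r_AB)(1 - r_BC)   (no recombination in either interval)
P_10 = r_AB (1 - r_BC)        (recombination in A–B only)
P_01 = (1 - r_AB) r_BC        (recombination in B–C only)
P_11 = r_AB r_BC              (recombination in both intervals)
r_AC = r_AB (1 - r_BC) + (1 - r_AB) r_BC
r_AC = r_AB + r_BC - 2 r_AB r_BC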
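
As a quick numerical check of the derivation above, the sketch below enumerates the four outcomes under independence and compares P_10 + P_01 with the closed form. The input values 0.10 and 0.20 are hypothetical and the function name is illustrative.

```python
def outcome_probabilities(r_ab, r_bc):
    """Probabilities of the four recombination outcomes, assuming the two
    intervals recombine independently (no interference)."""
    return {
        "00": (1 - r_ab) * (1 - r_bc),  # no recombination
        "10": r_ab * (1 - r_bc),        # recombination in A-B only
        "01": (1 - r_ab) * r_bc,        # recombination in B-C only
        "11": r_ab * r_bc,              # recombination in both intervals
    }


r_ab, r_bc = 0.10, 0.20  # hypothetical recombination fractions
p = outcome_probabilities(r_ab, r_bc)

# A and C recombine only when exactly one of the two intervals recombines.
r_ac = p["10"] + p["01"]
closed_form = r_ab + r_bc - 2 * r_ab * r_bc
print(r_ac, closed_form, r_ab + r_bc)
```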

9 Interference
Interference means that recombination events in adjacent intervals are not independent: the occurrence of an event in a given interval may reduce or enhance the occurrence of an event in its neighbourhood.
Positive interference refers to the 'suppression' of recombination events in the neighbourhood of a given one.
Negative interference refers to the opposite: enhancement of clusters of recombination events.
Positive interference results in fewer double recombinants (over adjacent intervals) than expected on the basis of independence of recombination events.
With the coefficient of coincidence C: r_AC = r_AB + r_BC - 2C r_AB r_BC

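10 Interference
C = coefficient of coincidence
Interference: I = 1 - C
[Figure: parental chromosomes A B C / a b c]
Coefficient of coincidence: C = observed number of double crossovers / expected number of double crossovers
Expected number of double crossovers = r_AB r_BC N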

11 Observed counts: 22, 24, 16, 14, 8, 10, 2, 4 (DH population, N = 100, locus order ABC)

12 Interference
No interference
–C = 1 and Interference = 1 - C = 0
Complete interference
–C = 0 and Interference = 1 - C = 1
Negative interference
–C > 1 and Interference = 1 - C < 0
Positive interference
–C < 1 and Interference = 1 - C > 0
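
The cases above can be turned into a small helper that computes C from an observed double-crossover count and names the resulting kind of interference. The numbers in the example call are hypothetical, and the function names and tolerance are choices made for this sketch.

```python
def coefficient_of_coincidence(observed_dc, r_ab, r_bc, n):
    """C = observed double crossovers / expected double crossovers,
    where the expected number is r_AB * r_BC * N."""
    expected_dc = r_ab * r_bc * n
    return observed_dc / expected_dc


def classify_interference(c, tol=1e-9):
    """Name the interference case for a given coefficient of coincidence."""
    if abs(c) <= tol:
        return "complete interference"
    if abs(c - 1.0) <= tol:
        return "no interference"
    return "positive interference" if c < 1.0 else "negative interference"


# Hypothetical data: 3 observed double crossovers among N = 100 progeny.
c = coefficient_of_coincidence(observed_dc=3, r_ab=0.20, r_bc=0.30, n=100)
print(round(c, 3), 1 - c, classify_interference(c))
```

In practice C is an estimate with sampling error, so a value close to 1 would normally be tested rather than classified with a fixed tolerance.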

13 Three locus analysis, DH population
Expected frequencies for the ABC locus order (r_1 = recombination fraction in interval A–B, r_2 = recombination fraction in interval B–C, C = coefficient of coincidence)
Genotype   Observed count   Class   Without interference     With interference
ABC/ABC    f_1              NR      0.5(1 - r_1)(1 - r_2)    0.5(1 - r_1 - r_2 + C r_1 r_2)
ABc/ABc    f_2              SC 2    0.5(1 - r_1) r_2         0.5(r_2 - C r_1 r_2)
AbC/AbC    f_3              DC 12   0.5 r_1 r_2              0.5 C r_1 r_2
Abc/Abc    f_4              SC 1    0.5 r_1 (1 - r_2)        0.5(r_1 - C r_1 r_2)
aBC/aBC    f_5              SC 1    0.5 r_1 (1 - r_2)        0.5(r_1 - C r_1 r_2)
aBc/aBc    f_6              DC 12   0.5 r_1 r_2              0.5 C r_1 r_2
abC/abC    f_7              SC 2    0.5(1 - r_1) r_2         0.5(r_2 - C r_1 r_2)
abc/abc    f_8              NR      0.5(1 - r_1)(1 - r_2)    0.5(1 - r_1 - r_2 + C r_1 r_2)
NR = non-recombinant; SC 1, SC 2 = single crossover in interval 1 or 2; DC 12 = double crossover in intervals 1 and 2

14 MLE of two-locus recombination fractions
For the ABC locus order (r_1 for interval A–B, r_2 for interval B–C):
Genotype   Observed count   Expected frequency
ABC/ABC    f_1 = 34         0.5(1 - r_1 - r_2 + C r_1 r_2)
ABc/ABc    f_2 = 5          0.5(r_2 - C r_1 r_2)
AbC/AbC    f_3 = 11         0.5 C r_1 r_2
Abc/Abc    f_4 = 0          0.5(r_1 - C r_1 r_2)
aBC/aBC    f_5 = 1          0.5(r_1 - C r_1 r_2)
aBc/aBc    f_6 = 10         0.5 C r_1 r_2
abC/abC    f_7 = 4          0.5(r_2 - C r_1 r_2)
abc/abc    f_8 = 35         0.5(1 - r_1 - r_2 + C r_1 r_2)
Regardless of locus order, the MLEs of r are:
r_AB = (f_3 + f_4 + f_5 + f_6)/N = 22/100 = 0.22
r_BC = (f_2 + f_3 + f_6 + f_7)/N = 30/100 = 0.30
r_AC = (f_2 + f_4 + f_5 + f_7)/N = 10/100 = 0.10
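
The two-point estimates for this data set (0.22, 0.30 and 0.10 in the later slides) are simply the proportion of progeny that are recombinant for each pair of loci. A minimal sketch using the observed counts from this slide; the haplotype-string encoding (upper case = one parental allele, lower case = the other) and the function name are conventions chosen for the example.

```python
# Observed DH genotype counts from the slide, keyed by haplotype.
counts = {
    "ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
    "aBC": 1, "aBc": 10, "abC": 4, "abc": 35,
}
LOCI = "ABC"


def recombination_fraction(counts, locus_x, locus_y):
    """Two-point MLE: share of progeny whose alleles at the two loci
    come from different parents (one upper case, one lower case)."""
    i, j = LOCI.index(locus_x), LOCI.index(locus_y)
    total = sum(counts.values())
    recombinants = sum(
        c for hap, c in counts.items() if hap[i].isupper() != hap[j].isupper()
    )
    return recombinants / total


r_ab = recombination_fraction(counts, "A", "B")
r_bc = recombination_fraction(counts, "B", "C")
r_ac = recombination_fraction(counts, "A", "C")
print(r_ab, r_bc, r_ac)
```

Because each estimate only looks at two loci at a time, it does not depend on which locus order is assumed.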

15 Ordering Loci by Minimizing Double Crossovers
Genotype   Observed count
ABC/ABC    f_1 = 34
ABc/ABc    f_2 = 5
AbC/AbC    f_3 = 11
Abc/Abc    f_4 = 0
aBC/aBC    f_5 = 1
aBc/aBc    f_6 = 10
abC/abC    f_7 = 4
abc/abc    f_8 = 35
Pooled genotypes   Observed count
ABC + abc          f_1 + f_8 = 34 + 35 = 69
ABc + abC          f_2 + f_7 = 5 + 4 = 9
AbC + aBc          f_3 + f_6 = 11 + 10 = 21
Abc + aBC          f_4 + f_5 = 0 + 1 = 1
Rarest genotypes are double recombinants: Abc + aBC, which read bAc and BaC when written in the order B-A-C (parental chromosomes BAC/bac with a crossover in each interval).
The order of loci is BAC.
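
The rule used on this slide can be sketched in a few lines: find the rarest pooled class, then take the locus whose allele differs from the other two as the middle locus. The sketch assumes the parental haplotypes are ABC and abc, as in the slides; the dictionary keys (one haplotype per pooled pair) and the function name are illustrative choices.

```python
# Pooled counts from the slide, keyed by one haplotype of each complementary pair.
pooled = {"ABC": 69, "ABc": 9, "AbC": 21, "Abc": 1}


def middle_locus(pooled):
    """Return the locus that sits in the middle, assuming the rarest pooled
    class is the double-recombinant class and the parents are ABC / abc."""
    rarest = min(pooled, key=pooled.get)
    cases = [ch.isupper() for ch in rarest]
    for idx, ch in enumerate(rarest):
        others = [c for k, c in enumerate(cases) if k != idx]
        # The middle locus is the one switched relative to both flanking loci.
        if others[0] == others[1] != cases[idx]:
            return ch.upper()
    raise ValueError("rarest class is parental; no double-recombinant class found")


mid = middle_locus(pooled)
flanks = [locus for locus in "ABC" if locus != mid]
order = flanks[0] + mid + flanks[1]
print(mid, order)
```

With small samples the rarest class can be rare by chance, so this rule is best treated as a quick preliminary ordering.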

16 Ordering Loci by using recombination fractions
MLEs of r are r_AB = 0.22, r_BC = 0.30 and r_AC = 0.10
Largest r is r_BC = 0.3, so B and C are the outer loci
Smallest r is r_AC = 0.1, so A lies between them, closer to C
Order: B A C

17 Minimum Sum of Adjacent Recombination Frequencies (SARF) (Falk 1989)
SARF = sum of r over the l-1 pairs of adjacent loci
r = recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l
Order   SARF
ABC     0.22 + 0.30 = 0.52
BAC     0.22 + 0.10 = 0.32
ACB     0.10 + 0.30 = 0.40
The B-A-C order gives MIN[SARF] and the minimum distance (MD) map
Simulations have shown that SARF is a reliable method to obtain marker orders for large datasets

18 Minimum Product of Adjacent Recombination Frequencies (PARF) (Wilson 1988)
PARF = product of r over the l-1 pairs of adjacent loci
r = recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l
Order   PARF
ABC     0.22 x 0.30 = 0.066
BAC     0.22 x 0.10 = 0.022
ACB     0.10 x 0.30 = 0.030
The B-A-C order gives MIN[PARF] and the minimum distance (MD) map
SARF and PARF are equivalent methods to obtain marker orders for large datasets
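
Both criteria can be evaluated by brute force for a handful of loci. The sketch below enumerates each order once (an order and its mirror are treated as the same), then scores it with the sum and the product of adjacent recombination frequencies from the slides. Function and variable names are illustrative, and exhaustive enumeration is used only because there are three loci.

```python
from itertools import permutations
from math import prod

# Two-point recombination frequencies from the slides.
r = {frozenset("AB"): 0.22, frozenset("BC"): 0.30, frozenset("AC"): 0.10}


def adjacent_r(order):
    """Recombination frequencies between neighbouring loci in the given order."""
    return [r[frozenset(pair)] for pair in zip(order, order[1:])]


def unique_orders(loci):
    """All orders of the loci, keeping one representative per mirror pair."""
    return sorted({min(p, p[::-1]) for p in permutations(loci)})


orders = ["".join(o) for o in unique_orders("ABC")]
sarf = {o: sum(adjacent_r(o)) for o in orders}
parf = {o: prod(adjacent_r(o)) for o in orders}

best_sarf = min(sarf, key=sarf.get)
best_parf = min(parf, key=parf.get)
print(sarf, best_sarf)
print(parf, best_parf)
```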

19 Maximum Sum of Adjacent LOD Scores (SALOD)
SALOD = sum of Z over the l-1 pairs of adjacent loci
Z = LOD score for recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l
Order   SALOD
ABC     3.135 + 1.551 = 4.686
BAC     3.135 + 6.942 = 10.077
ACB     6.942 + 1.551 = 8.493
The B-A-C order gives MAX[SALOD]
SALOD is sensitive to locus informativeness

20 Minimum Count of Crossover Events (COUNT) (Van Os et al. 2005)
COUNT = sum of X over the l-1 pairs of adjacent loci
X = simple count of recombination events between adjacent loci a_i and a_j for a given sequence: 1, 2, 3, …, l-1, l
Order   COUNT
ABC     22 + 30 = 52
BAC     22 + 10 = 32
ACB     10 + 30 = 40
The B-A-C order gives MIN[COUNT]
COUNT is equivalent to SARF and PARF with perfect data.
COUNT is superior to SARF with incomplete data.
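
SALOD and COUNT use the same "sum over adjacent pairs" pattern, with SALOD maximised and COUNT minimised. The sketch below takes the pairwise LOD scores and recombinant counts exactly as printed on the slides as inputs (it does not recompute the LOD scores); the helper name is illustrative.

```python
# Pairwise values as printed on the slides.
lod = {frozenset("AB"): 3.135, frozenset("BC"): 1.551, frozenset("AC"): 6.942}
recombinants = {frozenset("AB"): 22, frozenset("BC"): 30, frozenset("AC"): 10}
orders = ["ABC", "BAC", "ACB"]


def adjacent_sum(order, table):
    """Sum a pairwise statistic over neighbouring loci in the given order."""
    return sum(table[frozenset(pair)] for pair in zip(order, order[1:]))


salod = {o: adjacent_sum(o, lod) for o in orders}
count = {o: adjacent_sum(o, recombinants) for o in orders}

best_salod = max(salod, key=salod.get)  # SALOD is maximised
best_count = min(count, key=count.get)  # COUNT is minimised
print(salod, best_salod)
print(count, best_count)
```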

21 Locus Order - Likelihood Approach
r_1 = Recombination fraction in interval 1
r_2 = Recombination fraction in interval 2
C = Coefficient of coincidence
p_i = f_i / n
f_i = Expected frequency of the i th pooled phenotypic class
i = 1, 2, …, k
k = No. of pooled phenotypic classes

22 Three locus analysis, DH population (same expected-frequency table as slide 13, for the ABC locus order)

23 MLE of two-locus recombination fractions (same table and estimates as slide 14, for the ABC locus order)

24 ABC ORDER (r_1 = 0.22, r_2 = 0.30, estimated C = 3.18)
Haplotypes   Obs. No.    Freq.   Exp. freq.                   Exp. freq. C=0       Exp. freq. C=1
ABC + abc    f_1 = 69    0.69    1 - r_1 - r_2 + C r_1 r_2    1-0.22-0.30=0.48     1-0.22-0.30+0.066=0.546
ABc + abC    f_2 = 9     0.09    r_2 - C r_1 r_2              0.30                 0.30-0.066=0.234
AbC + aBc    f_3 = 21    0.21    C r_1 r_2                    0.00                 0.066
Abc + aBC    f_4 = 1     0.01    r_1 - C r_1 r_2              0.22                 0.22-0.066=0.154
BAC ORDER (r_1 = 0.22, r_2 = 0.10, estimated C = 0.45)
Haplotypes   Obs. No.    Freq.   Exp. freq.                   Exp. freq. C=0       Exp. freq. C=1
ABC + abc    f_1 = 69    0.69    1 - r_1 - r_2 + C r_1 r_2    1-0.22-0.10=0.68     1-0.22-0.10+0.022=0.702
ABc + abC    f_2 = 9     0.09    r_2 - C r_1 r_2              0.10                 0.10-0.022=0.078
AbC + aBc    f_3 = 21    0.21    r_1 - C r_1 r_2              0.22                 0.22-0.022=0.198
Abc + aBC    f_4 = 1     0.01    C r_1 r_2                    0.00                 0.022
ACB ORDER (r_1 = 0.10, r_2 = 0.30, estimated C = 3.00)
Haplotypes   Obs. No.    Freq.   Exp. freq.                   Exp. freq. C=0       Exp. freq. C=1
ABC + abc    f_1 = 69    0.69    1 - r_1 - r_2 + C r_1 r_2    1-0.10-0.30=0.60     1-0.10-0.30+0.03=0.63
ABc + abC    f_2 = 9     0.09    C r_1 r_2                    0.00                 0.03
AbC + aBc    f_3 = 21    0.21    r_2 - C r_1 r_2              0.30                 0.30-0.03=0.27
Abc + aBC    f_4 = 1     0.01    r_1 - C r_1 r_2              0.10                 0.10-0.03=0.07

25 ABC ORDER
Haplotypes   Obs. No.    p_i, C=3.18   p_i, C=1
ABC + abc    f_1 = 69    0.69          0.546
ABc + abC    f_2 = 9     0.09          0.234
AbC + aBc    f_3 = 21    0.21          0.066
Abc + aBC    f_4 = 1     0.01          0.154

26 BAC ORDER
Haplotypes   Obs. No.    p_i, C=0.45   p_i, C=1
ABC + abc    f_1 = 69    0.69          0.702
ABc + abC    f_2 = 9     0.09          0.078
AbC + aBc    f_3 = 21    0.21          0.198
Abc + aBC    f_4 = 1     0.01          0.022

27 ACB ORDER
Haplotypes   Obs. No.    p_i, C=3.00   p_i, C=1
ABC + abc    f_1 = 69    0.69          0.63
ABc + abC    f_2 = 9     0.09          0.03
AbC + aBc    f_3 = 21    0.21          0.27
Abc + aBC    f_4 = 1     0.01          0.07

28 Likelihood method
        Unconstrained model             Constrained model (C = 1)
Order   C      Likelihood   LOD         Likelihood   LOD
ABC     3.18   -36.764      23.441      -49.413      10.793
BAC     0.45   -36.764      23.441      -37.001      23.204
ACB     3.00   -36.764      23.441      -40.648      19.558
The B-A-C order gives the highest likelihood and LOD under a no-interference (C = 1) model
Most multipoint ML mapping algorithms use no-interference models
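
The table can be reproduced with a short calculation. The sketch below makes two assumptions not stated on the slides: that "Likelihood" is the base-10 log of the multinomial kernel, the sum of f_i log10(p_i) over the four pooled classes, and that the LOD compares it with free recombination (p_i = 0.25 for every class). Under those assumptions the results agree with the tabulated values to within rounding. All function names are illustrative.

```python
from math import log10

# Pooled haplotype counts from the slides, keyed by the locus whose allele
# differs from the other two ("" = parental classes ABC + abc).
counts = {"": 69, "C": 9, "B": 21, "A": 1}
r = {frozenset("AB"): 0.22, frozenset("BC"): 0.30, frozenset("AC"): 0.10}
n = sum(counts.values())


def interval_r(order):
    """Recombination fractions of interval 1 and interval 2 for a given order."""
    left, mid, right = order
    return r[frozenset((left, mid))], r[frozenset((mid, right))]


def estimated_c(order):
    """Observed / expected double crossovers; the middle locus marks the DC class."""
    r1, r2 = interval_r(order)
    return counts[order[1]] / (r1 * r2 * n)


def class_probs(order, c=1.0):
    """Expected pooled-class frequencies for an order and coincidence c."""
    left, mid, right = order
    r1, r2 = interval_r(order)
    dc = c * r1 * r2
    return {"": 1 - r1 - r2 + dc, left: r1 - dc, right: r2 - dc, mid: dc}


def log10_likelihood(order, c=1.0):
    """Assumed form: sum of f_i * log10(p_i) over the pooled classes."""
    p = class_probs(order, c)
    return sum(f * log10(p[k]) for k, f in counts.items())


def lod(order, c=1.0):
    """Assumed form: log10 likelihood relative to p_i = 0.25 for all classes."""
    return log10_likelihood(order, c) - n * log10(0.25)


for order in ("ABC", "BAC", "ACB"):
    print(order, round(estimated_c(order), 2),
          round(log10_likelihood(order), 3), round(lod(order), 3))

best_order = max(("ABC", "BAC", "ACB"), key=log10_likelihood)
```

With C estimated freely, every order fits the four observed frequencies exactly, which is why the unconstrained columns are identical across orders and only the C = 1 model discriminates between them.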

29 Ordering Loci
GMENDEL (Liu and Knapp 1990) minimizes SARF (Sum of Adjacent Recombination Frequencies).
PGRI (Lu and Liu 1995) minimizes SARF or maximizes the likelihood.
RECORD (Van Os et al. 2005) minimizes COUNT (Count of Crossover Events).

30 Ordering Loci
JoinMap 4 (Van Ooijen, 2005)
–finds the least-squares locus order using a stepwise search (regression)
–Monte Carlo maximum likelihood (ML): very fast computation of high-density maps

