Published by Scot Burns. Modified over 9 years ago.
1
Grouping loci
Criteria
Maximum two-point recombination fraction
– Example: r_{ij} ≤ 0.40
Minimum LOD score, Z_{ij}
– For n loci, there are n(n-1)/2 possible pairwise combinations that will be tested
– Expect some probability of false positives
Significant probability value, p_{ij}
– Example: p_{ij} ≤ 0.00001
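The grouping criteria can be sketched in code. This is only an illustration, not the method of any particular mapping package: it assumes single-linkage grouping (two loci share a group if a chain of passing pairs connects them), a minimum LOD of 3.0 (the slide gives no value), and hypothetical pairwise estimates for four loci A, B, C, D.

```python
from itertools import combinations


def group_loci(loci, r, lod, max_r=0.40, min_lod=3.0):
    """Group loci by single linkage on the pairwise tests.

    max_r follows the slide's example; min_lod=3.0 is an assumed threshold.
    Returns the sorted groups and the number of pairs tested.
    """
    parent = {locus: locus for locus in loci}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            x = parent[x]
        return x

    n_tests = 0
    for a, b in combinations(loci, 2):
        n_tests += 1
        pair = frozenset((a, b))
        if r[pair] <= max_r and lod[pair] >= min_lod:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for locus in loci:
        groups.setdefault(find(locus), []).append(locus)
    return sorted(sorted(g) for g in groups.values()), n_tests


# Hypothetical two-point estimates; D is unlinked to the other loci.
r = {frozenset("AB"): 0.20, frozenset("AC"): 0.25, frozenset("BC"): 0.42,
     frozenset("AD"): 0.50, frozenset("BD"): 0.48, frozenset("CD"): 0.49}
lod = {frozenset("AB"): 8.4, frozenset("AC"): 5.9, frozenset("BC"): 0.6,
       frozenset("AD"): 0.0, frozenset("BD"): 0.0, frozenset("CD"): 0.0}

groups, n_tests = group_loci("ABCD", r, lod)
print(groups, n_tests)
```

In this made-up example the pair B-C fails both thresholds, yet B and C still end up in one group because each is linked to A. The number of pairs tested is n(n-1)/2 = 6 for n = 4.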
2
Locus ordering
Ideally, we would estimate the likelihoods for all possible orders and take the most probable one by comparing log likelihoods.
That is computationally impractical when there are more than ~10 loci.
Several methods have been proposed for producing a preliminary order.
3
Locus ordering
Number of orders among k loci: k!/2
Number of triplets among k loci: k!/(3!(k-3)!)

No. of loci (k)   Possible orders   No. of triplets
2                 1                 0
3                 3                 1
5                 60                10
10                1,814,400         120
20                1.22 × 10^18      1,140
40                4.08 × 10^47      9,880
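The table values can be reproduced with a few lines. This is a minimal check, assuming an order and its mirror image count as one order (k!/2) and that triplets are unordered (k choose 3); the function names are illustrative.

```python
from math import comb, factorial


def n_orders(k):
    """Distinct orders of k loci, counting an order and its mirror once."""
    return factorial(k) // 2


def n_triplets(k):
    """Number of unordered locus triplets among k loci."""
    return comb(k, 3)


for k in (2, 3, 5, 10, 20, 40):
    print(f"k={k:>2}  orders={n_orders(k):.3g}  triplets={n_triplets(k)}")
```

The number of triplets grows only as k^3 while the number of orders grows factorially, which is why three-point methods are attractive for building a preliminary order.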
4
Three-point Analysis
Number of unique orders among k loci: k!/2
For three loci (k = 3): 3!/2 = 3

Order   Mirror order
ABC     CBA
ACB     BCA
BAC     CAB
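The order/mirror pairing can be enumerated directly. A small sketch, assuming loci are given as a sequence of names and that a permutation is skipped when its reverse has already been kept:

```python
from itertools import permutations


def unique_orders(loci):
    """List locus orders, keeping only one of each order/mirror pair."""
    kept = []
    for perm in permutations(loci):
        if perm[::-1] not in kept:
            kept.append(perm)
    return ["".join(p) for p in kept]


orders = unique_orders("ABC")
print(orders)
```

This brute-force listing is only practical for a handful of loci, since the list itself has k!/2 entries.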
5
Three-point analysis
7
Non-Additivity of recombination frequencies
[Diagram: loci A, B, C in that order; r_{AB} spans A-B, r_{BC} spans B-C, r_{AC} spans A-C]
The recombination frequency over the interval A-C (r_{AC}) is less than the sum of r_{AB} and r_{BC}: r_{AC} < r_{AB} + r_{BC}.
This is because (rare) double recombination events (a recombination in both A-B and B-C) do not contribute to recombination between A and C.
8
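Non-Additivity of recombination frequencies
[Diagram: four A-B-C chromosomes, one for each combination of recombination (1) or no recombination (0) in intervals A-B and B-C]
P_{00} = (1 - r_{AB})(1 - r_{BC})
P_{10} = r_{AB}(1 - r_{BC})
P_{01} = (1 - r_{AB}) r_{BC}
P_{11} = r_{AB} r_{BC}
r_{AC} = P_{10} + P_{01} = r_{AB}(1 - r_{BC}) + (1 - r_{AB}) r_{BC}
r_{AC} = r_{AB} + r_{BC} - 2 r_{AB} r_{BC}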
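A short numeric check of the derivation, assuming independent recombination in the two intervals. The values 0.1 and 0.2 are illustrative only and do not come from the slides.

```python
def two_interval_probabilities(r_ab, r_bc):
    """Probabilities of the four recombination patterns over A-B and B-C."""
    p00 = (1 - r_ab) * (1 - r_bc)  # no recombination in either interval
    p10 = r_ab * (1 - r_bc)        # recombination in A-B only
    p01 = (1 - r_ab) * r_bc        # recombination in B-C only
    p11 = r_ab * r_bc              # double recombination
    return p00, p10, p01, p11


def r_ac_independent(r_ab, r_bc):
    """A and C recombine only when exactly one interval recombines."""
    _, p10, p01, _ = two_interval_probabilities(r_ab, r_bc)
    return p10 + p01


r_ab, r_bc = 0.1, 0.2  # illustrative values
r_ac = r_ac_independent(r_ab, r_bc)
print(r_ac, r_ab + r_bc)
```

The gap between r_{AB} + r_{BC} and r_{AC} is exactly twice the double-recombinant probability P_{11}.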
9
Interference
Interference means that recombination events in adjacent intervals are not independent: the occurrence of an event in a given interval may reduce or enhance the occurrence of an event in its neighbourhood.
Positive interference refers to the 'suppression' of recombination events in the neighbourhood of a given one.
Negative interference refers to the opposite: enhancement of clusters of recombination events.
Positive interference results in fewer double recombinants (over adjacent intervals) than expected on the basis of independence of recombination events.
r_{AC} = r_{AB} + r_{BC} - 2C r_{AB} r_{BC}, where C is the coefficient of coincidence
10
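Interference
[Diagram: parental chromosomes A B C and a b c]
C = coefficient of coincidence
Interference: I = 1 - C
Coefficient of coincidence: C = (observed number of double crossovers) / (expected number of double crossovers)
Expected number of double crossovers = r_{AB} r_{BC} N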
11
DH population, N = 100, locus order ABC
Observed counts: 22, 24, 16, 14, 8, 10, 2, 4
12
Interference
No interference
– C = 1 and Interference = 1 - C = 0
Complete interference
– C = 0 and Interference = 1 - C = 1
Negative interference
– C > 1 and Interference = 1 - C < 0
Positive interference
– C < 1 and Interference = 1 - C > 0
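The coefficient of coincidence and the four cases can be sketched as follows. The inputs are taken from the three-locus DH example later in this deck (N = 100, r_{AB} = 0.22, r_{BC} = 0.30, 21 double recombinants under the ABC order); the function names and the tolerance used to compare floats are assumptions.

```python
def coefficient_of_coincidence(observed_dc, r_1, r_2, n):
    """Observed double crossovers divided by the number expected under independence."""
    expected_dc = r_1 * r_2 * n
    return observed_dc / expected_dc


def describe_interference(c, tol=1e-9):
    """Map a coefficient of coincidence to one of the four cases on the slide."""
    if abs(c - 1) <= tol:
        return "no interference"
    if abs(c) <= tol:
        return "complete interference"
    return "negative interference" if c > 1 else "positive interference"


c = coefficient_of_coincidence(observed_dc=21, r_1=0.22, r_2=0.30, n=100)
interference = 1 - c
print(round(c, 2), round(interference, 2), describe_interference(c))
```

A coefficient well above 1 for a candidate order, as here, can also be a hint that the order itself is wrong, which is the idea used later when the orders are compared.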
13
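Three locus analysis, DH population
For the ABC locus order, with r_1 the recombination fraction in interval A-B, r_2 the recombination fraction in interval B-C, and C the coefficient of coincidence.
Classes: NR = non-recombinant, SC1 = single crossover in interval 1, SC2 = single crossover in interval 2, DC12 = double crossover.

                                    Expected frequency
Genotype   Observed count   Class   Without interference        With interference
ABC/ABC    f_1              NR      0.5 (1 - r_1)(1 - r_2)      0.5 (1 - r_1 - r_2 + C r_1 r_2)
ABc/ABc    f_2              SC2     0.5 (1 - r_1) r_2           0.5 (r_2 - C r_1 r_2)
AbC/AbC    f_3              DC12    0.5 r_1 r_2                 0.5 C r_1 r_2
Abc/Abc    f_4              SC1     0.5 r_1 (1 - r_2)           0.5 (r_1 - C r_1 r_2)
aBC/aBC    f_5              SC1     0.5 r_1 (1 - r_2)           0.5 (r_1 - C r_1 r_2)
aBc/aBc    f_6              DC12    0.5 r_1 r_2                 0.5 C r_1 r_2
abC/abC    f_7              SC2     0.5 (1 - r_1) r_2           0.5 (r_2 - C r_1 r_2)
abc/abc    f_8              NR      0.5 (1 - r_1)(1 - r_2)      0.5 (1 - r_1 - r_2 + C r_1 r_2)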
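The expected frequencies for the ABC order can be written as one function. This sketch assumes that the two complementary genotypes in each class are equally frequent (hence the factor 0.5) and that C = 1 corresponds to no interference; the function name and class labels are illustrative.

```python
def dh_expected_frequencies(r_1, r_2, c=1.0):
    """Expected frequencies of the eight DH genotypes for locus order ABC."""
    dc = c * r_1 * r_2  # double-crossover class frequency
    class_freq = {
        "NR": 1 - r_1 - r_2 + dc,  # no crossover
        "SC1": r_1 - dc,           # single crossover in A-B
        "SC2": r_2 - dc,           # single crossover in B-C
        "DC12": dc,                # crossover in both intervals
    }
    genotype_class = {
        "ABC/ABC": "NR", "ABc/ABc": "SC2", "AbC/AbC": "DC12", "Abc/Abc": "SC1",
        "aBC/aBC": "SC1", "aBc/aBc": "DC12", "abC/abC": "SC2", "abc/abc": "NR",
    }
    # Each class is split equally between its two complementary genotypes.
    return {g: 0.5 * class_freq[k] for g, k in genotype_class.items()}


freqs = dh_expected_frequencies(0.22, 0.30)  # r values from the later example
for genotype, p in freqs.items():
    print(genotype, round(p, 4))
```

With c = 1 the "with interference" expressions reduce to the "without interference" products, for example 1 - r_1 - r_2 + r_1 r_2 = (1 - r_1)(1 - r_2).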
14
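MLE of two-locus recombination fractions
For the ABC locus order (r_1 = r_{AB}, r_2 = r_{BC})

Genotype   Observed count   Expected frequency
ABC/ABC    f_1 = 34         0.5 (1 - r_1 - r_2 + C r_1 r_2)
ABc/ABc    f_2 = 5          0.5 (r_2 - C r_1 r_2)
AbC/AbC    f_3 = 11         0.5 C r_1 r_2
Abc/Abc    f_4 = 0          0.5 (r_1 - C r_1 r_2)
aBC/aBC    f_5 = 1          0.5 (r_1 - C r_1 r_2)
aBc/aBc    f_6 = 10         0.5 C r_1 r_2
abC/abC    f_7 = 4          0.5 (r_2 - C r_1 r_2)
abc/abc    f_8 = 35         0.5 (1 - r_1 - r_2 + C r_1 r_2)

Regardless of locus order, the MLEs of r are
r_{AB} = (f_3 + f_4 + f_5 + f_6)/N = 22/100 = 0.22
r_{BC} = (f_2 + f_3 + f_6 + f_7)/N = 30/100 = 0.30
r_{AC} = (f_2 + f_4 + f_5 + f_7)/N = 10/100 = 0.10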
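The two-locus estimates can be computed by counting recombinant lines for each pair of loci. A minimal sketch, assuming the parental haplotypes are ABC and abc (so a line is recombinant for a pair when its two alleles differ in case) and using the observed counts from the table:

```python
# Observed DH counts, keyed by haplotype (from the table above).
counts = {"ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
          "aBC": 1, "aBc": 10, "abC": 4, "abc": 35}


def recombination_fraction(counts, i, j):
    """Proportion of lines recombinant between the loci at positions i and j."""
    n = sum(counts.values())
    recombinant = sum(c for hap, c in counts.items()
                      if hap[i].isupper() != hap[j].isupper())
    return recombinant / n


r_ab = recombination_fraction(counts, 0, 1)
r_bc = recombination_fraction(counts, 1, 2)
r_ac = recombination_fraction(counts, 0, 2)
print(r_ab, r_bc, r_ac)
```

Because each estimate uses only the two loci involved, none of them depends on the assumed position of the third locus.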
15
Ordering Loci by Minimizing Double Crossovers

Genotype   Observed count
ABC/ABC    f_1 = 34
ABc/ABc    f_2 = 5
AbC/AbC    f_3 = 11
Abc/Abc    f_4 = 0
aBC/aBC    f_5 = 1
aBc/aBc    f_6 = 10
abC/abC    f_7 = 4
abc/abc    f_8 = 35

Pooled genotypes   Observed count
ABC + abc          f_1 + f_8 = 34 + 35 = 69
ABc + abC          f_2 + f_7 = 5 + 4 = 9
AbC + aBc          f_3 + f_6 = 11 + 10 = 21
Abc + aBC          f_4 + f_5 = 0 + 1 = 1

Rarest genotypes are double recombinants.
The rarest class (Abc + aBC) differs from the parental types only at locus A, so A is the middle locus.
[Diagram: parental chromosomes B A C and b a c; crossovers (X) on both sides of A give B a C and b A c]
The order of loci is BAC.
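The reasoning on this slide can be sketched as a small routine: pool complementary haplotypes by the locus whose allele differs from the other two, then take the rarest recombinant class as the double recombinants. It assumes parental haplotypes ABC and abc and exactly three loci; the function names are illustrative.

```python
counts = {"ABC": 34, "ABc": 5, "AbC": 11, "Abc": 0,
          "aBC": 1, "aBc": 10, "abC": 4, "abc": 35}


def odd_locus(haplotype):
    """Locus whose allele case differs from the other two (None if parental)."""
    upper = [ch.isupper() for ch in haplotype]
    for i in range(3):
        others = [upper[j] for j in range(3) if j != i]
        if others[0] == others[1] and others[0] != upper[i]:
            return haplotype[i].upper()
    return None


def order_by_double_crossovers(counts):
    """Place the odd locus of the rarest recombinant class in the middle."""
    pooled = {}
    for hap, c in counts.items():
        locus = odd_locus(hap)
        if locus is not None:
            pooled[locus] = pooled.get(locus, 0) + c
    middle = min(pooled, key=pooled.get)
    ends = [locus for locus in "ABC" if locus != middle]
    return ends[0] + middle + ends[1], pooled


order, pooled = order_by_double_crossovers(counts)
print(order, pooled)
```

A double crossover flips only the middle locus relative to the parental types, which is why the odd locus of the rarest class identifies the middle position.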
16
Ordering Loci by using recombination fractions
MLEs of r are r_{AB} = 0.22, r_{BC} = 0.30, r_{AC} = 0.10
Largest r is r_{BC} = 0.3, so B and C are the outer loci
Smallest r is r_{AC} = 0.1, so A lies closest to C
Order: B A C
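For three loci the same rule fits in a few lines. A sketch assuming the pair with the largest recombination fraction marks the two outer loci, using the estimates 0.22, 0.30, and 0.10 from this example:

```python
r = {("A", "B"): 0.22, ("B", "C"): 0.30, ("A", "C"): 0.10}

# The most loosely linked pair sits at the two ends of the map.
ends = max(r, key=r.get)
middle = (set("ABC") - set(ends)).pop()
order = ends[0] + middle + ends[1]
print(order)
```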
17
Minimum Sum of Adjacent Recombination Frequencies (SARF) (Falk 1989)
SARF = \sum_{i=1}^{l-1} r_{a_i a_j}, with j = i + 1
r_{a_i a_j} = recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l

Order   SARF
ABC     0.22 + 0.30 = 0.52
BAC     0.22 + 0.10 = 0.32
ACB     0.10 + 0.30 = 0.40

The B-A-C order gives MIN[SARF] and the minimum distance (MD) map.
Simulations have shown that SARF is a reliable method to obtain marker orders for large datasets.
18
Minimum Product of Adjacent Recombination Frequencies (PARF) (Wilson 1988)
PARF = \prod_{i=1}^{l-1} r_{a_i a_j}, with j = i + 1
r_{a_i a_j} = recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l

Order   PARF
ABC     0.22 x 0.30 = 0.066
BAC     0.22 x 0.10 = 0.022
ACB     0.10 x 0.30 = 0.030

The B-A-C order gives MIN[PARF] and the minimum distance (MD) map.
SARF and PARF are equivalent methods to obtain marker orders for large datasets.
19
Maximum Sum of Adjacent LOD Scores (SALOD)
SALOD = \sum_{i=1}^{l-1} Z_{a_i a_j}, with j = i + 1
Z_{a_i a_j} = LOD score for recombination frequency between adjacent loci a_i and a_j for a given order: 1, 2, 3, …, l-1, l

Order   SALOD
ABC     3.135 + 1.551 = 4.686
BAC     3.135 + 6.942 = 10.077
ACB     6.942 + 1.551 = 8.493

The B-A-C order gives MAX[SALOD].
SALOD is sensitive to locus informativeness.
20
Minimum Count of Crossover Events (COUNT) (Van Os et al. 2005)
COUNT = \sum_{i=1}^{l-1} X_{a_i a_j}, with j = i + 1
X_{a_i a_j} = simple count of recombination events between adjacent loci a_i and a_j for a given sequence: 1, 2, 3, …, l-1, l

Order   COUNT
ABC     22 + 30 = 52
BAC     22 + 10 = 32
ACB     10 + 30 = 40

The B-A-C order gives MIN[COUNT].
COUNT is equivalent to SARF and PARF with perfect data.
COUNT is superior to SARF with incomplete data.
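The four criteria (SARF, PARF, SALOD, COUNT) differ only in which pairwise quantity is combined over adjacent loci and whether the result is minimized or maximized. The sketch below scores the three candidate orders with the pairwise values quoted on the slides; the function and variable names are illustrative, and the LOD scores are taken as given rather than recomputed.

```python
from math import prod

# Pairwise values from the slides, keyed by unordered locus pair.
R = {frozenset("AB"): 0.22, frozenset("BC"): 0.30, frozenset("AC"): 0.10}
LOD = {frozenset("AB"): 3.135, frozenset("BC"): 1.551, frozenset("AC"): 6.942}
COUNT = {frozenset("AB"): 22, frozenset("BC"): 30, frozenset("AC"): 10}


def adjacent(order, table):
    """Values of `table` for each pair of neighbouring loci in `order`."""
    return [table[frozenset(pair)] for pair in zip(order, order[1:])]


def score(order):
    """All four ordering criteria for one candidate order."""
    return {
        "SARF": sum(adjacent(order, R)),
        "PARF": prod(adjacent(order, R)),
        "SALOD": sum(adjacent(order, LOD)),
        "COUNT": sum(adjacent(order, COUNT)),
    }


orders = ["ABC", "BAC", "ACB"]
scores = {order: score(order) for order in orders}
best = {
    "SARF": min(orders, key=lambda o: scores[o]["SARF"]),
    "PARF": min(orders, key=lambda o: scores[o]["PARF"]),
    "SALOD": max(orders, key=lambda o: scores[o]["SALOD"]),
    "COUNT": min(orders, key=lambda o: scores[o]["COUNT"]),
}
print(best)
```

For more than a few loci the list of candidate orders cannot be enumerated exhaustively, so these criteria are normally paired with a search heuristic.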
21
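Locus Order - Likelihood Approach
log L = \sum_{i=1}^{k} f_i log p_i (up to a constant that does not depend on the parameters)
r_1 = recombination fraction in interval 1
r_2 = recombination fraction in interval 2
C = coefficient of coincidence
f_i = observed count of the i-th pooled phenotypic class
p_i = expected frequency of the i-th pooled phenotypic class, a function of r_1, r_2, and C (unconstrained estimate: p_i = f_i / n)
i = 1, 2, …, k
k = no. of pooled phenotypic classes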
24
ABC ORDER (C = 3.18)
Haplotypes   Obs. No.   Freq.   Exp. freq.                               Exp. freq., C = 0         Exp. freq., C = 1
ABC + abc    f_1 = 69   0.69    1 - r_{AB} - r_{BC} + C r_{AB} r_{BC}    1 - 0.22 - 0.30 = 0.48    1 - 0.22 - 0.30 + 0.066 = 0.546
ABc + abC    f_2 = 9    0.09    r_{BC} - C r_{AB} r_{BC}                 0.30                      0.30 - 0.066 = 0.234
AbC + aBc    f_3 = 21   0.21    C r_{AB} r_{BC}                          0.00                      0.066
Abc + aBC    f_4 = 1    0.01    r_{AB} - C r_{AB} r_{BC}                 0.22                      0.22 - 0.066 = 0.154

BAC ORDER (C = 0.45)
Haplotypes   Obs. No.   Freq.   Exp. freq.                               Exp. freq., C = 0         Exp. freq., C = 1
ABC + abc    f_1 = 69   0.69    1 - r_{AB} - r_{AC} + C r_{AB} r_{AC}    1 - 0.22 - 0.10 = 0.68    1 - 0.22 - 0.10 + 0.022 = 0.702
ABc + abC    f_2 = 9    0.09    r_{AC} - C r_{AB} r_{AC}                 0.10                      0.10 - 0.022 = 0.078
AbC + aBc    f_3 = 21   0.21    r_{AB} - C r_{AB} r_{AC}                 0.22                      0.22 - 0.022 = 0.198
Abc + aBC    f_4 = 1    0.01    C r_{AB} r_{AC}                          0.00                      0.022

ACB ORDER (C = 3.00)
Haplotypes   Obs. No.   Freq.   Exp. freq.                               Exp. freq., C = 0         Exp. freq., C = 1
ABC + abc    f_1 = 69   0.69    1 - r_{AC} - r_{BC} + C r_{AC} r_{BC}    1 - 0.10 - 0.30 = 0.60    1 - 0.10 - 0.30 + 0.03 = 0.63
ABc + abC    f_2 = 9    0.09    C r_{AC} r_{BC}                          0.00                      0.03
AbC + aBc    f_3 = 21   0.21    r_{BC} - C r_{AC} r_{BC}                 0.30                      0.30 - 0.03 = 0.27
Abc + aBC    f_4 = 1    0.01    r_{AC} - C r_{AC} r_{BC}                 0.10                      0.10 - 0.03 = 0.07
25
ABC ORDER
Haplotypes   Obs. No.   p_i, C = 3.18   p_i, C = 1
ABC + abc    f_1 = 69   0.69            0.546
ABc + abC    f_2 = 9    0.09            0.234
AbC + aBc    f_3 = 21   0.21            0.066
Abc + aBC    f_4 = 1    0.01            0.154
26
BAC ORDER
Haplotypes   Obs. No.   p_i, C = 0.45   p_i, C = 1
ABC + abc    f_1 = 69   0.69            0.702
ABc + abC    f_2 = 9    0.09            0.078
AbC + aBc    f_3 = 21   0.21            0.198
Abc + aBC    f_4 = 1    0.01            0.022
27
ACB ORDER
Haplotypes   Obs. No.   p_i, C = 3.00   p_i, C = 1
ABC + abc    f_1 = 69   0.69            0.63
ABc + abC    f_2 = 9    0.09            0.03
AbC + aBc    f_3 = 21   0.21            0.27
Abc + aBC    f_4 = 1    0.01            0.07
28
Likelihood method

                Unconstrained model           Constrained model (C = 1)
Order   C       Log10 likelihood   LOD        Log10 likelihood   LOD
ABC     3.18    -36.764            23.441     -49.413            10.793
BAC     0.45    -36.764            23.441     -37.001            23.204
ACB     3.00    -36.764            23.441     -40.648            19.558

The B-A-C order gives the highest likelihood and LOD under a no-interference (C = 1) model.
Most multipoint ML mapping algorithms use no-interference models.
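The tabulated values can be reproduced numerically. This sketch rests on two assumptions that match the table to three decimals but are not stated on the slides: the likelihood is the base-10 multinomial log likelihood without its constant term, and the LOD compares against equal frequencies of 0.25 for the four pooled classes. Pooled classes are keyed by the locus that differs from the parental types ("NR" for the parental class); all names are illustrative.

```python
from math import log10

# Pooled observed counts: ABC+abc, Abc+aBC, AbC+aBc, ABc+abC.
OBSERVED = {"NR": 69, "A": 1, "B": 21, "C": 9}
R = {frozenset("AB"): 0.22, frozenset("BC"): 0.30, frozenset("AC"): 0.10}
N = sum(OBSERVED.values())


def expected(order, c=1.0):
    """Expected pooled-class frequencies for a three-locus order."""
    end1, middle, end2 = order
    r_1 = R[frozenset((end1, middle))]
    r_2 = R[frozenset((middle, end2))]
    dc = c * r_1 * r_2  # the double-crossover class flips the middle locus
    return {"NR": 1 - r_1 - r_2 + dc, end1: r_1 - dc, end2: r_2 - dc, middle: dc}


def log10_likelihood(p):
    """Sum of f_i * log10(p_i) over the pooled classes."""
    return sum(f * log10(p[k]) for k, f in OBSERVED.items())


def lod(p):
    """Log10 likelihood relative to equal class frequencies (assumed null)."""
    return log10_likelihood(p) - N * log10(0.25)


def estimated_c(order):
    """Observed double-crossover frequency over its expectation at C = 1."""
    return (OBSERVED[order[1]] / N) / expected(order)[order[1]]


results = {}
for order in ("ABC", "BAC", "ACB"):
    c_hat = estimated_c(order)
    results[order] = {
        "C": c_hat,
        "unconstrained": log10_likelihood(expected(order, c_hat)),
        "constrained": log10_likelihood(expected(order, 1.0)),
        "lod_constrained": lod(expected(order, 1.0)),
    }
    print(order, {k: round(v, 3) for k, v in results[order].items()})
```

With C estimated freely, every order reproduces the observed frequencies exactly, so the unconstrained likelihood cannot distinguish the orders. Fixing C = 1 is what lets the likelihood separate B-A-C from the other two.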
29
Ordering Loci
GMENDEL (Liu and Knapp 1990) minimizes SARF (sum of adjacent recombination frequencies).
PGRI (Lu and Liu 1995) minimizes SARF or maximizes the likelihood.
RECORD (Van Os et al. 2005) minimizes COUNT (count of crossover events).
30
Ordering Loci
JoinMap 4 (Van Ooijen, 2005)
– Regression mapping: finds the least-squares locus order using a stepwise search
– Monte Carlo maximum likelihood (ML); very fast computation of high-density maps