Joint Linkage and Linkage Disequilibrium Mapping Key Reference Li, Q., and R. L. Wu, 2009 A multilocus model for constructing a linkage disequilibrium.

1 Joint Linkage and Linkage Disequilibrium Mapping Key Reference Li, Q., and R. L. Wu, 2009 A multilocus model for constructing a linkage disequilibrium map in human populations. Statistical Applications in Genetics and Molecular Biology 8 (1): Article 18.

2 Genetic Designs for Mapping
- Controlled crosses: backcross, F2, full-sib family, … (linkage)
- Unrelated (random) individuals from a natural population (linkage disequilibrium)
- Cases and controls from a natural population
- Unrelated (random) families from a natural population (linkage and LD)
- Related (non-random) families from a natural population (linkage, LD and identity by descent)
Family designs are increasingly used for genetic studies because of the rich information they contain.

3 Natural Population
- Consider two SNPs: SNP 1 (with two alleles, A and a) and SNP 2 (with two alleles, B and b)
- The two SNPs are linked, with recombination fraction r
- The two SNPs form four haplotypes: AB, Ab, aB, and ab
- Prob(A) = p, Prob(B) = q, linkage disequilibrium = D
We have haplotype frequencies as
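The frequency table from this slide is not captured in the transcript. Under the usual two-locus parameterization (an assumption here, not a quotation of the slide), the four haplotype frequencies follow from p, q and D as in the sketch below. The function name `haplotype_frequencies` and the example values are hypothetical.

```python
def haplotype_frequencies(p, q, D):
    """Two-SNP haplotype frequencies from P(A) = p, P(B) = q and LD coefficient D.

    Assumed standard parameterization:
        P(AB) = p*q + D            P(Ab) = p*(1-q) - D
        P(aB) = (1-p)*q - D        P(ab) = (1-p)*(1-q) + D
    """
    freqs = {
        "AB": p * q + D,
        "Ab": p * (1 - q) - D,
        "aB": (1 - p) * q - D,
        "ab": (1 - p) * (1 - q) + D,
    }
    # D is only admissible if every haplotype frequency stays non-negative.
    if min(freqs.values()) < 0:
        raise ValueError("D is outside the range allowed by p and q")
    return freqs


# Illustrative values only.
freqs = haplotype_frequencies(p=0.6, q=0.5, D=0.05)
for haplotype, freq in freqs.items():
    print(haplotype, round(freq, 4))
```

The four frequencies always sum to 1, and summing over one SNP recovers the allele frequency of the other (for example, P(AB) + P(Ab) = p).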

4 Diagrammatic Presentation

5 Family Design: family number and size

6 Mating frequencies of families and offspring genotype frequencies per family

7 HWE assumed. Can you figure out where this assumption is needed?

8 Segregation of double heterozygote
Overall haplotype frequencies produced by this parent are calculated as (1/2)ω_1 for AB or ab and (1/2)ω_2 for Ab or aB
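The transcript does not define ω_1 and ω_2. A minimal sketch, assuming the usual mixture form: a double heterozygote AaBb is diplotype AB|ab with probability φ = P(AB)P(ab) / [P(AB)P(ab) + P(Ab)P(aB)] and diplotype Ab|aB otherwise, so that ω_1 = φ(1 - r) + (1 - φ)r and ω_2 = φr + (1 - φ)(1 - r). The symbol φ and the function name are hypothetical; the slide may use different notation.

```python
def double_het_gamete_frequencies(p_AB, p_Ab, p_aB, p_ab, r):
    """Gamete frequencies from an AaBb parent of unknown phase (assumed form).

    phi    : probability that the parent's diplotype is AB|ab (vs. Ab|aB)
    omega1 : probability that a gamete is AB or ab
    omega2 : probability that a gamete is Ab or aB
    """
    cis = p_AB * p_ab      # weight of diplotype AB|ab
    trans = p_Ab * p_aB    # weight of diplotype Ab|aB
    phi = cis / (cis + trans)
    omega1 = phi * (1 - r) + (1 - phi) * r
    omega2 = phi * r + (1 - phi) * (1 - r)
    return {
        "AB": omega1 / 2, "ab": omega1 / 2,
        "Ab": omega2 / 2, "aB": omega2 / 2,
    }


# Illustrative haplotype frequencies and recombination fraction.
gametes = double_het_gamete_frequencies(0.35, 0.25, 0.15, 0.25, r=0.1)
print(gametes)
```

Because ω_1 + ω_2 = 1, the four gamete frequencies sum to 1 for any r and any phase probability.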

9 A Joint Probability
- Mother genotypes (M_m)
- Father genotypes (M_f)
- Offspring genotypes (M_o)
P(M_m, M_f, M_o) = P(M_m, M_f) P(M_o | M_m, M_f) = P(M_m) P(M_f) P(M_o | M_m, M_f)
(the second equality assumes that the parents mate at random)

10 A joint two-stage log-likelihood Let unknown parameters

11 Upper-stage Likelihood

12 EM algorithm for Θ E step M step
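The E- and M-step formulas from this slide are not in the transcript. As an illustration only, here is the standard gene-counting EM for two-SNP haplotype frequencies in unrelated individuals under HWE, which may differ in detail from the update the lecture uses. The function name, the `counts` layout, and the defaults `n_iter` and `tol` are hypothetical.

```python
def em_haplotype_frequencies(counts, n_iter=200, tol=1e-10):
    """Gene-counting EM for two-SNP haplotype frequencies.

    counts[(i, j)] is the number of individuals carrying i copies of
    allele A and j copies of allele B (i, j in {0, 1, 2}).
    Only the double heterozygote (1, 1) has ambiguous phase.
    """
    n = sum(counts.values())
    base = {"AB": 0.0, "Ab": 0.0, "aB": 0.0, "ab": 0.0}
    for (i, j), c in counts.items():
        if (i, j) == (1, 1):
            continue
        # With at most one heterozygous SNP the two haplotypes are known.
        a_alleles = "A" * i + "a" * (2 - i)
        b_alleles = "B" * j + "b" * (2 - j)
        for a, b in zip(a_alleles, b_alleles):
            base[a + b] += c
    n_dh = counts.get((1, 1), 0)

    freqs = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E step: probability that a double heterozygote is AB|ab.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        phi = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M step: expected haplotype counts divided by 2n haplotypes.
        new = {
            "AB": (base["AB"] + n_dh * phi) / (2 * n),
            "ab": (base["ab"] + n_dh * phi) / (2 * n),
            "Ab": (base["Ab"] + n_dh * (1 - phi)) / (2 * n),
            "aB": (base["aB"] + n_dh * (1 - phi)) / (2 * n),
        }
        converged = max(abs(new[h] - freqs[h]) for h in new) < tol
        freqs = new
        if converged:
            break
    return freqs


# Illustrative genotype counts.
print(em_haplotype_frequencies({(2, 2): 30, (1, 1): 40, (0, 0): 30}))
```

Allele frequencies and D can then be read off the estimates, for example p = P(AB) + P(Ab) and D = P(AB)P(ab) - P(Ab)P(aB).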

13

14 Lower-stage Likelihood

15 EM algorithm for r E step - calculate the probability with which a considered haplotype produced by a double heterozygote parent is the recombinant type using
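The formula that followed "using" is not in the transcript. A hedged sketch of what Bayes' rule gives, assuming a double heterozygote parent is diplotype AB|ab with probability φ and Ab|aB otherwise (φ and the function name are hypothetical, not taken from the slide):

```python
def recombinant_posterior(phi, r):
    """Posterior probability that a gamete from an AaBb parent is recombinant.

    phi : probability that the parent's diplotype is AB|ab
    r   : current estimate of the recombination fraction (0 < r <= 0.5)

    A gamete AB or ab is recombinant only if the parent is Ab|aB;
    a gamete Ab or aB is recombinant only if the parent is AB|ab.
    """
    omega1 = phi * (1 - r) + (1 - phi) * r   # P(gamete is AB or ab)
    omega2 = phi * r + (1 - phi) * (1 - r)   # P(gamete is Ab or aB)
    return {
        "AB_or_ab": (1 - phi) * r / omega1,
        "Ab_or_aB": phi * r / omega2,
    }


# Illustrative values only.
print(recombinant_posterior(phi=0.7, r=0.1))
```

In an EM iteration, these posteriors would weight the observed gametes to give the expected number of recombinants, which the M step divides by the number of informative gametes.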

16 E step (cont’d) Calculate the probability with which a double heterozygote offspring carries recombinant haplotypes by

17 M step where m equals the sum of the following terms:

18

19 Hypothesis tests: Linkage and linkage disequilibrium
H0: r = 0.5 and D = 0
H1: At least one equality does not hold
LR = -2(log L0 - log L1)
Critical threshold: χ² (df = 2)
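As a small worked example of the test on this slide, the sketch below computes the LR statistic and its p-value from two maximized log-likelihoods. It uses the closed form of the chi-square survival function for 2 degrees of freedom, exp(-x/2). The function name and the log-likelihood values are hypothetical.

```python
import math


def lr_test_df2(loglik_null, loglik_alt):
    """Likelihood ratio test against a chi-square distribution with df = 2.

    For df = 2 the chi-square survival function is exactly exp(-x / 2),
    so no statistics library is needed.
    """
    lr = -2.0 * (loglik_null - loglik_alt)
    p_value = math.exp(-lr / 2.0)
    return lr, p_value


# Critical threshold at the 5% level for df = 2: solve exp(-x/2) = 0.05.
critical_5pct = -2.0 * math.log(0.05)

# Made-up log-likelihoods, for illustration only.
lr, p_value = lr_test_df2(loglik_null=-1050.0, loglik_alt=-1040.0)
print(lr, p_value, lr > critical_5pct)
```

H0 is rejected when the LR exceeds the critical threshold, which is about 5.99 at the 5% level.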

20 Hypothesis tests Sex-specific difference in population structure

21 Hypothesis test Sex-specific difference in the recombination fraction

22 Simulation

23

24 Power

25 Conclusions
The model can jointly estimate the linkage and linkage disequilibrium between two markers
- LD from parents
- Linkage from offspring
The model can be used to draw an LD map for studying the evolution of populations and for high-resolution mapping of traits

26 Three-locus Analysis: Marker segregation in a natural population
Three markers produce eight haplotypes: ABC, ABc, AbC, Abc, aBC, aBc, abC, and abc.
Haplotype frequencies are expressed in terms of the allele frequencies
P(A) = p, P(a) = 1 - p
P(B) = q, P(b) = 1 - q
P(C) = r, P(c) = 1 - r
and the linkage disequilibria
D_AB = LD between markers A and B
D_BC = LD between markers B and C
D_AC = LD between markers A and C
D_ABC = LD among markers A, B, and C
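The eight haplotype frequencies themselves are not written out in the transcript. One common parameterization (assumed here; the lecture may use a different one) is P(ABC) = pqr + p D_BC + q D_AC + r D_AB + D_ABC, with sign changes for haplotypes carrying lowercase alleles. The function name and example values below are hypothetical.

```python
from itertools import product


def three_locus_haplotype_frequencies(p, q, r, D_AB, D_BC, D_AC, D_ABC):
    """Eight haplotype frequencies from allele frequencies and LD terms.

    Assumed parameterization: each LD term enters with sign +1 when the
    alleles it involves are both (or all) uppercase or an even number of
    them are lowercase, and -1 otherwise.
    """
    allele_freq = {"A": p, "a": 1 - p, "B": q, "b": 1 - q, "C": r, "c": 1 - r}

    def sign(allele):
        return 1 if allele.isupper() else -1

    freqs = {}
    for a, b, c in product("Aa", "Bb", "Cc"):
        fa, fb, fc = allele_freq[a], allele_freq[b], allele_freq[c]
        sa, sb, sc = sign(a), sign(b), sign(c)
        freqs[a + b + c] = (
            fa * fb * fc
            + sb * sc * fa * D_BC
            + sa * sc * fb * D_AC
            + sa * sb * fc * D_AB
            + sa * sb * sc * D_ABC
        )
    return freqs


# Illustrative values only.
freqs = three_locus_haplotype_frequencies(0.6, 0.5, 0.4, 0.05, 0.03, 0.02, 0.01)
print(freqs)
```

Summing over the third marker recovers the two-locus form, for example P(ABC) + P(ABc) = pq + D_AB.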

27

28 Three-locus Analysis: Marker segregation in a family
Consider a triple heterozygote AaBbCc. AaBbCc produces 8 types of gametes (haplotypes), which are classified into four groups:
Haplotypes     Recombinants between   Recombinants between   Frequency
               A and B                B and C
ABC and abc    0                      0                      g_00
ABc and abC    0                      1                      g_01
aBC and Abc    1                      0                      g_10
AbC and aBc    1                      1                      g_11

29 Matrix notation
                     Markers B and C
Markers A and B      Recombinant   Non-recombinant   Total
Recombinant          g_11          g_10              r_AB
Non-recombinant      g_01          g_00              1 - r_AB
Total                r_BC          1 - r_BC          1
What is the recombination fraction between A and C? r_AC = g_01 + g_10
Thus, we have
r_AB = g_11 + g_10
r_BC = g_11 + g_01
r_AC = g_01 + g_10
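The three relations on this slide can be checked numerically with a short sketch. The function name and the example g values are hypothetical.

```python
def recombination_fractions(g00, g01, g10, g11):
    """Pairwise recombination fractions from the four gamete-class frequencies.

    g_xy: x = 1 if A-B is recombinant, y = 1 if B-C is recombinant.
    A and C are recombinant when exactly one of the two intervals is.
    """
    total = g00 + g01 + g10 + g11
    if abs(total - 1.0) > 1e-9:
        raise ValueError("g00 + g01 + g10 + g11 must equal 1")
    return {
        "r_AB": g11 + g10,
        "r_BC": g11 + g01,
        "r_AC": g01 + g10,
    }


# Illustrative values only.
print(recombination_fractions(g00=0.72, g01=0.13, g10=0.14, g11=0.01))
```

A double recombinant (class g_11) restores the parental combination of A and C, which is why g_11 does not contribute to r_AC.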

30 Triple heterozygote may have four possible diplotypes, each producing eight haplotypes with frequencies given below:

31 AaBbCC may have two possible diplotypes, each producing four haplotypes with frequencies given below:
How about AaBbcc, AaBBCc, AabbCc, AABbCc, and aaBbCc?
How about AaBBCC and other genotypes with only one heterozygous marker?

32 Study design

33 For a parent with the triple heterozygous genotype AaBbCc, there are four possible diplotypes, ABC|abc, ABc|abC, AbC|aBc, or Abc|aBC, whose relative frequencies in the natural population are
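The expressions for these relative frequencies are not in the transcript. A minimal sketch, assuming haplotypes unite at random (HWE), so that each diplotype is weighted by the product of its two haplotype frequencies and the weights are normalized over the four phases. The function name and the input layout are hypothetical.

```python
def triple_het_diplotype_probs(hap_freq):
    """Relative frequencies of the four diplotypes of an AaBbCc parent.

    hap_freq maps each of the eight haplotypes (e.g. "ABC") to its
    population frequency. Under random union of haplotypes, a diplotype's
    weight is the product of its two haplotype frequencies.
    """
    diplotypes = [("ABC", "abc"), ("ABc", "abC"), ("AbC", "aBc"), ("Abc", "aBC")]
    weights = {f"{h1}|{h2}": hap_freq[h1] * hap_freq[h2] for h1, h2 in diplotypes}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


# Illustrative haplotype frequencies (they sum to 1).
hap_freq = {
    "ABC": 0.30, "ABc": 0.10, "AbC": 0.05, "Abc": 0.15,
    "aBC": 0.05, "aBc": 0.05, "abC": 0.10, "abc": 0.20,
}
print(triple_het_diplotype_probs(hap_freq))
```

Each diplotype pairs a haplotype with its complement, so all four are consistent with the observed genotype AaBbCc.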

34 These diplotypes will produce haplotypes ABC, ABc, AbC, Abc, aBC, aBc, abC, and abc, with the frequencies:

35

36 For a parent with a double heterozygous genotype, the possible diplotypes and their corresponding relative frequencies are listed here:

37 Let
Note: the theta's are the recombination fractions

38 Upper-level Likelihood

39 EM algorithm E step: calculate the probability with which a double heterozygote parent carries a particular diplotype and a triple heterozygote parent carries a particular diplotype M step: estimate haplotype frequencies by

40 Lower-level likelihood

41 EM algorithm In the E step: The probabilities with which a considered haplotype produced by a double heterozygote or triple heterozygote parent is the recombinant type are calculated. In the M step: The estimates of crossover probabilities g's are obtained. Very complex – omitted here.

42 Simulation

43 Conclusions
- Three-point analysis provides estimates of high-order LD and of the pairwise linkage, which helps to model genetic interference: r_AC = r_AB + r_BC - 2c r_AB r_BC, where c is the coefficient of coincidence, a measure of genetic interference
- Three-point analysis can estimate linkage and linkage disequilibria as precisely as two-point analysis, although the former requires more parameters to be estimated
- Three-point analysis can estimate the linkage even when two markers are not associated (LD = 0)
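To make the interference relation concrete, the sketch below recovers the coefficient of coincidence from the gamete-class frequencies and checks the identity for r_AC. It takes c as the observed double-recombinant frequency g_11 divided by the frequency expected without interference, r_AB r_BC. The function name and the example values are hypothetical.

```python
def coincidence_from_gametes(g01, g10, g11):
    """Coefficient of coincidence c and the pairwise recombination fractions.

    c = g11 / (r_AB * r_BC): c = 1 means no interference,
    c = 0 means complete interference (no double recombinants).
    """
    r_AB = g11 + g10
    r_BC = g11 + g01
    r_AC = g01 + g10
    c = g11 / (r_AB * r_BC)
    return c, r_AB, r_BC, r_AC


# Illustrative values only.
c, r_AB, r_BC, r_AC = coincidence_from_gametes(g01=0.13, g10=0.14, g11=0.01)

# The identity r_AC = r_AB + r_BC - 2 c r_AB r_BC holds by construction.
rhs = r_AB + r_BC - 2 * c * r_AB * r_BC
print(c, r_AC, rhs)
```

The identity follows because r_AB + r_BC counts the double recombinants twice, while r_AC does not count them at all.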

44 Quantitative Genetic Analysis
We now consider the genetic effects of haplotypes on a complex phenotype

45 Study Design

46 Notation

47 Unifying Likelihood

48 The first part This can be estimated by the algorithm developed before

49 The second part

50 Risk Haplotype

51 Genetic effects

52

53

54

55 EM algorithm

56 M step

57 Hypothesis tests

58 Model selection

59 Simulation

60 Simulation with three markers

61 Power

62 Part of this lecture comes from Dr. Qin Li’s dissertation.

