Introduction to QTL analysis Peter Visscher University of Edinburgh

1 Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk

2 Overview Principles of QTL mapping QTL mapping using sibpairs IBD estimation from marker data Improving power –ML variance components –Selective genotyping –Large(r) pedigrees

3 Quantitative Trait Locus = a segment of DNA that affects a quantitative trait [Fisher, Wright]

4 Mapping QTL Determining the position of a locus causing variation in the genome. Estimating the effect of the alleles and mode of action.

5 Why map QTL ? To provide knowledge towards a fundamental understanding of individual gene actions and interactions To enable positional cloning of the gene To improve breeding value estimation and selection response through marker assisted selection (plants, animals) Science; Medicine; Agriculture

6 Principles of QTL mapping Co-segregation of QTL alleles and linked marker alleles in pedigrees [Figure: pair of chromosomes carrying unobserved QTL alleles (Q, q) and observed marker alleles (M, m)]

7 Linkage = Co-segregation [Figure: pedigree with marker genotypes A2A4, A3A4, A1A3, A1A2, A2A3, A1A2, A1A4, A3A4, A3A2] Marker allele A1 cosegregates with dominant disease

8 Recombination [Figure: parental genotypes with marker alleles A1, A2 and QTL alleles Q1, Q2, showing likely gametes (non-recombinants) and unlikely gametes (recombinants)] "Linkage analysis = counting recombinants"

9 Map distance Map distance between two loci (Morgans) = Expected number of crossovers per meiosis Note: Map distances are additive. Recombination frequencies are not. 1 Morgan = 100 cM; 1 cM ~ 1 Mb

10 Recombination & map distance Haldane (1919) Map Function
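
The map-function formula itself did not survive in this transcript. As a minimal sketch, assuming the standard form of Haldane's function, r = ½(1 − e^(−2d)) with d in Morgans and no interference, the relationship can be checked numerically. The function names below are illustrative and are not taken from the slides.

```python
import math


def haldane_r(d_morgans):
    """Recombination fraction for a map distance in Morgans (assumes no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def haldane_d(r):
    """Inverse function: map distance in Morgans for a recombination fraction r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)


# Two adjacent 10 cM intervals span 20 cM: map distances add,
# but the recombination fractions do not.
r_10 = haldane_r(0.10)
r_20 = haldane_r(0.20)
print(f"r(10 cM) = {r_10:.4f}")
print(f"2 * r(10 cM) = {2 * r_10:.4f}")
print(f"r(20 cM) = {r_20:.4f}")
```

Under this map function the combined recombination fraction for two adjacent intervals is r1 + r2 − 2·r1·r2, which is why r(20 cM) comes out below twice r(10 cM).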

11 Principles of QTL mapping Co-segregation of phenotypes and genotypes in pedigrees –Genetic markers give information on IBD sharing between relatives [genotypes] –Association between phenotypes and genotypes gives information on QTL location and effect [linkage] Need informative mapping population

12 Mapping populations

13 Informative pig pedigree [Figure: cross (X) between QQ and qq animals, with descendants of genotype QQ, Qq and qq; © Roslin Institute]

14

15 Line cross Only two QTL alleles segregating QTL effect can be estimated as the mean difference between genotype groups Power depends on sample size & effect of QTL Ascertain divergent lines Resolution of QTL map is low: ~10-40 Mb

16 [Figure: power calculation for an F2 population] $\alpha = 0.0001$, power = 90%, F2 population

17 Outbred populations: Complications –Markers not fully informative (segregating in the parental generation) –QTL not segregating in all families (all F1 segregate in inbred line cross) –Association between marker and QTL at the family rather than population level (i.e. linkage phase differs between families) –Additional variance between families due to other loci

18 Line cross vs. outbred population
                       Cross   Outbred
# QTL alleles          2       ≥ 2
# Generations          3       ≥ 2
Required sample size   100s    1000s
QTL estimation         Mean    Variance

19 QTL as a random effect $y_i = \mu + Q_i + A_i + E_i$; $Q_i$ = QTL genotype contribution for chrom. segment; $A_i$ = contribution from rest of genome; $\operatorname{var}(y) = \sigma_q^2 + \sigma_a^2 + \sigma_e^2$

20 Logical extension of linear models used during the course This week: partitioning (co)variances into (causal) components QTL mapping: partitioning genetic variance into underlying components –Linkage analysis: dissecting within-family genetic variation

21 Genetic covariance between relatives $\operatorname{cov}(y_i, y_j) = \pi_{ij}\sigma_q^2 + a_{ij}\sigma_a^2$; $a_{ij}$ = average prop. of alleles shared in the genome (kinship matrix); $\pi_{ij}$ = proportion of alleles IBD at QTL (0, ½ or 1); $E(\pi_{ij}) = a_{ij}$

22 $\pi_{ij}$ = Pr(2 alleles IBD) + ½ Pr(1 allele IBD) = proportion of alleles IBD in non-inbred pedigree. Estimate $\pi_{ij}$ with genetic markers

23 Fully informative marker Determine IBD sharing between sibpairs unambiguously Example: Dad = 1/2 Mum= 3/4 –Transmitted allele from Dad is either 1 or 2 –Transmitted allele from Mum is either 3 or 4

24 Sibpairs & fully informative marker
# Alleles IBD   π    Pr.
0               0    ¼
1               ½    ½
2               1    ¼
$E(\pi) = \sum \pi \Pr(\pi) = \tfrac{1}{2}$; $E(\pi^2) = \sum \pi^2 \Pr(\pi) = \tfrac{3}{8}$; $\operatorname{var}(\pi) = E(\pi^2) - [E(\pi)]^2 = \tfrac{1}{8}$; CV $= 0.5\sqrt{2} \approx 70\%$
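
The moments on this slide follow directly from the three-point IBD distribution. A small sketch using exact fractions (variable names are illustrative) reproduces them:

```python
from fractions import Fraction
import math

# Proportion of alleles IBD (pi) and its probability for a sib pair.
ibd_dist = {
    Fraction(0): Fraction(1, 4),      # 0 alleles IBD
    Fraction(1, 2): Fraction(1, 2),   # 1 allele IBD
    Fraction(1): Fraction(1, 4),      # 2 alleles IBD
}

mean_pi = sum(pi * p for pi, p in ibd_dist.items())
mean_pi_sq = sum(pi ** 2 * p for pi, p in ibd_dist.items())
var_pi = mean_pi_sq - mean_pi ** 2
cv_pi = math.sqrt(var_pi) / float(mean_pi)  # coefficient of variation

print(mean_pi, mean_pi_sq, var_pi, round(cv_pi, 3))
```

The coefficient of variation of about 0.71 shows how variable IBD sharing is around its expectation of ½, which is the variation that sib-pair linkage methods exploit.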

25 Haseman-Elston (1972) “The more alleles pairs of relatives share at a QTL, the greater their phenotypic similarity” or “The more alleles they share IBD, the smaller the difference in their phenotype”

26 Population sib-pair trait distribution

27 No linkage

28 Under linkage

29 Sib pair (or DZ twins) design to map QTL Multiple ‘families’ of two (or more) sibs Phenotypes on sibs Marker genotypes on sibs (& parents) Correlate phenotypes and genotypes of sibs

30 Data structure is simple
Pair   Phenotypes            Prop. alleles IBD
1      $y_{11}$  $y_{12}$    $\pi_1$
2      $y_{21}$  $y_{22}$    $\pi_2$
...
n      $y_{n1}$  $y_{n2}$    $\pi_n$
$\pi$ = 0, ½ or 1 for fully informative markers

31 Notation Y: $D = y_1 - y_2$; $D^2 = (y_1 - y_2)^2$; $S = (y_1 - \mu) + (y_2 - \mu)$; $S^2 = [(y_1 - \mu) + (y_2 - \mu)]^2$; $CP = (y_1 - \mu)(y_2 - \mu)$

32 Proposed analysis…
Data            Method       Reference
y1 & y2         ML 'LOD'     Parametric linkage analysis
D²              Regression   Haseman & Elston (1972)
D² & S²         Regression   Drigalenko (1998); Xu et al. (2000); Sham & Purcell (2001); Forrest (2001)
CP              Regression   Elston et al. (2000)
y1 & y2         ML VC        Goldgar (1990); Schork (1993)
D               ML           Kruglyak & Lander (1995)
D & S           ML VC        Fulker & Cherny (1996); Wright (1997)

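33 Properties of squared differences $E[(Y_1 - Y_2)^2] = \operatorname{var}(Y_1 - Y_2) + [E(Y_1 - Y_2)]^2$; $\operatorname{var}(Y_1 - Y_2) = \operatorname{var}(Y_1) + \operatorname{var}(Y_2) - 2\operatorname{cov}(Y_1, Y_2)$. If $E(Y_i) = 0$ and $\operatorname{var}(Y_1) = \operatorname{var}(Y_2)$, then $E[(Y_1 - Y_2)^2] = 2(1 - r)\operatorname{var}(Y)$, where r is the correlation between $Y_1$ and $Y_2$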
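
As a quick numerical check of the last identity, the sketch below simulates correlated pairs with mean zero and equal variances and compares the average squared difference with 2(1 − r)var(Y). Bivariate normality, the parameter values and the function name are assumptions made for illustration only.

```python
import math
import random


def mean_squared_difference(r, var_y, n_pairs, seed=1):
    """Average of (Y1 - Y2)^2 over simulated pairs with correlation r."""
    rng = random.Random(seed)
    sd = math.sqrt(var_y)
    total = 0.0
    for _ in range(n_pairs):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        y1 = sd * z1
        # Mix z1 and z2 so that corr(y1, y2) = r and var(y2) = var_y.
        y2 = sd * (r * z1 + math.sqrt(1.0 - r * r) * z2)
        total += (y1 - y2) ** 2
    return total / n_pairs


r, var_y = 0.4, 1.0
simulated = mean_squared_difference(r, var_y, n_pairs=50_000)
expected = 2.0 * (1.0 - r) * var_y
print(f"simulated = {simulated:.3f}, expected = {expected:.3f}")
```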

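34 Haseman-Elston method Phenotype on relative pair j: $Y_j = (y_{1j} - y_{2j})^2$
$E(Y_j) = E[(Q_{1j} - Q_{2j} + A_{1j} - A_{2j} + e_{1j} - e_{2j})^2]$
$= E[(Q_{1j} - Q_{2j})^2] + \{2(1 - a_{ij})\sigma_a^2 + 2\sigma_e^2\}$
$= 2[\sigma_q^2 - \operatorname{cov}(Q_{1j}, Q_{2j})] + \sigma_\varepsilon^2$
$= (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\pi_{jt}\sigma_q^2$
where $\sigma_\varepsilon^2$ denotes the bracketed term $2(1 - a_{ij})\sigma_a^2 + 2\sigma_e^2$ and $\pi_{jt}$ = proportion of alleles IBD at QTL (trait, t) for relative pair j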

35 Conditional expectation $E(Y_j \mid \pi_{jt}) = (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\sigma_q^2\,\pi_{jt}$; negative slope of Y on $\pi$ if $\sigma_q^2 > 0$; estimate $\pi_{jt}$ from marker data ($\pi_{jm}$); use simple linear regression to detect QTL: $E(Y_j \mid \pi_{jm}) = \alpha + \beta\pi_{jm}$

36 A significant negative slope indicates linkage to a QTL
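
The regression described above can be sketched on simulated data. This is an illustration under simplifying assumptions that are not in the slides: a fully informative marker located at the QTL (no recombination), no polygenic or shared family component, normally distributed allele effects, and arbitrary variances of 1.0. All function names are hypothetical.

```python
import math
import random


def simulate_pair(var_q, var_e, rng):
    """One sib pair: returns (pi, y1, y2) with IBD observed at the QTL itself."""
    sd_allele = math.sqrt(var_q / 2.0)
    # Four distinct parental alleles: effects[parent][allele].
    effects = [[rng.gauss(0.0, sd_allele) for _ in range(2)] for _ in range(2)]
    # Which allele (0 or 1) each parent transmitted to each sib.
    sib1 = [rng.randrange(2) for _ in range(2)]
    sib2 = [rng.randrange(2) for _ in range(2)]
    pi = sum(a == b for a, b in zip(sib1, sib2)) / 2.0
    sd_e = math.sqrt(var_e)
    y1 = sum(effects[p][sib1[p]] for p in range(2)) + rng.gauss(0.0, sd_e)
    y2 = sum(effects[p][sib2[p]] for p in range(2)) + rng.gauss(0.0, sd_e)
    return pi, y1, y2


def haseman_elston(pairs):
    """Least-squares regression of squared sib difference on pi: (intercept, slope)."""
    xs = [pi for pi, _, _ in pairs]
    ys = [(y1 - y2) ** 2 for _, y1, y2 in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(2024)
pairs = [simulate_pair(var_q=1.0, var_e=1.0, rng=rng) for _ in range(20_000)]
intercept, slope = haseman_elston(pairs)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With these settings the slope should come out close to −2σ_q² = −2 and the intercept close to 2σ_q² + 2σ_e² = 4, up to sampling error.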

37 Single fully informative marker $\beta = -2(1 - 2r)^2\sigma_q^2$; the $(1 - 2r)^2\sigma_q^2$ term is analogous to variance explained by a single marker in a backcross/F2 design; $\alpha = 2[1 - 2(1 - r)r]\sigma_q^2 + \sigma_\varepsilon^2$; r = recombination fraction between marker & QTL. Statistical test: $\beta = 0$ versus $\beta < 0$. Disadvantage of method –not powerful –confounding between QTL location and effect

38 Interval mapping for sibpair analysis (Fulker & Cardon, 1994) Estimate $\pi_{jt}$ from IBD status at flanking markers. Allows genome screen, separating effect & location –regression with largest $R^2$ indicates map position of QTL

39 Example from Cardon et al. (1994) [Lynch & Walsh, page 520]

40 Calculating $\pi_{jt} \mid \pi_{jm}$ For $\pi_{jt}$ midway between two flanking markers: $\pi_{jt} \approx r^2/c + \tfrac{1}{2}[(1 - 2r)/c]\,\pi_{jm1} + \tfrac{1}{2}[(1 - 2r)/c]\,\pi_{jm2}$, with $c = 1 - 2r + 2r^2$; r = recombination fraction between markers; $\pi_{jmk}$ = $\pi_{jm}$ at flanking marker k. Assumption: flanking markers are fully informative

41 Examples
r     c       $\pi_{jt}$
0.5   0.5     0.5
0.2   17/25   (2/34) + (15/34)$\pi_{jm1}$ + (15/34)$\pi_{jm2}$
[if $\pi_{jm1}$ and $\pi_{jm2}$ are 1, $\pi_{jt}$ = 32/34 < 1]
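
The midpoint formula and the two examples can be reproduced with exact fractions. This is a small sketch; the function name `midpoint_pi` is illustrative and not from the slides.

```python
from fractions import Fraction


def midpoint_pi(r, pi_m1, pi_m2):
    """Expected IBD proportion midway between two fully informative flanking markers.

    r is the recombination fraction between the two markers.
    """
    c = 1 - 2 * r + 2 * r * r
    weight = (1 - 2 * r) / (2 * c)  # same weight on each flanking marker
    return r * r / c + weight * pi_m1 + weight * pi_m2


# r = 0.2 with both flanking markers fully shared: 32/34, printed in lowest terms.
print(midpoint_pi(Fraction(1, 5), 1, 1))  # 16/17
# r = 0.5: unlinked markers carry no information, so the estimate stays at 1/2.
print(midpoint_pi(Fraction(1, 2), 1, 0))  # 1/2
```

The same function accepts floats, so it can be combined with a recombination fraction obtained from a map function when the marker spacing is given in cM.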

42 Exercise Calculate $\pi_{jt}$ for a location midway between two markers that are 30 cM apart, when the proportions of alleles shared at the flanking markers are 1.0 and 0.5. Use the Haldane mapping function to calculate the recombination rate between the markers. $\pi_{jm1} = 1$, $\pi_{jm2} = 0.5$

43 Extensions to Haseman-Elston method Interval mapping. Alternative models –QTL with dominance. Other methods to estimate $\pi_{jt}$ –Using all markers on a chromosome (Merlin) –Monte Carlo sampling methods –Using both marker info & phenotypic info. Add linkage information from: –$Z_j = [(y_{1j} - \mu) + (y_{2j} - \mu)]^2$

44 Power = 90%. Type-I error = $10^{-5}$

45 Estimating $\pi$ when marker is not fully informative Using: –Mendelian segregation rules –Marker allele frequencies in the population

46 IBD can be trivial… [Figure: nuclear family with marker genotypes built from alleles 1 and 2] IBD=0

47 Two Other Simple Cases… [Figure: two nuclear families with marker genotypes built from alleles 1 and 2] IBD=2

48 A little more complicated… [Figure: pedigree with marker genotypes 1/2, 2/2, 1/2 and 1/2] IBD=1 (50% chance) or IBD=2 (50% chance)

49 And even more complicated… [Figure: two individuals, both with marker genotype 1/1] IBD=?

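50 Bayes Theorem for IBD Probabilities $P(\text{IBD} = i \mid G) = \dfrac{P(\text{IBD} = i)\,P(G \mid \text{IBD} = i)}{\sum_k P(\text{IBD} = k)\,P(G \mid \text{IBD} = k)}$, where G is the marker data, $P(\text{IBD} = i)$ is the prior, the denominator is Prob(data) and the left-hand side is the posterior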

51 P(Marker Genotype|IBD State) [Assumes Hardy-Weinberg proportions of genotypes in the population]

52 Worked Example [Figure: pair of individuals, both with marker genotype 1/1]
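
The numbers of the worked example are not preserved in this transcript. As a hedged sketch of the calculation, assume a sib pair with untyped parents, Hardy-Weinberg proportions, the sib-pair prior of ¼, ½, ¼ for 0, 1 and 2 alleles IBD, and a hypothetical frequency of ½ for allele 1. When both sibs are 1/1, the pair carries four, three or two independent copies of allele 1 for IBD = 0, 1 and 2 respectively.

```python
from fractions import Fraction

# Prior IBD probabilities for a sib pair (0, 1 or 2 alleles IBD).
PRIOR = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def posterior_ibd(likelihood):
    """Bayes theorem: posterior P(IBD = k | genotypes) from P(genotypes | IBD = k)."""
    joint = {k: PRIOR[k] * likelihood[k] for k in PRIOR}
    prob_data = sum(joint.values())
    return {k: v / prob_data for k, v in joint.items()}


def likelihood_both_11(p1):
    """P(both sibs are 1/1 | IBD = k) under Hardy-Weinberg, parents untyped."""
    return {0: p1 ** 4, 1: p1 ** 3, 2: p1 ** 2}


p1 = Fraction(1, 2)  # hypothetical allele frequency, not taken from the slides
post = posterior_ibd(likelihood_both_11(p1))
pi_hat = post[2] + Fraction(1, 2) * post[1]  # expected proportion of alleles IBD
print(post, pi_hat)
```

With this assumed frequency the posterior probabilities are 1/9, 4/9 and 4/9, giving an estimated IBD proportion of 2/3. A rarer allele 1 would shift more weight towards IBD = 2.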

53 Exercise [Figure: pair of individuals, both with marker genotype 1/2]

54 Using multiple markers Mendelian segregation rules Marker allele frequencies in the population Linkage between markers Efficient multi-marker (multi-point) algorithms available (e.g., Merlin, Genehunter )

55 Software for QTL analysis of sibpairs Mx Merlin Genehunter S.A.G.E. ($) QTL Express (regression) Solar (complex pedigrees) Lots of others… http://www.nslij-genetics.org/soft/

56 George Seaton, Sara Knott, Chris Haley, Peter Visscher Roslin Institute University of Edinburgh http://QTL.cap.ed.ac.uk/ QTL Express: User-friendly web-based software to map QTL in outbred populations

57 Conclusions (sibpairs) Power of sib pair design is low –more relative pairs needed (more contrasts), e.g. extended pedigrees –selective genotyping: extreme phenotypes are most informative for linkage –more powerful analysis methods: ML variance component analysis

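58 Maximum likelihood for sibpairs (assuming bivariate normality | $\pi$ & fully informative marker) Full model: $-2\ln(L) = \sum n_\pi \ln|V_\pi| + \sum (y - \mu)' V_\pi^{-1} (y - \mu)$
$V_\pi = \begin{pmatrix} \sigma_f^2 + \sigma_q^2 + \sigma_r^2 & \sigma_f^2 + \pi\sigma_q^2 \\ \sigma_f^2 + \pi\sigma_q^2 & \sigma_f^2 + \sigma_q^2 + \sigma_r^2 \end{pmatrix}$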

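59 Maximum likelihood Reduced model: $-2\ln(L) = n\ln|V| + \sum (y - \mu)' V^{-1} (y - \mu)$
$V = \begin{pmatrix} \sigma_f^2 + \sigma_r^2 & \sigma_f^2 \\ \sigma_f^2 & \sigma_f^2 + \sigma_r^2 \end{pmatrix}$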

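60 Test statistic $\text{LRT} = 2\ln(ML_{\text{full}}) - 2\ln(ML_{\text{reduced}})$; under $H_0$ ($\sigma_q^2 = 0$): $\text{LRT} \sim \tfrac{1}{2}\chi^2_{(1)} + \tfrac{1}{2}(0)$, i.e. a 50:50 mixture of a chi-square with 1 d.f. and a point mass at zero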
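
The pieces of this test can be sketched in code. The block below assumes the three variance components on the two preceding slides are a family (f), QTL (q) and residual (r) variance, evaluates −2 ln L for given parameter values (dropping the constant term), and converts a likelihood-ratio statistic into a p-value under the 50:50 mixture. Maximising the likelihood over the parameters is left out; in practice it would be done by a numerical optimiser or one of the packages named in the slides. Function names are illustrative.

```python
import math


def neg2_loglik(pairs, mu, var_f, var_q, var_r):
    """-2 ln L (constant dropped) for sib pairs given as (y1, y2, pi) tuples.

    Setting var_q = 0 gives the reduced (no-QTL) model.
    """
    total = 0.0
    for y1, y2, pi in pairs:
        diag = var_f + var_q + var_r      # variance of each sib
        off = var_f + pi * var_q          # covariance between the sibs
        det = diag * diag - off * off     # determinant of the 2x2 matrix V
        d1, d2 = y1 - mu, y2 - mu
        # Quadratic form (y - mu)' V^-1 (y - mu) for a symmetric 2x2 matrix.
        quad = (diag * (d1 * d1 + d2 * d2) - 2.0 * off * d1 * d2) / det
        total += math.log(det) + quad
    return total


def lrt_pvalue(lrt):
    """P-value under a 50:50 mixture of chi-square(1) and a point mass at zero."""
    if lrt <= 0.0:
        return 1.0
    chi2_1_tail = math.erfc(math.sqrt(lrt / 2.0))  # P(chi-square(1) > lrt)
    return 0.5 * chi2_1_tail


def lod_score(lrt):
    """Convert a likelihood-ratio statistic to a LOD score (base-10 log)."""
    return lrt / (2.0 * math.log(10.0))


print(f"p-value for LRT = 2.71: {lrt_pvalue(2.71):.3f}")
print(f"LOD for LRT = 13.8: {lod_score(13.8):.2f}")
```

Given the two minimised values, the statistic is `neg2_loglik(reduced) - neg2_loglik(full)`. Halving the chi-square tail reflects that the QTL variance is tested on the boundary of its parameter space.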

61 [Fisher et al. 1999] Example: QTL analysis for dyslexia on chromosome 6p using sib-pairs Phenotype: Irregular word test 181 sib-pairs ~15 Mb

62 Expectation or distribution approach in analysis? Expectation approach: use the expected IBD proportion ($\hat{\pi}$). Distribution approach: use IBD probabilities and mixture distribution

63 Selective genotyping & sibpairs Concordant pairs –both sibs in upper or lower tail of the phenotypic distribution Discordant pairs –one sib in upper tail, other in lower tail Powerful design –requires many (cheap) phenotypes
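
A minimal sketch of how pairs might be picked for genotyping once phenotypes are available. The tail thresholds (±1.28 on a standardised scale, roughly the top and bottom 10% of a normal distribution) and the function name are hypothetical choices, not values from the slides.

```python
def classify_pair(y1, y2, lower, upper):
    """Label a sib pair by where both phenotypes fall relative to two tail thresholds."""
    in_upper = [y >= upper for y in (y1, y2)]
    in_lower = [y <= lower for y in (y1, y2)]
    if all(in_upper) or all(in_lower):
        return "concordant"      # both sibs in the same tail
    if (in_upper[0] and in_lower[1]) or (in_lower[0] and in_upper[1]):
        return "discordant"      # one sib in each tail
    return "not selected"


LOWER, UPPER = -1.28, 1.28  # hypothetical thresholds on standardised phenotypes
pairs = [(2.0, 1.8), (2.1, -1.9), (0.1, 2.0), (-2.0, -1.5)]
for y1, y2 in pairs:
    print((y1, y2), classify_pair(y1, y2, LOWER, UPPER))
```

Only the concordant and discordant pairs would then be genotyped, which is why the design needs many inexpensive phenotypes up front.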

64 Anxiety QTLs [Fullerton et al. 2003] Selection from ~30,000 sibpairs

65 Results [Fullerton et al. 2003] ~5 QTLs detected

66 Variance component analysis in complex pedigrees Partition observed variation in quantitative traits into causal components, e.g., –Polygenic –Common environment ('household') –QTL –Residual, including measurement error. IBD proportions ($\pi$) estimated from multiple markers. "ACEQ" model

67 [Blackwood et al. 1996] Bipolar pedigree

68 Blackwood et al. (1996) data

69 Example: QTL analysis for BMI using a complex pedigree [Deng et al. 2002]

