Lecture 19: Association Studies II Date: 10/29/02  Finish case-control  TDT  Relative Risk.


1 Lecture 19: Association Studies II Date: 10/29/02  Finish case-control  TDT  Relative Risk

2 REVIEW Case-Control – Derivation VIII

3 CORRECTION Case-Control – Hypothesis Testing  Recall that the trait allele frequencies are held fixed so that the trait prevalence K can be calculated.  Model 1 (HWE, no LE): There are 2n distinct haplotypes, thus there are 2n – 2 degrees of freedom.  Restricted Model 0 (HWE, LE): There are n distinct alleles, thus there are n – 1 degrees of freedom.  2(lnL_1 – lnL_0) with n – 1 degrees of freedom tests for LE under the assumption of HWE.  Calculate the MLE for Model 1 with a modified EM.
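
The degrees of freedom of the test follow from the difference in free parameters between the two models. A short worked step, assuming the usual large-sample chi-squared approximation for a likelihood ratio statistic (the symbol G is a label introduced here for illustration):

```latex
\begin{align*}
\mathrm{df} &= \underbrace{(2n - 2)}_{\text{Model 1}} - \underbrace{(n - 1)}_{\text{Model 0}} = n - 1, \\
G &= 2\,(\ln L_1 - \ln L_0) \;\sim\; \chi^2_{\,n-1}
  \quad \text{approximately, when LE holds (given HWE).}
\end{align*}
```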

4 Estimating Genetic Parameters  The vector (p_1, p_2, f_11, f_12, f_22) collects the genetic parameters underlying the theoretical distribution of genotypes in the case-control approach.  When the genetic model, and thus these parameters, are unknown, one resorts to contingency tables.  Can the data be used to estimate the genetic parameters?

5 Estimating Genetic Parameters  One could estimate the haplotype frequencies h_1i, h_2i simultaneously with the genetic parameters.  Then G = 2[lnL(h_1i, h_2i, genetic parameters) – lnL(q_i, genetic parameters)] is a statistic for testing linkage equilibrium without conditioning on known genetic parameters.  However, the G statistic has an unknown distribution: under linkage equilibrium the marker locus and disease locus are independent, so L(q_i, genetic parameters) does not actually depend on the genetic parameters.

6 Spurious Associations (4.6.4)  Population subdivision, or any of the other causes of linkage disequilibrium we discussed last time, can cause spurious associations, i.e. linkage disequilibrium not caused by tight linkage.  Population subdivision is probably the most common source of spurious associations.  Other sources of spurious association cannot be accommodated so easily; the only remedy is to know your population and to know what level of association exceeds the “normal” background in that population.

7 Population Subdivision – Identifying Subpopulations  Identify subpopulations where matings occur randomly. These are subpopulations which will differ in trait and marker allele frequencies. Sometimes, a priori information is available about subpopulations in which these allele frequencies differ.  Often subdivide by ethnicity, location, religion, social class, and age.

8 Population Subdivision - Sampling Designs  Sample only from one identified subdivision.  Match case and control by subdivision.  In complex traits, there may be multiple loci associated with a disease, and these loci may vary between subpopulations. Which sampling scheme do you recommend?

9 Hidden Population Stratification  One cannot anticipate all sources of spurious association.  Internal checks may indicate presence of remaining spurious association. Test HWE on individual markers. Test markers on different chromosomes for spurious association. Trait loci that associate tightly with multiple distant markers are a sign of trouble.

10 Using Families – Removing Spurious Association  The effect of spurious association can be removed by comparing the chromosomes of affected children to their relatives.  The most common relative to use? Parents.  This does NOT mean that we are returning to family-based linkage analysis. As you will see, we still use information from multiple generations of recombination.

11 Moving to Biallelic Model  [Figure: linkage equilibrium vs. linkage disequilibrium]

12 TDT – Assumptions  Depends on the presence of linkage disequilibrium at the population level.  Assumes random mating.

13 TDT – Genetic Model  Marker locus with alleles A, a; disease locus with alleles D, d.  Allele Frequencies: P(A) = p_A, P(a) = 1 – p_A, P(D) = p_D, P(d) = 1 – p_D.  Linkage Disequilibrium: D_AD = h_AD – p_A p_D

14 TDT – Haplotype Frequencies
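
A minimal sketch of how the four haplotype frequencies follow from the allele frequencies and the disequilibrium coefficient D = h_AD – p_A p_D. The function name and the numeric inputs are hypothetical; the four expressions are the standard two-locus parameterization, offered as an assumption about what this slide tabulated.

```python
def haplotype_frequencies(p_A, p_D, D):
    """Two-locus haplotype frequencies from allele frequencies and LD.

    p_A: frequency of marker allele A
    p_D: frequency of disease allele D
    D:   disequilibrium coefficient, D = h_AD - p_A * p_D
    """
    h = {
        "AD": p_A * p_D + D,
        "Ad": p_A * (1 - p_D) - D,
        "aD": (1 - p_A) * p_D - D,
        "ad": (1 - p_A) * (1 - p_D) + D,
    }
    # D is only admissible if every haplotype frequency stays non-negative.
    if any(freq < 0 for freq in h.values()):
        raise ValueError("D is outside the range allowed by p_A and p_D")
    return h


# Hypothetical values, for illustration only.
freqs = haplotype_frequencies(p_A=0.3, p_D=0.1, D=0.02)
```

Setting D = 0 recovers linkage equilibrium, where each haplotype frequency is simply the product of its two allele frequencies.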

15 TDT – The Test  Assume we randomly sample affected individuals and then genotype that individual and his/her two parents for marker A.  Take those families where the parents are heterozygous for the marker.  Record the data as transmitted and nontransmitted alleles. A table as shown on the next slide is typically used.

16 TDT – The Table
                   Nontransmitted
Transmitted    A       a       Total
A              -       t_12    -
a              t_21    -       -
Total          -       -       2N
N is the number of affected children sampled.

17 TDT – Filling the Table  Aa  AA  n_12 += _____  n_21 += _____

18 TDT – Filling the Table  Aa  n_12 += _____  n_21 += _____
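
The bookkeeping for filling the table can be sketched in code. This is an illustrative sketch, not the lecture's procedure: the function name, the string encoding of genotypes ("AA", "Aa", "aa") and the sample trios are all assumptions. It uses the fact that each AA parent must contribute one A to the child, so any remaining A alleles in the child came from heterozygous parents.

```python
def tdt_counts(trios):
    """Tally TDT off-diagonal cells from (father, mother, child) genotypes.

    Returns (n12, n21):
      n12 = heterozygous parents that transmitted A (a not transmitted)
      n21 = heterozygous parents that transmitted a (A not transmitted)
    """
    n12 = n21 = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for g in parents if set(g) == {"A", "a"})
        if n_het == 0:
            continue  # no heterozygous parent: family is uninformative
        n_hom_AA = sum(1 for g in parents if g.count("A") == 2)
        # A alleles in the child not explained by AA parents came from
        # heterozygous parents.
        a_from_het = child.count("A") - n_hom_AA
        if not 0 <= a_from_het <= n_het:
            raise ValueError(f"Mendelian inconsistency: {father} x {mother} -> {child}")
        n12 += a_from_het
        n21 += n_het - a_from_het
    return n12, n21


# Hypothetical trios, for illustration only.
example_trios = [
    ("Aa", "AA", "AA"),  # one het parent, transmitted A
    ("Aa", "Aa", "Aa"),  # two het parents, one transmitted A and one a
    ("aa", "Aa", "aa"),  # one het parent, transmitted a
    ("AA", "aa", "Aa"),  # no het parent, skipped
]
counts = tdt_counts(example_trios)
```

When both parents are Aa and the child is Aa, it is unknown which parent transmitted which allele, but the counts are unaffected: exactly one A and one a were transmitted.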

19 TDT – Statistic
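
The transcript preserves only the title of this slide. As an assumption about its content, the standard form of the TDT statistic from Spielman et al. (1993) is a McNemar-type comparison of the two off-diagonal cells of the transmission table:

```latex
\[
\chi^2_{\mathrm{TDT}} \;=\; \frac{(n_{12} - n_{21})^2}{\,n_{12} + n_{21}\,},
\]
```

which is approximately chi-squared with 1 degree of freedom under the null hypothesis, where n_12 and n_21 are the counts of heterozygous parents transmitting A and a, respectively.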

20 TDT – Derivation  [Table: expected cell frequencies, transmitted by nontransmitted]  Under H_0 the expected frequencies are equal.

21 TDT – Example  Search for Insulin-Dependent Diabetes Mellitus (IDDM) (Spielman et al. 1993).  94 families included in study  62 families had heterozygous parents at a marker on chromosome 11 with possible alleles “1” and “X”.  78 “1” alleles were transmitted to affected children. 124-78 = 46 “X” alleles were transmitted to affected children.

22 TDT – Example (cont)
                   Nontransmitted
Transmitted    1       X       Total
1              -       78      -
X              46      -       -
Total          -       -       124
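
The example counts can be run through the McNemar-type TDT statistic as a quick check. This is a sketch under the assumption that the statistic is (n_12 – n_21)^2 / (n_12 + n_21) with 1 degree of freedom; the function name is hypothetical. The p-value uses the identity that the upper tail of a chi-squared variable with 1 degree of freedom equals erfc(sqrt(x / 2)).

```python
import math


def tdt_statistic(n12, n21):
    """McNemar-type TDT chi-squared statistic and its 1-df p-value."""
    chi2 = (n12 - n21) ** 2 / (n12 + n21)
    # Upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the IDDM example: 78 "1" alleles and 46 "X" alleles transmitted.
chi2, p_value = tdt_statistic(78, 46)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With these counts the statistic is 32^2 / 124, about 8.26, which exceeds the 1-df critical value of 3.84 at the 5% level.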

23 TDT - Power  How do we calculate the power of a TDT test? Make assumptions

24 TDT – Power (cont)  Statistical power is given by

25 TDT – Power (cont)  Power increases with sample size (number of affected children).  Power increases as the recombination fraction decreases.  Power increases as linkage disequilibrium in the population increases.  Power increases as trait allele frequency decreases (trait is rare).  Power is only slightly affected by marker allele frequencies.

26 TDT – Power Compared  TDT has lower power than a simple test for linkage disequilibrium in a random population sample.  TDT loses power by ignoring some of the data (only heterozygous parents considered) and because homozygous parents provide much information about linkage disequilibrium.  Why is TDT used then?

27 TDT – Advantages  TDT is a test for linkage and linkage disequilibrium, not just linkage disequilibrium.  Linkage disequilibrium from non-linkage sources can only change the genotypes of the parents.  TDT tests transmission from heterozygous parents, and only linkage can produce a significant result.  TDT can also detect segregation distortion at the marker locus. Another reason to check marker alleles for segregation distortion.

28 TDT – Advantages (cont)  [Figure: haplotypes at the marker (A/a) and disease (D/d) loci, unlinked vs. linked]

29 Relative Risk Method  Analog to the general disequilibrium test on a random population sample when the trait or marker is dominant or recessive (two genotype classes indistinguishable).  Observe two independent groups, defined by their marker genotype.  Determine the risk of being affected conditional on group, P(affected | marker group).  Then, the relative risk is the ratio of these two conditional risks, P(affected | group 1) / P(affected | group 2).

30 Relative Risk – Data
                   Group
Status         AA or Aa    aa      Total
Affected       n_11        n_12    n_1+
Unaffected     n_21        n_22    n_2+
Total          n_+1        n_+2    2N

31 Relative Risk – Statistic
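
The transcript preserves only the title of this slide. A minimal sketch of the usual plug-in estimate, assuming the 2 x 2 layout of the data table (columns are marker groups, rows are affection status); the function name and the counts are hypothetical.

```python
def relative_risk(n11, n12, n21, n22):
    """Estimate relative risk from a 2 x 2 status-by-group table.

    n11: affected,   group "AA or Aa"    n12: affected,   group "aa"
    n21: unaffected, group "AA or Aa"    n22: unaffected, group "aa"
    """
    risk_group1 = n11 / (n11 + n21)  # P(affected | AA or Aa), column total n_+1
    risk_group2 = n12 / (n12 + n22)  # P(affected | aa), column total n_+2
    return risk_group1 / risk_group2


# Hypothetical counts, for illustration only.
rr = relative_risk(n11=30, n12=10, n21=70, n22=90)
```

A value of 1 corresponds to no association between marker group and affection status; here the estimated risks are 0.3 and 0.1, so the estimate is about 3.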

32 Relative Risk – Conditional Probabilities

33 Relative Risk – Null Distribution

34 Relative Risk – Statistical Test  Chi-squared test for independence on the table.  Likelihood ratio test: 2 degrees of freedom
                   Group
Status         AA or Aa    aa      Total
Affected       n_11        n_12    n_1+
Unaffected     n_21        n_22    n_2+
Total          n_+1        n_+2    2N
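
The chi-squared test for independence on this table can be sketched as follows. This covers only the Pearson test, which has 1 degree of freedom for a 2 x 2 table; it does not implement the likelihood ratio test mentioned on the slide. The function name and counts are hypothetical, and no continuity correction is applied.

```python
import math


def pearson_chi2_2x2(n11, n12, n21, n22):
    """Pearson chi-squared test of independence for a 2 x 2 table (1 df)."""
    total = n11 + n12 + n21 + n22
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected.
    chi2 = total * (n11 * n22 - n12 * n21) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2_rr, p_rr = pearson_chi2_2x2(n11=30, n12=10, n21=70, n22=90)
```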

35 Haplotype Relative Risk  [Figure: parents AB and BC, affected child BB]  case genotype: _____  control genotype: _____

36 Haplotype-Based HRR (HHRR)  Focus on alleles rather than genotypes.  There are two transmitted and two non-transmitted alleles in every pair of parents with one affected offspring.  Treat the two allele samples as independent case-control samples.
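
Splitting a family's four parental alleles into the transmitted (case) and non-transmitted (control) samples can be sketched as a multiset subtraction: whatever the parents carry that the child did not receive is non-transmitted. The function name, the tuple encoding of genotypes and the example family are assumptions made for illustration.

```python
from collections import Counter


def split_alleles(father, mother, child):
    """Return (transmitted, nontransmitted) alleles for one trio.

    Genotypes are 2-tuples of allele labels, e.g. ("A", "B").
    """
    # Mendelian check: one child allele must come from each parent.
    consistent = any(
        x in father and y in mother for x, y in (child, child[::-1])
    )
    if not consistent:
        raise ValueError("child genotype is not compatible with the parents")
    parental = Counter(father) + Counter(mother)
    transmitted = Counter(child)
    # Multiset difference: parental alleles that were not passed on.
    nontransmitted = parental - transmitted
    return sorted(transmitted.elements()), sorted(nontransmitted.elements())


# Hypothetical family: parents A/B and B/C with an affected B/B child.
case_alleles, control_alleles = split_alleles(("A", "B"), ("B", "C"), ("B", "B"))
```

Pooling the transmitted alleles over all families gives the case sample, and pooling the non-transmitted alleles gives the control sample.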

37 HHRR – II  [Figure: parents AB and BC, affected child BB]  case alleles: _____  control alleles: _____

38 HHRR – III
                   Untransmitted
Transmitted    1       2       Total
1              t_11    t_12    t_1+
2              t_21    t_22    t_2+
Total          t_+1    t_+2    t

39 HRR & HHRR  Most powerful when the recombination fraction is 0 (complete linkage).  Both assume random mating when they treat the parents as providing an independent control genotype or control alleles.  HHRR is more powerful than TDT because it uses information from homozygous parents.  HHRR is a valid test statistic for D_AD = 0 and  =0.

