1 Genome-wide association studies BNFO 601 Roshan

2 Application of SNPs: association with disease
Experimental design to detect cancer associated SNPs:
–Pick random humans with and without cancer (say breast cancer)
–Perform SNP genotyping
–Look for associated SNPs
–Also called genome-wide association study

3 Case-control example
Study of 100 people:
–Case: 50 subjects with cancer
–Control: 50 subjects without cancer
Count the number of alleles (two per subject, so 100 per group) and form a contingency table:
             #Allele1   #Allele2
   Case            10         90
   Control          2         98
Relative risk cannot be estimated from a case-control design, because the numbers of cases and controls are fixed by the investigator and do not reflect how common the disease is in the population. Therefore we use the odds ratio instead.

4 Odds ratio
             #Allele1   #Allele2
   Cancer           a          b
   Healthy          c          d
Odds of allele 1 in cancer = (a/(a+b))/(b/(a+b)) = a/b = e
Similarly, odds of allele 1 in healthy = c/d = f
Odds ratio of allele 1 in cancer vs healthy = e/f

5 Example
             #Allele1   #Allele2
   Case            15         35
   Control          2         48
Odds of allele 1 in case = 15/35
Odds of allele 1 in control = 2/48
Odds ratio of allele 1 in case vs control = (15/35)/(2/48) = 10.3
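
The odds-ratio arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function name `odds_ratio` and the choice to raise an error on zero counts are assumptions made for illustration, not part of the slides.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of allele 1 in cases vs controls.

    a, b: allele 1 and allele 2 counts in cases
    c, d: allele 1 and allele 2 counts in controls
    """
    # Assumption: treat any zero that would cause a division by zero as an error.
    if b == 0 or c == 0 or d == 0:
        raise ValueError("odds ratio is undefined when b, c, or d is zero")
    odds_case = a / b        # e = a/b
    odds_control = c / d     # f = c/d
    return odds_case / odds_control


# Counts from the example table: case (15, 35), control (2, 48).
slide_example = odds_ratio(15, 35, 2, 48)
print(round(slide_example, 1))
```

An odds ratio well above 1, as here, suggests that allele 1 is more common among cases than among controls. Whether that difference is statistically significant is a separate question.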

6 Statistical test of association (P-values)
P-value = probability, under the null hypothesis, of data at least as extreme as the data observed
Example:
–Suppose we are given a series of coin tosses
–We suspect that a biased coin produced the tosses
–We can ask the following question: if the coin were fair, how probable would tosses at least this lopsided be?
–If this probability is very small, then the observed tosses would be unlikely under a fair coin, which is evidence against the coin being fair
–In this example the null hypothesis is the fair coin and the alternative hypothesis is the biased coin

7 Binomial distribution
Bernoulli random variable:
–Two outcomes: success or failure
–Example: coin toss
Binomial random variable:
–Number of successes in a series of independent Bernoulli trials
Example:
–Probability of heads = 0.5
–Given four coin tosses, what is the probability of three heads?
–Possible outcomes: HHHT, HHTH, HTHH, THHH
–Each outcome has probability = 0.5^4
–Total probability = 4 * 0.5^4 = 0.25
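
The four-toss example can be verified by brute force. The sketch below lists every sequence of four tosses and keeps those with exactly three heads. The variable names are illustrative only.

```python
from itertools import product

tosses = 4
wanted_heads = 3

# All 2^4 = 16 equally likely sequences of H and T for a fair coin.
outcomes = ["".join(seq) for seq in product("HT", repeat=tosses)]

# Keep only the sequences with exactly three heads.
matching = [o for o in outcomes if o.count("H") == wanted_heads]

prob_each = 0.5 ** tosses                 # probability of any single sequence
total_probability = len(matching) * prob_each

print(matching)
print(total_probability)
```

Enumeration like this is only practical for a handful of tosses, since the number of sequences doubles with each one. The binomial formula avoids listing them.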

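8 Binomial distribution
Bernoulli trial: probability of success = p, probability of failure = 1-p
Given n independent Bernoulli trials, what is the probability of k successes?
P(k successes) = C(n,k) p^k (1-p)^(n-k), where C(n,k) = n!/(k!(n-k)!) counts the ways to place k successes among n trials
Binomial applet: http://www.stat.tamu.edu/~west/applets/binomialdemo.html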
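
The probability of k successes in n independent Bernoulli trials is given by the standard binomial formula C(n,k) p^k (1-p)^(n-k). A minimal Python sketch follows. The function name `binomial_pmf` is an assumption chosen for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    # comb(n, k) counts the orderings; each ordering has probability p^k (1-p)^(n-k).
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Three heads in four fair tosses, matching the earlier coin example.
three_of_four = binomial_pmf(3, 4, 0.5)
print(three_of_four)

# The probabilities over k = 0..n should add up to 1 (up to rounding).
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
print(total)
```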

9 Hypothesis testing under Binomial hypothesis
Null hypothesis: fair coin (probability of heads = probability of tails = 0.5)
Data: HHHHTHTHHHHHHHTHTHTH (15 heads in 20 tosses)
P-value under null hypothesis = probability that #heads >= 15 in 20 tosses (a one-sided test, since we suspect a bias toward heads)
This probability is 0.021
Since it is below 0.05, we can reject the null hypothesis at the 0.05 significance level
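
The p-value for this toss sequence can be reproduced directly from the binomial distribution. The sketch below assumes a one-sided test (heads at least as frequent as observed). The variable names are illustrative.

```python
from math import comb

data = "HHHHTHTHHHHHHHTHTHTH"
n = len(data)              # number of tosses
heads = data.count("H")    # observed number of heads

# Under a fair coin every sequence has probability 1/2^n, so the upper tail
# P(#heads >= observed) is a count of sequences divided by 2^n.
p_value = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n

print(n, heads)
print(round(p_value, 3))
```

A two-sided test, which also counts outcomes with 5 or fewer heads, would double this value for a fair coin because the distribution is symmetric.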

10 Null hypothesis for case control contingency table
             #allele1   #allele2
   case             a          b
   control          c          d
We have two random variables:
–X: disease status
–A: allele type
Null hypothesis: the two variables are independent of each other (unrelated)
Under independence:
–P(X=case and A=1) = P(X=case)P(A=1)
Expected number of cases with allele 1 is
–P(X=case)P(A=1)N
–where N is the total number of observations
P(X=case) = (a+b)/N
P(A=1) = (a+c)/N
What is the expected number of controls with allele 2?
Do the probabilities sum to 1?
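
The two questions on this slide can be explored numerically. The sketch below computes the four cell probabilities under independence, using exact fractions so that the sum can be checked without rounding error. The function name `independence_probabilities` is an assumption, and the counts are borrowed from the earlier odds-ratio example (case 15/35, control 2/48).

```python
from fractions import Fraction


def independence_probabilities(a, b, c, d):
    """Cell probabilities P(X=x)P(A=y) implied by the table margins."""
    n = a + b + c + d
    p_case = Fraction(a + b, n)
    p_control = Fraction(c + d, n)
    p_allele1 = Fraction(a + c, n)
    p_allele2 = Fraction(b + d, n)
    return {
        ("case", 1): p_case * p_allele1,
        ("case", 2): p_case * p_allele2,
        ("control", 1): p_control * p_allele1,
        ("control", 2): p_control * p_allele2,
    }


probs = independence_probabilities(15, 35, 2, 48)
total = sum(probs.values())

# Expected number of controls with allele 2 = P(X=control) P(A=2) N.
n_total = 15 + 35 + 2 + 48
expected_control_allele2 = probs[("control", 2)] * n_total

print(total)
print(expected_control_allele2)
```

The four probabilities sum to 1 because they factor as (P(case) + P(control)) times (P(A=1) + P(A=2)), and each of those factors equals 1.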

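11 Chi-square statistic
X^2 = sum over i = 1..n of (O_i - E_i)^2 / E_i
O_i = observed frequency for the i-th outcome
E_i = expected frequency for the i-th outcome
n = total number of outcomes
When the expected frequencies are fully specified by the null hypothesis, the distribution of this statistic is approximately the chi-square distribution with n-1 degrees of freedom. For a 2x2 contingency table whose expected counts are estimated from the row and column totals, the degrees of freedom are (2-1)(2-1) = 1.
Proof can be found at http://ocw.mit.edu/NR/rdonlyres/Mathematics/18-443Fall2003/4226DF27-A1D0-4BB8-939A-B2A4167B5480/0/lec23.pdf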

12 Chi-square
Using chi-square, we can test how well the observed values fit the expected values computed under the independence hypothesis
We could also test the data directly under a multinomial or multivariate normal distribution with probabilities given by the independence assumption. This would require cumulative distribution functions of the multinomial and multivariate normal, which are hard to compute.
Chi-square p-values are easier to compute

13 Case control
             #allele1   #allele2
   case             a          b
   control          c          d
E1: expected cases with allele 1
E2: expected cases with allele 2
E3: expected controls with allele 1
E4: expected controls with allele 2
N = a + b + c + d
E1 = ((a+b)/N)((a+c)/N) N = (a+b)(a+c)/N
E2 = (a+b)(b+d)/N
E3 = (c+d)(a+c)/N
E4 = (c+d)(b+d)/N
Now compute the chi-square statistic

14 Chi-square statistic
             #Allele1   #Allele2
   Case            15         35
   Control          2         48
Compute expected values and chi-square statistic
Compute chi-square p-value by referring to chi-square distribution
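
The exercise on this slide can be worked through in Python using only the standard library. The sketch below makes two assumptions that the slides do not state: it uses 1 degree of freedom (the usual choice for a 2x2 independence test) and applies no continuity correction. With 1 degree of freedom the upper-tail probability has the closed form erfc(sqrt(x/2)). The function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Expected counts, chi-square statistic and p-value for a 2x2 table.

    a, b: allele 1 and allele 2 counts in cases
    c, d: allele 1 and allele 2 counts in controls
    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n,   # E1: cases with allele 1
        (a + b) * (b + d) / n,   # E2: cases with allele 2
        (c + d) * (a + c) / n,   # E3: controls with allele 1
        (c + d) * (b + d) / n,   # E4: controls with allele 2
    ]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Assumption: 1 degree of freedom, where P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return expected, statistic, p_value


expected, statistic, p_value = chi_square_2x2(15, 35, 2, 48)
print(expected)
print(round(statistic, 2))
print(p_value)
```

For these counts the p-value falls well below 0.05, so the independence hypothesis would be rejected at that level. In a genome-wide study, where many SNPs are tested, a much stricter threshold is normally used.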

