Genome-wide association studies BNFO 601 Roshan. Application of SNPs: association with disease Experimental design to detect cancer associated SNPs: –Pick.

Presentation transcript:

Genome-wide association studies BNFO 601 Roshan

Application of SNPs: association with disease
Experimental design to detect cancer associated SNPs:
–Pick random humans with and without cancer (say breast cancer)
–Perform SNP genotyping
–Look for associated SNPs
–Also called genome-wide association study

Case-control example
Study of 100 people:
–Case: 50 subjects with cancer
–Control: 50 subjects without cancer
Count the number of alleles (two per subject, so 100 per group) and form a contingency table:

          #Allele1  #Allele2
Case      10        90
Control   2         98

Relative risk cannot be estimated from a case-control design because the numbers of cases and controls are fixed by the study rather than sampled from the population. Therefore we use the odds ratio instead.

Odds ratio

          #Allele1  #Allele2
Cancer    a         b
Healthy   c         d

Odds of allele 1 in cancer = (a/(a+b))/(b/(a+b)) = a/b = e
Similarly, odds of allele 1 in healthy = c/d = f
Odds ratio of allele 1 in cancer vs healthy = e/f

Example

          #Allele1  #Allele2
Case      15        35
Control   2         48

Odds of allele 1 in case = 15/35
Odds of allele 1 in control = 2/48
Odds ratio of allele 1 in case vs control = (15/35)/(2/48) = 720/70 ≈ 10.29
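The odds ratio arithmetic above can be sketched in a few lines of Python. The function name `odds_ratio` and its argument order (a, b for cases; c, d for controls) are illustrative choices that follow the slide's table layout, not something the slides define.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of allele 1 in cases vs controls.

    a, b: allele 1 and allele 2 counts in cases
    c, d: allele 1 and allele 2 counts in controls
    """
    if b == 0 or c == 0 or d == 0:
        raise ValueError("odds ratio is undefined when b, c or d is zero")
    odds_case = a / b       # e on the slide
    odds_control = c / d    # f on the slide
    return odds_case / odds_control


# Counts from the example slide: case 15/35, control 2/48.
print(round(odds_ratio(15, 35, 2, 48), 2))  # about 10.29
```

An odds ratio well above 1, as here, suggests allele 1 is more common among cases; whether that difference is statistically significant is a separate question.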

Statistical test of association (P-values)
P-value = probability of the observed data (or data at least as extreme) under the null hypothesis
Example:
–Suppose we are given a series of coin-tosses
–We feel that a biased coin produced the tosses
–We can ask the following question: what is the probability that a fair coin would produce tosses at least this extreme?
–If this probability is very small then it is unlikely that a fair coin produced the observed tosses.
–In this example the null hypothesis is the fair coin and the alternative hypothesis is the biased coin

Binomial distribution
Bernoulli random variable:
–Two outcomes: success or failure
–Example: coin toss
Binomial random variable:
–Number of successes in a series of independent Bernoulli trials
Example:
–Probability of heads = 0.5
–Given four coin tosses what is the probability of three heads?
–Possible outcomes: HHHT, HHTH, HTHH, THHH
–Each outcome has probability = 0.5^4
–Total probability = 4 * 0.5^4 = 0.25

Binomial distribution
Bernoulli trial: probability of success = p, probability of failure = 1-p
Given n independent Bernoulli trials, what is the probability of k successes?
Binomial applet:
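The standard answer to the question on this slide is the binomial probability mass function, P(k successes) = C(n, k) p^k (1-p)^(n-k), where C(n, k) counts the ways to place k successes among n trials. A minimal Python sketch follows; the function name `binomial_pmf` is an illustrative choice, not taken from the slides.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Three heads in four tosses of a fair coin: 4 * 0.5^4.
print(binomial_pmf(3, 4, 0.5))  # 0.25
```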

Hypothesis testing under Binomial hypothesis
Null hypothesis: fair coin (probability of heads = probability of tails = 0.5)
Data: HHHHTHTHHHHHHHTHTHTH (20 tosses, 15 heads)
P-value under null hypothesis = probability that #heads >= 15
This probability is approximately 0.0207
Since it is below 0.05 we can reject the null hypothesis
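The tail probability on this slide can be checked by summing binomial probabilities from the observed number of heads up to the number of tosses. The sketch below assumes a one-sided test, as the slide's "#heads >= 15" implies; the name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


tosses = "HHHHTHTHHHHHHHTHTHTH"
heads = tosses.count("H")            # observed number of heads
p_value = binomial_upper_tail(heads, len(tosses))
print(heads, len(tosses), f"{p_value:.4f}")  # 15 20 0.0207
```

A two-sided test, which also counts outcomes with 5 or fewer heads, would double this value for a fair coin.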

Null hypothesis for case control contingency table

          #allele1  #allele2
case      a         b
control   c         d

We have two random variables:
–X: disease status
–A: allele type.
Null hypothesis: the two variables are independent of each other (unrelated)
Under independence
–P(X=case and A=1) = P(X=case)P(A=1)
Expected number of cases with allele 1 is
–P(X=case)P(A=1)N
–where N is total observations
P(X=case) = (a+b)/N
P(A=1) = (a+c)/N
What is expected number of controls with allele 2?
Do the probabilities sum to 1?

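Chi-square statistic
\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}
O_i = observed frequency for i-th outcome
E_i = expected frequency for i-th outcome
n = total outcomes
The probability distribution of this statistic is given by the chi-square distribution with n-1 degrees of freedom. For a 2x2 contingency table tested for independence, the expected counts are estimated from the row and column totals, which leaves 1 degree of freedom.
Proof can be found at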

Chi-square
Using chi-square we can test how well observed values fit the expected values computed under the independence hypothesis.
We could also test the data directly under a multinomial or multivariate normal distribution with probabilities given by the independence assumption. This would require the cumulative distribution functions of the multinomial and multivariate normal, which are hard to compute.
Chi-square p-values are easier to compute

Case control

          #allele1  #allele2
case      a         b
control   c         d

E1: expected cases with allele 1
E2: expected cases with allele 2
E3: expected controls with allele 1
E4: expected controls with allele 2
N = a + b + c + d
E1 = ((a+b)/N)((a+c)/N) N = (a+b)(a+c)/N
E2 = (a+b)(b+d)/N
E3 = (c+d)(a+c)/N
E4 = (c+d)(b+d)/N
Now compute chi-square statistic
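The four expected counts follow one pattern: row total times column total, divided by N. A minimal sketch, with `expected_counts` as an illustrative name and the counts 15, 35, 2, 48 taken from the example table in these slides:

```python
def expected_counts(a, b, c, d):
    """Expected cell counts (E1, E2, E3, E4) under independence.

    a, b: observed allele 1 / allele 2 counts in cases
    c, d: observed allele 1 / allele 2 counts in controls
    """
    n = a + b + c + d
    e1 = (a + b) * (a + c) / n  # cases with allele 1
    e2 = (a + b) * (b + d) / n  # cases with allele 2
    e3 = (c + d) * (a + c) / n  # controls with allele 1
    e4 = (c + d) * (b + d) / n  # controls with allele 2
    return e1, e2, e3, e4


e1, e2, e3, e4 = expected_counts(15, 35, 2, 48)
print(e1, e2, e3, e4)
```

The expected counts always sum to N, which is a quick sanity check on the arithmetic.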

Chi-square statistic

          #Allele1  #Allele2
Case      15        35
Control   2         48

Compute expected values and chi-square statistic
Compute chi-square p-value by referring to chi-square distribution
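One way to carry out this exercise with only the Python standard library is sketched below. It assumes 1 degree of freedom, the standard choice for an independence test on a 2x2 table, and uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x/2)). The name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic and p-value (1 degree of freedom) for a 2x2 table.

    a, b: allele 1 / allele 2 counts in cases
    c, d: allele 1 / allele 2 counts in controls
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    # Expected count under independence: row total * column total / N.
    expected = [r * col / n for r, col in zip(row_totals, col_totals)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


statistic, p_value = chi_square_2x2(15, 35, 2, 48)
print(round(statistic, 2))  # about 11.98
print(p_value < 0.001)      # True
```

With a p-value below 0.001, the independence hypothesis would be rejected at the usual 0.05 level for this table. No continuity correction is applied in this sketch.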