Statistical Genomics
Zhiwu Zhang, Washington State University
Lecture 4: Statistical inference


Administration
- Homework 1, due Feb 3, Wednesday, 3:10PM

Outline
- χ² test on contingency table
- Empirical null distribution
- χ² test on variance
- t test
- Hypothesis test
- Two types of error
- Power

Observed and expected frequency

Observed:
               Transgenic   Non-transgenic   SUM
Herbicide      35           5                40
No herbicide
SUM

Expected:
               Transgenic   Non-transgenic   SUM
Herbicide
No herbicide
SUM

Approximate Distributions
- Poisson distribution: Mean = Var = Expected
- $(\text{Observed} - \text{Expected}) / \sqrt{\text{Expected}} \sim N(0,1)$
- $\sum (\text{Observed} - \text{Expected})^2 / \text{Expected} \sim \chi^2(\text{df})$
- df = number of independent cells

Observed and expected frequency

Observed:
               Transgenic   Non-transgenic   SUM
Herbicide      35           5                40
No herbicide
SUM

Each cell contributes (Observed - Expected)^2 / Expected, with the expected count in the denominator:

$\chi^2 = 49/28 + 49/12 + 49/42 + 49/18 = 9.72$
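The arithmetic on this slide can be checked in R. The following is a minimal sketch, not the lecture's own code. It assumes the observed "No herbicide" row is 35 and 25: those two counts are not in the transcript and are inferred here from the expected counts in the denominators (28, 12, 42, 18) and from every squared difference being 49.

```r
# Observed counts. Row 1 (Herbicide: 35, 5) is taken from the slide.
# Row 2 (No herbicide: 35, 25) is an ASSUMPTION inferred from the denominators.
obs <- matrix(c(35,  5,
                35, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Herbicide", "No herbicide"),
                              c("Transgenic", "Non-transgenic")))

# Expected count of each cell = row total * column total / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Chi-square statistic: sum of (Observed - Expected)^2 / Expected over all cells
chi2 <- sum((obs - expected)^2 / expected)

# In a 2 x 2 table only one cell is free once the margins are fixed
df <- (nrow(obs) - 1) * (ncol(obs) - 1)

# Upper-tail probability of the statistic under chi-square(df)
p_value <- 1 - pchisq(chi2, df)

print(expected)
print(round(chi2, 2))
print(p_value)
```

Under the same assumption, `chisq.test(obs, correct = FALSE)` should report the same statistic; `correct = FALSE` switches off the continuity correction that R applies to 2 x 2 tables by default.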

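Distribution of χ²(1)
- Observed: 9.72
- 99% percentile: 6.97
- P < 1%

k=10000  # assumed number of draws; k is not defined on the slide
par(mfrow=c(2,2),mar = c(3,4,1,1))
x=rchisq(k,1)    # k random draws from chi-square with 1 df
d=density(x)
plot(x)
plot(d)
hist(x)
plot(ecdf(x))
quantile(x,.99)  # empirical 99% percentile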
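The empirical percentile can be set against the theoretical χ²(1) distribution. This is a small sketch for comparison only; the seed and the number of draws are assumptions, not values from the lecture.

```r
set.seed(1)                      # assumed seed, only for reproducibility
k <- 100000                      # assumed number of draws

x <- rchisq(k, df = 1)           # empirical null distribution
q99_empirical <- unname(quantile(x, 0.99))

q99_theory <- qchisq(0.99, df = 1)       # theoretical 99% percentile
p_obs <- 1 - pchisq(9.72, df = 1)        # upper tail beyond the observed 9.72
p_obs_empirical <- mean(x > 9.72)        # same tail estimated from the draws

print(c(empirical = q99_empirical, theory = q99_theory))
print(c(empirical = p_obs_empirical, theory = p_obs))
```

The empirical percentile moves from run to run, more so with few draws, which is one reason it need not match the theoretical value exactly.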

Tests on samples
- A sample has a mean of 103.6 and a variance of 27.82
- The sample has 10 observations
- Q1: What is the probability that the sample was from a normal distribution with variance of 25?
- Q2: What is the probability that the sample was from a normal distribution with mean of 100?

Q1: distribution with variance of 25
Empirical solution:
- Sample ten observations from a normal distribution with variance of 25
- Calculate the observed variance
- Repeat the sampling to get the null distribution of the sample variances
- Find the percentile of the observed variance on the null distribution

Observed 27.82, P>25%
75% percentile 31.6

x=replicate(10000, {
  s=rnorm(10,0,5)  # ten observations from N(0, 25); rnorm takes the SD, 5
  var(s)           # sample variance of this draw
})
> length(x[x>27.82])/10000
[1]
par(mfrow=c(2,2),mar = c(3,4,1,1))
d=density(x)
plot(x)
plot(d)
hist(x)
plot(ecdf(x))
quantile(x,.75)

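Q1: distribution with variance of 25
Theoretical solution: under H0, (n-1) x (sample variance) / (hypothesized variance) follows a χ² distribution with n-1 degrees of freedom.

v=(10-1)*27.82/25
> 1-pchisq(10.026,9)
[1]
Compare this probability with the one from the empirical solution.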
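As a runnable version of the theoretical solution, here is a short sketch that uses the rounded values given in the lecture (n = 10, sample variance 27.82, hypothesized variance 25). The slide evaluates `pchisq` at 10.026, while the rounded variance gives a slightly different statistic, so the result can differ in the later decimals. The variable names are illustrative.

```r
n      <- 10      # sample size
s2     <- 27.82   # observed sample variance (rounded value from the lecture)
sigma2 <- 25      # variance under H0

# Test statistic: (n - 1) * s^2 / sigma^2 ~ chi-square(n - 1) under H0
v <- (n - 1) * s2 / sigma2

# Probability of a sample variance at least this large under H0
p_upper <- 1 - pchisq(v, df = n - 1)

print(c(v = v, p = p_upper))
```

The probability is well above 5%, in line with the empirical result that the observed variance lies below the 75% percentile of the null distribution.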

Q2: distribution with mean of 100
Empirical solution:
- Sample ten observations from N(100, 25)
- Calculate the mean
- Repeat the process 10,000 times
- Obtain the null distribution of the 10,000 means
- Determine the percentile of the testing mean (103.6) on the null distribution

Q2: distribution with mean of 100
Observed 103.6 lies above the 95% percentile (102.6) of the null distribution, so 1%<P<5%

x=replicate(10000, {
  s=rnorm(10,100,5)  # ten observations from N(100, 25); rnorm takes the SD, 5
  mean(s)            # sample mean of this draw
})
> length(x[x>103.6])/10000
[1]
par(mfrow=c(2,2),mar = c(3,4,1,1))
d=density(x)
plot(x)
plot(d)
hist(x)
plot(ecdf(x))
quantile(x,.95)
quantile(x,.99)

t test

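T=(103.6-100)/(5/sqrt(10))  # (observed mean - hypothesized mean) / standard error
P=1-pt(T,9)                 # upper-tail probability, t distribution with 9 df
c(T,P)
Under the 5% threshold, reject the hypothesis that the sample was from a distribution with mean of 100.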
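The slide's calculation uses a standard deviation of 5, the value under the hypothesized N(100, 25). A t test in the usual sense estimates the standard deviation from the sample instead. The sketch below computes both versions from the rounded summary values in the lecture so the two can be compared; it is an illustration, not the lecture's code, and the variable names are made up for it.

```r
n    <- 10       # sample size
xbar <- 103.6    # observed sample mean
mu0  <- 100      # mean under H0
s    <- sqrt(27.82)  # sample SD from the observed variance

# Version on the slide: SD fixed at 5
t_fixed  <- (xbar - mu0) / (5 / sqrt(n))
# Version with the SD estimated from the sample
t_sample <- (xbar - mu0) / (s / sqrt(n))

# One-sided P values from the t distribution with n - 1 degrees of freedom
p_fixed  <- 1 - pt(t_fixed,  df = n - 1)
p_sample <- 1 - pt(t_sample, df = n - 1)

print(c(t_fixed = t_fixed, p_fixed = p_fixed))
print(c(t_sample = t_sample, p_sample = p_sample))
```

Both P values fall below 5%, so the conclusion on the slide holds either way. With the raw observations available, `t.test(x, mu = 100, alternative = "greater")` would carry out the sample-SD version directly.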

F test

Hypothesis test
- Null hypothesis (H0): initial assumption
- Alternative hypothesis (Ha): opposite to the assumption
- Find the probability, under H0, of a result at least as extreme as the one observed (the P value)
- If the probability is too low (e.g. below 5%), reject H0 and accept Ha
- Otherwise, accept H0

Two types of errors and power
- Type I error: reject a true H0 (false positive); its probability is the threshold used, e.g. α=5%
- Type II error: accept a false H0 (false negative); its probability is β
- Power: probability of rejecting a false H0, 1-β
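The two error rates can be estimated by simulation, in the same spirit as the empirical null distributions earlier in the lecture. The sketch below is an illustration under assumed settings: samples of 10 from a normal distribution with SD 5, H0: mean = 100, a two-sided `t.test` at α = 5%, and a true mean of 105 when H0 is false. The effect size, seed, and number of replicates are assumptions chosen for the example.

```r
set.seed(1)      # assumed seed, only for reproducibility
n_rep <- 2000    # assumed number of simulated samples
n     <- 10      # observations per sample
alpha <- 0.05    # threshold for rejecting H0

# Fraction of simulated samples in which H0 (mean = 100) is rejected
reject_rate <- function(true_mean) {
  mean(replicate(n_rep, {
    s <- rnorm(n, true_mean, 5)
    t.test(s, mu = 100)$p.value < alpha
  }))
}

type1 <- reject_rate(100)  # H0 is true: rejections are false positives
power <- reject_rate(105)  # H0 is false: rejections are correct
type2 <- 1 - power         # H0 is false but not rejected: false negatives

print(c(type1 = type1, power = power, type2 = type2))
```

The estimated Type I rate should sit close to α, while the power depends on the assumed true mean, the SD, and the sample size.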

Summary

Test                   H0 is True                  H0 is False
Positive (reject H0)   False positive, Type I: α   Power = 1-β
Negative (accept H0)   Specificity = 1-α           False negative, Type II: β
Sum                    100%                        100%

Highlight
- χ² test on contingency table
- Empirical null distribution
- χ² test on variance
- t test
- Hypothesis test
- Two types of error
- Power