Chapter 8 Hypothesis Testing “Could these observations really have occurred by chance?” Shannon Sprott GEOG /3/2010
Definition
A statistical hypothesis test is a method of making statistical decisions using experimental data. Hypothesis testing is one of the most important tools for applying statistics to real-life problems. Any statistical test has five ingredients:
(a) Null Hypothesis (H0)
(b) Alternative Hypothesis (HA)
(c) Test Statistic
(d) Rejection/Critical Region (Level of Significance)
(e) Conclusion
Steps to Hypothesis Testing
Null Hypothesis (H0): The hypothesis that there is no difference between the procedures. It is always the null hypothesis that is tested.
Alternative Hypothesis (HA): The hypothesis that there is a difference between the procedures.
Steps to Hypothesis Testing cont…
Test Statistic: The random variable X, computed from the sample, whose value is used to arrive at a decision.
Rejection Region: The part of the sample space (the critical region) where the null hypothesis H0 is rejected. The size of this region is determined by α, the probability that the sample point falls in the critical region when H0 is true. α is also known as the level of significance.
Conclusion: If the test statistic falls in the rejection/critical region, H0 is rejected; otherwise we fail to reject H0. Failing to reject H0 does not prove that H0 is true.
Null/Alternative Hypothesis
Let's say we have a hat with two kinds of numbers in it: some of the numbers are drawn from a standard normal distribution (mean μ = 0, variance σ² = 1), and some are drawn from a normal distribution with the same variance (σ² = 1) but an unknown mean. Now let's say we take a number out of the hat. Two hypotheses are possible:
H0: the null hypothesis. The number is from the standard normal distribution with μ = 0.
HA: the alternative hypothesis. The number is not from the standard normal distribution with μ = 0.
Figure: numbers drawn from two different normal distributions (both with σ² = 1, one with μ = 0 and one with unknown mean) are thrown into a hat.
In any testing situation, two kinds of error can occur:
Type I (false positive): we reject the null hypothesis when it is actually true.
Type II (false negative): we fail to reject the null hypothesis when it is actually false.
The probability of committing a Type I error (false positive) is typically denoted α, and the probability of a Type II error (false negative) is denoted β. α is called the significance level of the test; 1 − β is called the power (or sensitivity). Typically, we fix an acceptable level α of Type I error and then look for ways to minimize the level of Type II error, β.
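The two error rates can be estimated by simulation for the hat example. The sketch below is illustrative only: it assumes the unknown mean is 2 and uses a two-sided rule that rejects H0 when |x| > 1.96. Neither value comes from the slides, and the function name is hypothetical.

```python
import random


def simulate_error_rates(mu_alt=2.0, critical_z=1.96, trials=20000, seed=0):
    """Estimate (alpha, beta) for a test on a single number x from the hat.

    Decision rule (an assumption for this sketch): reject H0 if |x| > critical_z.
    mu_alt is an assumed value for the 'unknown' mean of the second distribution.
    """
    rng = random.Random(seed)

    # Type I error: the number really comes from N(0, 1) but we reject H0.
    false_positives = sum(
        abs(rng.gauss(0.0, 1.0)) > critical_z for _ in range(trials)
    )

    # Type II error: the number comes from N(mu_alt, 1) but we fail to reject H0.
    false_negatives = sum(
        abs(rng.gauss(mu_alt, 1.0)) <= critical_z for _ in range(trials)
    )

    return false_positives / trials, false_negatives / trials


alpha_hat, beta_hat = simulate_error_rates()
print(f"estimated alpha = {alpha_hat:.3f}, estimated beta = {beta_hat:.3f}")
```

With this rule the estimated α should sit near 0.05, while β depends strongly on how far the assumed alternative mean is from 0: moving mu_alt further from 0 lowers β without changing α.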
P-Value
The p-value is a probability statement that answers the question: if the null hypothesis were true, what is the probability of observing a test statistic at least as extreme as the one observed? The lower the p-value, the less likely the result under the null hypothesis, and the more "significant" the result in the sense of statistical significance. One often rejects the null hypothesis if the p-value is less than 0.05 or 0.01, corresponding to a 5% or 1% chance, respectively, of an outcome at least that extreme given the null hypothesis.
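For a test statistic that is standard normal under H0 (such as a single number drawn in the hat example), the p-value can be computed from the normal tail probability. This is a minimal sketch using only the Python standard library; the function names and the "greater"/"less"/"two-sided" labels are choices made for this example.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_value(z, alternative="two-sided"):
    """P-value for an observed z statistic under a standard normal null."""
    if alternative == "greater":      # extreme means large positive z
        return normal_sf(z)
    if alternative == "less":         # extreme means large negative z
        return normal_sf(-z)
    if alternative == "two-sided":    # extreme in either direction
        return 2.0 * normal_sf(abs(z))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Example: an observed value of 1.96 sits right at the usual 5% two-sided cutoff.
print(round(p_value(1.96), 3))
```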
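Large Sample Significance Test for Proportions
Step 1: State the hypotheses. H0: p = p0. HA: p > p0, p < p0, or p ≠ p0.
Step 2: The test statistic is the z statistic, Zobs = (p̂ − p0) / √(p0(1 − p0)/n), where p̂ is the sample proportion and n is the sample size.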
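Step 3: The p-value depends on which alternative hypothesis is relevant:
Right-handed (HA: p > p0): P(Z ≥ Zobs)
Left-handed (HA: p < p0): P(Z ≤ Zobs)
Two-sided (HA: p ≠ p0): 2·P(Z ≥ |Zobs|)
Step 4: Set the significance level (e.g. α = 0.01) and reject H0 if the p-value is less than α.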
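The four steps for the large-sample proportion test can be sketched as follows. The data (60 successes in 100 trials, p0 = 0.5) are made up for illustration, and the function name is hypothetical; the standard error uses p0, as is standard for this test.

```python
import math


def proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Large-sample z test for a single proportion.

    Returns (z_obs, p_value). The standard error is computed under H0,
    i.e. with the hypothesized proportion p0.
    """
    p_hat = successes / n                       # sample proportion
    se = math.sqrt(p0 * (1.0 - p0) / n)         # standard error under H0
    z_obs = (p_hat - p0) / se

    upper = 0.5 * math.erfc(z_obs / math.sqrt(2.0))   # P(Z >= z_obs)
    if alternative == "greater":        # right-handed
        p = upper
    elif alternative == "less":         # left-handed
        p = 1.0 - upper
    elif alternative == "two-sided":
        p = 2.0 * min(upper, 1.0 - upper)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z_obs, p


# Hypothetical data: 60 successes out of 100, testing H0: p = 0.5.
z_obs, p = proportion_z_test(60, 100, 0.5)
alpha = 0.01
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"z = {z_obs:.2f}, p-value = {p:.4f}, decision at alpha = {alpha}: {decision}")
```

In this made-up case the two-sided p-value is about 0.046, so H0 would be rejected at α = 0.05 but not at the stricter α = 0.01.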
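Small Sample Test for Population Mean
Step 1: State the hypotheses. H0: μ ≥ μ0. HA: μ < μ0, where μ0 is the hypothesized mean.
Step 2: The test statistic is the t statistic, T = (x̄ − μ0) / SE(x̄), where x̄ is the sample mean and SE(x̄) = s/√n is its standard error (s is the sample standard deviation). For a sample from a normal population with μ = μ0, T follows a t distribution with n − 1 degrees of freedom.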
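A minimal sketch of the small-sample test, using only the Python standard library. The sample values and the hypothesized mean of 50 are made up for illustration. Because the standard library has no t distribution, the decision compares T against a critical value read from a t table (1.833 for 9 degrees of freedom at one-sided α = 0.05).

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)        # sample standard deviation (n - 1 denominator)
    se = s / math.sqrt(n)               # standard error of the mean
    return (x_bar - mu0) / se, n - 1


# Hypothetical sample of 10 measurements; H0: mu >= 50, HA: mu < 50.
sample = [48, 47, 51, 46, 49, 45, 50, 47, 48, 49]
t_stat, df = one_sample_t(sample, 50)

# One-sided critical value from a t table for df = 9, alpha = 0.05.
t_critical = 1.833
reject = t_stat < -t_critical           # left-tailed test
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```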
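Hypothesis Testing Equations
One-sample z-test: The test statistic is a z-score (z) defined by z = (p − P) / σ, where p is the sample proportion, P is the hypothesized proportion, and σ is the standard deviation of the sampling distribution.
Two-sample z-test: The test statistic is a z-score (z) defined by z = (p1 − p2) / SE, where p1 and p2 are the two sample proportions and SE is the standard error of their difference.
One-sample t-test: The test statistic is a t-score (t) defined by t = (x̄ − μ) / SE, where x̄ is the sample mean, μ is the hypothesized mean, and SE is the standard error of the mean.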
Equations Cont…
Two-sample t-test: The test statistic is a t-score (t) defined by t = [(x̄1 − x̄2) − d] / SE, where x̄1 and x̄2 are the two sample means and d is the hypothesized difference between the population means.
Matched-pairs t-test: The test statistic is a t-score (t) defined by t = [(x̄1 − x̄2) − D] / SE = (d̄ − D) / SE, where d̄ is the mean of the paired sample differences and D is the hypothesized mean difference.
Chi-squared goodness-of-fit test: The test statistic is a chi-square random variable (χ²) defined by χ² = Σ [(Oi − Ei)² / Ei], where Oi is the observed count and Ei is the expected count in category i.
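As a worked illustration of the chi-squared goodness-of-fit statistic, the sketch below tests whether a die is fair. The observed counts are made up (60 rolls), and the critical value 11.070 is the tabulated χ² value for 5 degrees of freedom at α = 0.05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die; a fair die expects 10 per face.
observed = [8, 12, 9, 11, 14, 6]
expected = [10] * 6

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1                  # categories minus one
chi2_critical = 11.070                  # table value for df = 5, alpha = 0.05
reject = chi2 > chi2_critical
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject}")
```

Here the statistic is well below the critical value, so these made-up counts give no evidence against a fair die.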