Chapter 9: Hypothesis Testing

9.1 Introduction to Hypothesis Testing

Hypothesis testing is a tool for making decisions from data. The usual procedure is:
1. You make a statement about something.
2. You collect sample data relating to the statement.
3. If the sample outcome would be unlikely given that the statement is true, you conclude that the statement is probably not true.

For example, assume you want to know whether a particular coin is fair. Your hypothesis is that it is a fair coin. You toss the coin 20 times and get heads 18 times. Since that outcome is unlikely for a fair coin, you reject the hypothesis that the coin is fair.
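To put a number on "unlikely" in the coin example, the following minimal sketch computes the probability of getting at least 18 heads in 20 tosses of a fair coin. The function name `upper_tail_prob` is illustrative and not part of the course materials.

```python
from math import comb


def upper_tail_prob(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 20 tosses of a fair coin, 18 or more heads.
prob = upper_tail_prob(20, 18)
print(f"P(at least 18 heads in 20 tosses) = {prob:.6f}")
```

The result is about 0.0002, far below the usual 5% threshold, which is why the fair-coin hypothesis is rejected.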
9.2 Steps of Hypothesis Testing

1. Specify the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).
2. What level of significance ($\alpha$)?
3. Which test and test statistic?
4. State the decision rule.
5. Use the sample data to calculate the test statistic.
6. Use the test statistic to make a decision.
7. Interpret the decision in the context of the original question.
Assume that a teacher has a class of 28 students. She wants to use IQ scores to demonstrate that her students are "above average." IQ scores are standardized to have a population mean of 100 and a standard deviation of 16.

Step 1: Specify the Null Hypothesis and the Alternative Hypothesis

The null hypothesis, $H_0$, is the statement we are interested in testing. The word "null" implies "nothing" or "nonexistent." It indicates what would happen by chance, or what would happen if there were no difference or no treatment effect.

The alternative hypothesis, $H_1$, is the statement that we accept if our sample outcome leads us to reject the null hypothesis.
For the classroom example, the hypotheses are as follows:

$H_0: \mu = 100$ (the students have an average IQ)
$H_1: \mu > 100$ (the students have an above-average IQ)

The null hypothesis always includes the equality condition. The teacher is interested in showing that the students are above average, so that condition goes in the alternative hypothesis.

The hypotheses are written with a population parameter on the left and a numeric value on the right:

$H_0: \mu = \mu_0$
$H_1: \mu > \mu_0$

where $\mu_0$ is the hypothesized value, 100 in this example.
Step 2: What Level of Significance?

The level of significance, $\alpha$, is the probability of rejecting the null hypothesis by chance alone. (What does $\alpha = 0.05$ mean?)

Such a rejection can result from sampling error: occasionally, just by chance, we get a sample that would lead us to reject the null hypothesis even though it is true. It is possible to get 18 heads out of 20 with a perfectly fair coin, but the probability is very low. The level of significance is our definition of "unlikely." The traditional definition of unlikely is 5% of the time or less. (In other words, $\alpha$ is the probability of happening to draw relatively extreme data that leads us to reject a correct $H_0$.)

What significance level should you use? If you want to be more certain that you are not falsely rejecting the null hypothesis, you can reduce the significance level to 0.01 or even lower. When in doubt, use the standard 5% level.
Type I and Type II Errors

The significance level is also called the probability of a type I error.
A type I error occurs when you falsely reject the null hypothesis on the basis of sampling error.
A type II error occurs when you fail to reject the null hypothesis when it is false.
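As a rough numerical illustration of the two error types, the sketch below previews the one-tailed z-test of the classroom example (n = 28, standard deviation 16, hypothesized mean 100, significance level 0.05). The true mean of 110 used for the type II error is the alternative value from this chapter's simulation exercise. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 100.0, 16.0, 28, 0.05
true_mean = 110.0  # assumed alternative, used only to illustrate a type II error

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
standard_error = sigma / sqrt(n)
critical_z = std_normal.inv_cdf(1 - alpha)  # upper-tail cutoff for the test

# Type I error rate: P(reject H0 | H0 is true). Equals alpha by construction.
type_1 = 1 - std_normal.cdf(critical_z)

# Type II error rate: P(fail to reject H0 | the true mean is 110).
shift = (true_mean - mu_0) / standard_error
type_2 = std_normal.cdf(critical_z - shift)

print(f"type I error rate  = {type_1:.3f}")
print(f"type II error rate = {type_2:.3f}")
```

Under these assumptions the type II error rate comes out a little under 5%, so a class whose true mean is 110 would be detected most of the time.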
Step 3: Which Test and Test Statistic?

The test statistic is the value calculated from the sample to determine whether to reject the null hypothesis.

If we can assume that the sampling distribution of the mean is normal, we can calculate a z-value from that sampling distribution. For our IQ problem, a sample of 28 is enough for the central limit theorem to apply, especially since we have reason to believe that the population distribution of IQs is close to normal.

For our test of a mean vs. a hypothesized value, a z-test, the test statistic is

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$

where $\bar{x}$ is the sample mean and $n$ is the sample size. To calculate this test statistic, we need to know the population standard deviation, $\sigma$. In this case, we know $\sigma = 16$.
Step 4: State the Decision Rule

We reject the null hypothesis if the test statistic is larger than the critical value corresponding to the significance level chosen in Step 2. For $\alpha = 0.05$, the z-value that cuts off 0.05 in the upper tail of the normal curve is 1.645.

The decision rule is: Reject $H_0$ if $z > 1.645$.
(1) We were testing whether the mean IQ score is greater than 100, a one-tailed test. We can also test:
(2) whether the mean is less than 100, also a one-tailed test;
(3) whether the mean is either greater than or less than 100, a two-tailed test.

Critical values at the 5% level:
5% one-tail upper: $z = 1.645$
5% one-tail lower: $z = -1.645$
5% two-tail: $z = -1.96$ and $z = 1.96$ (2.5% in each tail)
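The critical values for the three kinds of test can be reproduced without a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-tailed upper test: reject H0 if z is greater than this value.
upper_critical = std_normal.inv_cdf(1 - alpha)

# One-tailed lower test: reject H0 if z is less than this value.
lower_critical = std_normal.inv_cdf(alpha)

# Two-tailed test: alpha is split evenly, so reject H0 if |z| exceeds this value.
two_tail_critical = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tail upper: z > {upper_critical:.3f}")
print(f"one-tail lower: z < {lower_critical:.3f}")
print(f"two-tail:     |z| > {two_tail_critical:.3f}")
```

Changing `alpha` to 0.01 gives the stricter cutoffs for the 1% level.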
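Step 5: Use the Sample Data to Calculate the Test Statistic

Let $\bar{x}$ be the mean IQ of the 28 students in the class. With $\mu_0 = 100$, $\sigma = 16$, and $n = 28$, our test statistic becomes

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{\bar{x} - 100}{16 / \sqrt{28}} = 1.85 $$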
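The calculation in this step can be sketched as a small function. The sample mean itself is not reproduced here, so the example works backward from the z-value of 1.85 reported in Step 6 to the sample mean it implies. That implied mean is a derived figure, not a value quoted from the class data, and the function name `z_statistic` is illustrative.

```python
from math import sqrt


def z_statistic(sample_mean: float, hypothesized_mean: float,
                population_sd: float, n: int) -> float:
    """z-value for a test of a sample mean against a hypothesized mean."""
    standard_error = population_sd / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


mu_0, sigma, n = 100.0, 16.0, 28
z_reported = 1.85  # z-value stated in Step 6

# Sample mean implied by the reported z-value (derived, not quoted data).
implied_mean = mu_0 + z_reported * sigma / sqrt(n)

print(f"standard error      = {sigma / sqrt(n):.3f}")
print(f"implied sample mean = {implied_mean:.2f}")
print(f"z                   = {z_statistic(implied_mean, mu_0, sigma, n):.2f}")
```

A class mean only about 5.6 points above 100 is enough to give z = 1.85, because the standard error for 28 students is roughly 3 points.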
Step 6: Use the Test Statistic to Make a Decision

Our z-value of 1.85 is greater than the critical value of 1.645, so we reject the null hypothesis.
Step 7: Interpret the Decision in the Context of the Original Question

To say that a result is "statistically significant" means that it is more than would be expected by chance alone.

The Concept of a p-value

In our example, the probability of a z-value greater than 1.85 is 0.0322. This is called the p-value of the test. The p-value is the probability of getting a sample result at least as extreme as the one observed, by chance alone, if the null hypothesis is actually true.

In our case, the p-value is smaller than the level of significance. The decision rule can also be stated as: Reject $H_0$ if the p-value is less than $\alpha$, where $\alpha$ can be any significance level.

Note: For a two-tail test, double the one-tail p-value before comparing it to $\alpha$.
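The p-value for the classroom example can be checked with the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library and the z-value of 1.85 from Step 6; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.85      # test statistic from Step 6
alpha = 0.05  # significance level from Step 2

# One-tail p-value: area under the standard normal curve to the right of z.
p_one_tail = 1 - NormalDist().cdf(z)

# Two-tail p-value: double the one-tail value, as described in the note.
p_two_tail = 2 * p_one_tail

print(f"one-tail p-value = {p_one_tail:.4f}")
print(f"two-tail p-value = {p_two_tail:.4f}")
print(f"reject H0 (one-tail test): {p_one_tail < alpha}")
```

With these numbers the one-tail p-value is below 0.05, while the doubled two-tail value is above it. The same data would therefore not be significant at the 5% level under a two-tail test.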
Test Statistic versus p-value

Both methods require the calculation of a test statistic. The test statistic approach compares the calculated test statistic to a critical value from a table. The p-value approach calculates the probability of a test statistic at least as extreme as the one calculated and compares it to the significance level, $\alpha$.

If you are using statistical software, you will get a p-value and do not need to look up a critical value in a table.

With the p-value approach, instead of just rejecting the null hypothesis (or not rejecting it), you get a sense of how significant (or not significant) the results are. For example, if a test has a very small p-value, the result is very unlikely to have happened by chance alone.
Learning Activity: Hypothesis Test Calculations

1. Open Htest1.xls!Data.
2. Use Excel to calculate the z-value in the text.
3. Determine the p-value by looking it up in Tables.xls.
4. Replicate the z-value and p-value by using MegaStat | Hypothesis Tests | Mean vs. Hypothesized Value. Click "Summary Data" and select B3:B6 as the input.
5. Use MegaStat | Probability | Normal Distribution to replicate the figures below.
9.A Hypothesis Testing Simulation (Normal random numbers)

Open HtestSim1.xls. This simulates the example we looked at in this chapter.

We rarely reject $H_0$ when the population mean is 100. When we do reject $H_0$, it is an example of a type I error (rejecting on the basis of a lucky sample).

You can change the population mean to 110 and see that the null hypothesis is rejected most of the time. When it is not rejected, it is a type II error (failing to reject when you should have, since the population mean is greater than 100).

Our example is

$H_0: \mu = \mu_0$
$H_1: \mu > \mu_0$

where $\mu_0$ is the hypothesized value, 100 in this example.
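For readers without the spreadsheet, the following is a minimal Python sketch of the same kind of simulation. It is not a reproduction of HtestSim1.xls: the number of trials, the random seed, and the function name `rejection_rate` are assumptions made for illustration. Each trial draws 28 normal IQ scores, computes the z-value, and applies the one-tailed decision rule at the 5% level.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean: float, trials: int = 10_000, n: int = 28,
                   sigma: float = 16.0, mu_0: float = 100.0,
                   alpha: float = 0.05, seed: int = 0) -> float:
    """Fraction of simulated samples for which H0: mu = mu_0 is rejected."""
    rng = random.Random(seed)                   # fixed seed for repeatable runs
    critical = NormalDist().inv_cdf(1 - alpha)  # one-tailed upper critical value
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / standard_error
        if z > critical:
            rejections += 1
    return rejections / trials


# Population mean 100: H0 is true, so every rejection is a type I error.
type_1_rate = rejection_rate(true_mean=100.0)

# Population mean 110: H0 is false, so every non-rejection is a type II error.
power = rejection_rate(true_mean=110.0)

print(f"rejection rate when mean is 100: {type_1_rate:.3f}")
print(f"rejection rate when mean is 110: {power:.3f}")
```

The first rate should land near the 5% significance level and the second near 95%, matching the pattern described above: rare rejections when the mean is 100 and frequent rejections when it is 110.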