STATISTICS Chapter 8 Hypothesis Testing C.M. Pascual
Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Fundamentals of Hypothesis Testing 8-3 Testing a Claim about a Mean: Large Samples 8-4 Testing a Claim about a Mean: Small Samples 8-5 Testing a Claim about a Proportion 8-6 Testing a Claim about a Standard Deviation
8-1 Overview Definition: In statistics, a hypothesis is a claim or statement about a property of a population. Page 366 of text. Various examples are provided below the definition box.
Rare Event Rule for Inferential Statistics If, under a given assumption, the probability of a particular observed event is exceptionally small, we conclude that the assumption is probably not correct. Example on page 366-367 of text Introduce the word ‘significant’ in regard to hypothesis testing.
8-2 Fundamentals of Hypothesis Testing
Figure 8-1 Central Limit Theorem Example on page 368 of text. This is the drawing associated with that example.
Figure 8-1 Central Limit Theorem: The Expected Distribution of Sample Means Assuming that µ = 98.6. The distribution is centered at $\mu_{\bar{x}} = 98.6$. Likely sample means fall between $\bar{x} = 98.48$ (z = -1.96) and $\bar{x} = 98.72$ (z = 1.96). Sample data: $\bar{x} = 98.20$ (z = -6.64).
Components of a Formal Hypothesis Test page 369 of text
Null Hypothesis: H0 Statement about the value of a population parameter. Must contain the condition of equality: =, ≤, or ≥. Test the null hypothesis directly: reject H0 or fail to reject H0. Give examples of different wording for ≤ and ≥, such as ‘at least’, ‘at most’, ‘no more than’, etc.
Alternative Hypothesis: H1 Must be true if H0 is false. Uses ≠, <, or >; the ‘opposite’ of the null. Give examples of different ways to word ≠, < and >, such as ‘is different from’, ‘fewer than’, ‘more than’, etc.
Note about Forming Your Own Claims (Hypotheses) If you are conducting a study and want to use a hypothesis test to support your claim, the claim must be worded so that it becomes the alternative hypothesis. By examining the flowchart for the Wording of the Final Conclusion, Figure 8-4, page 375, this requirement for support of a statement becomes clear.
Note about Testing the Validity of Someone Else’s Claim Someone else’s claim may become the null hypothesis (because it contains equality), and it sometimes becomes the alternative hypothesis (because it does not contain equality). This is important to emphasize. A claim is not always the null statement. Because of the wording, it may become the alternative. Whichever statement the claim becomes (null or alternative), the other statement will be the ‘opposite’. Some examples should be given for starting with a claim, and then set up the null and alternative. One example is on page 370 and exercises 9 - 16 are appropriate.
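The rule for turning a claim into a null and alternative hypothesis can be sketched in a few lines of Python. This is only an illustration: the function name `set_up_hypotheses`, the lookup tables, and the string format are assumptions made for the sketch, not notation from the text.

```python
# Hypothetical helper: the names and string format are illustrative only.
# Each relation is paired with its 'opposite'.
OPPOSITE = {"=": "!=", "!=": "=", "<=": ">", ">": "<=", ">=": "<", "<": ">="}
CONTAINS_EQUALITY = {"=", "<=", ">="}


def set_up_hypotheses(relation, value, parameter="mu"):
    """Return H0, H1, and which of the two the original claim became."""
    if relation not in OPPOSITE:
        raise ValueError(f"unknown relation: {relation!r}")
    claim = f"{parameter} {relation} {value}"
    other = f"{parameter} {OPPOSITE[relation]} {value}"
    if relation in CONTAINS_EQUALITY:
        # A claim containing equality becomes the null hypothesis.
        return {"H0": claim, "H1": other, "claim_is": "H0"}
    # A claim without equality becomes the alternative hypothesis.
    return {"H0": other, "H1": claim, "claim_is": "H1"}


print(set_up_hypotheses(">", 100))
print(set_up_hypotheses("=", 98.6))
```

A claim such as "the mean is more than 100" has no equality, so it lands in H1 and the null becomes "mu <= 100".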
Test Statistic: a value computed from the sample data that is used in making the decision about the rejection of the null hypothesis. Page 371 of text. For large samples, testing claims about population means: $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}}$ Pages 371-372 of text. Example on page 372 of text.
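A minimal sketch of the large-sample test statistic, z = (sample mean - claimed mean) / (σ / √n). The function name and the numbers in the usage line are illustrative assumptions, not values from the text.

```python
import math


def z_statistic(sample_mean, claimed_mean, sigma, n):
    """Large-sample test statistic for a claim about a population mean."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of sample means
    return (sample_mean - claimed_mean) / standard_error


# Illustrative numbers only: sample mean 103, claimed mean 100, sigma 15, n = 25.
print(z_statistic(103, 100, 15, 25))
```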
Critical Region Set of all values of the test statistic that would cause a rejection of the null hypothesis page 372 of text
Significance Level: denoted by α, the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. Common choices are 0.05, 0.01, and 0.10. This is the same α introduced in Section 6-2, where we defined the degree of confidence for a confidence interval to be the probability 1 - α.
Critical Value: the value or values that separate the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to a rejection of the null hypothesis. Page 372 of text. The critical value depends on the type of test being conducted. The text will start with critical values that are z scores. Later tests will have critical values that are t scores and χ² values. Example on page 373 of text. The critical value (a z score) separates the curve into areas where one would reject the null (the critical region), and where one would fail to reject the null (the rest of the curve).
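As a rough illustration of where a critical z score comes from, the sketch below looks it up from the significance level with Python's standard-library `statistics.NormalDist`. The function name `two_tailed_critical_z` is an assumption for the sketch, and printed z tables may round the last digit differently.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha):
    """Positive critical z score that leaves alpha / 2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(two_tailed_critical_z(alpha), 3))
```

For α = 0.05 this gives the familiar ±1.96.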
Two-tailed, Right-tailed, Left-tailed Tests The tails in a distribution are the extreme regions bounded by critical values. Page 373 of text.
Two-tailed Test H0: µ = 100 H1: µ ≠ 100 α is divided equally between the two tails of the critical region. ≠ means less than or greater than; analyzing what the symbol means helps students realize this is a two-tailed test. Reject H0 in either tail (values that differ significantly from 100); fail to reject H0 for values in between.
Right-tailed Test H0: µ ≤ 100 H1: µ > 100 The > symbol points right, toward the critical region. Fail to reject H0 for values below the critical value; reject H0 in the right tail (values that differ significantly from 100).
Left-tailed Test H0: µ ≥ 100 H1: µ < 100 The < symbol points left, toward the critical region. Reject H0 in the left tail (values that differ significantly from 100); fail to reject H0 for values above the critical value.
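The three kinds of test differ only in where the critical region sits. A minimal sketch, assuming a z test statistic and using the standard library's `statistics.NormalDist`; the function name `rejects_h0` and the tail labels "two", "right", and "left" are illustrative choices, not terms from the text.

```python
from statistics import NormalDist


def rejects_h0(z, alpha, tail):
    """Return True if the z test statistic falls in the critical region."""
    std_normal = NormalDist()
    if tail == "two":
        # alpha is split equally between the two tails.
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        # All of alpha sits in the right tail.
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # All of alpha sits in the left tail.
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")


print(rejects_h0(-1.7, 0.05, "left"))   # critical value is about -1.645
print(rejects_h0(-1.7, 0.05, "two"))    # critical values are about +/-1.96
```

The same test statistic can lead to different decisions depending on the tail, which is why the alternative hypothesis has to be fixed before looking at the data.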
Conclusions in Hypothesis Testing Always test the null hypothesis. The decision is one of two: 1. Reject H0. 2. Fail to reject H0. The final conclusion then needs to be worded correctly; see Figure 8-4. Page 374 of text. Examples at the bottom of the page and the top of page 375.
FIGURE 8-4 Wording of Final Conclusion (page 375 of text)
Start: Does the original claim contain the condition of equality?
Yes (Original claim contains equality and becomes H0): Do you reject H0?
    Yes (Reject H0): “There is sufficient evidence to warrant rejection of the claim that . . . (original claim).” (This is the only case in which the original claim is rejected.)
    No (Fail to reject H0): “There is not sufficient evidence to warrant rejection of the claim that . . . (original claim).”
No (Original claim does not contain equality and becomes H1): Do you reject H0?
    Yes (Reject H0): “The sample data supports the claim that . . . (original claim).” (This is the only case in which the original claim is supported.)
    No (Fail to reject H0): “There is not sufficient evidence to support the claim that . . . (original claim).”
Also found on the Formula and Table insert provided with the text. Many instructors allow students to use this flowchart when taking exams. The conclusion process indicated in this chart will be used with other statistical processes in other chapters. Discuss the two results that support or reject the claim, whereas the other two do not provide enough evidence for support or rejection.
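The four outcomes of the Figure 8-4 flowchart can be written as a small lookup. This is a sketch only: the function name `final_conclusion` and its arguments are assumptions made for illustration; the sentence templates are the ones quoted in the figure.

```python
def final_conclusion(claim, claim_contains_equality, reject_h0):
    """Word the final conclusion following the four branches of Figure 8-4."""
    if claim_contains_equality:
        # The original claim became H0.
        if reject_h0:
            return ("There is sufficient evidence to warrant rejection "
                    f"of the claim that {claim}.")
        return ("There is not sufficient evidence to warrant rejection "
                f"of the claim that {claim}.")
    # The original claim became H1.
    if reject_h0:
        return f"The sample data supports the claim that {claim}."
    return f"There is not sufficient evidence to support the claim that {claim}."


print(final_conclusion("the mean body temperature is 98.6", True, True))
print(final_conclusion("the mean is less than 12 minutes", False, False))
```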
Accept versus Fail to Reject Some texts use “accept the null hypothesis”. We are not proving the null hypothesis; the sample evidence is simply not strong enough to warrant rejection (such as not enough evidence to convict a suspect). Page 374 of text. The term ‘accept’ is somewhat misleading, implying incorrectly that the null has been proven. The phrase ‘fail to reject’ represents the result more correctly.
Type I Error: the mistake of rejecting the null hypothesis when it is true. α (alpha) is used to represent the probability of a type I error. Example: Rejecting a claim that the mean body temperature is 98.6 degrees when the mean really does equal 98.6. Example on page 375 of text.
Type II Error: the mistake of failing to reject the null hypothesis when it is false. β (beta) is used to represent the probability of a type II error. Example: Failing to reject the claim that the mean body temperature is 98.6 degrees when the mean is really different from 98.6.
Table 8-2 Type I and Type II Errors (page 376 of text)
Decision | True state of nature: the null hypothesis is true | True state of nature: the null hypothesis is false
We decide to reject the null hypothesis | Type I error (rejecting a true null hypothesis) | Correct decision
We fail to reject the null hypothesis | Correct decision | Type II error (failing to reject a false null hypothesis)
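One way to see that the probability of a type I error equals the significance level is to simulate many samples from a population where the null hypothesis really is true and count how often a two-tailed z test rejects it. The sketch below is an illustration under stated assumptions: the function name, the population standard deviation of 0.6, the sample size of 30, and the number of trials are all made up for the demonstration.

```python
import math
import random
from statistics import NormalDist, fmean


def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of samples drawn under a TRUE H0 that still reject it."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if abs(z) > critical:
            rejections += 1                         # a type I error
    return rejections / trials


# Assumed values: mu0 = 98.6 is true, sigma = 0.6, n = 30, alpha = 0.05.
print(type_i_error_rate(98.6, 0.6, 30, 0.05, trials=2000))
```

The printed fraction should land near 0.05, with some simulation noise.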
Controlling Type I and Type II Errors For any fixed α, an increase in the sample size n will cause a decrease in β. For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β. To decrease both α and β, increase the sample size. Page 377 in text.
Power of a Hypothesis Test Definition: The power of a hypothesis test is the probability (1 - β) of rejecting a false null hypothesis, which is computed by using a particular significance level α and a particular value of the mean that is an alternative to the value assumed true in the null hypothesis.
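A minimal sketch of a power calculation for a two-tailed z test, assuming a known population standard deviation and a specific alternative mean. The function name and the numbers in the usage lines are illustrative assumptions, not values from the text.

```python
import math
from statistics import NormalDist


def power_two_tailed_z(mu0, mu_alt, sigma, n, alpha):
    """Probability (1 - beta) of rejecting H0: mu = mu0 when mu is really mu_alt."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the test statistic is normal with this mean and sd 1.
    shift = (mu_alt - mu0) / (sigma / math.sqrt(n))
    reject_left = std_normal.cdf(-critical - shift)
    reject_right = 1 - std_normal.cdf(critical - shift)
    return reject_left + reject_right


# Illustrative numbers: H0 says mu = 100, the true mean is 105, sigma = 15.
print(round(power_two_tailed_z(100, 105, 15, 25, 0.05), 3))   # n = 25
print(round(power_two_tailed_z(100, 105, 15, 100, 0.05), 3))  # larger n, more power
print(round(power_two_tailed_z(100, 105, 15, 25, 0.01), 3))   # smaller alpha, less power
```

The three calls mirror the relationships between α, β, and n: a larger sample raises the power (lowers β), and a smaller α lowers the power (raises β) when n is fixed.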
Steps in Hypothesis Testing
Step 1. State the null and alternative hypotheses.
Step 2. Select the level of significance.
Step 3. Determine the critical value and the rejection region(s).
Step 4. State the decision rule.
Step 5. Compute the test statistic.
Step 6. Make a decision: reject or fail to reject the null hypothesis.
Example 1 A manufacturer claims that the average lifetime of his lightbulbs is 3 years, or 36 months. The standard deviation is 8 months. Fifty (50) bulbs are selected, and the average lifetime is found to be 32 months. Should the manufacturer’s statement be rejected at α = 0.01?
Example 1 Solution:
Step 1. State the hypotheses: Ho: µ = 36 months, Ha: µ ≠ 36 months
Step 2. Level of significance: α = 0.01
Step 3. Determine the critical values and rejection region: Z = +/- 2.575 (from Appendix B of z values)
Step 4. State the decision rule: reject the null hypothesis if Zc > 2.575 or Zc < -2.575
Step 5. Compute the test statistic: $z_c = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}} = \frac{32 - 36}{8 / \sqrt{50}} = -3.54$
Step 6. Make a decision. Zc = -3.54 is less than Z = -2.575, so it falls in the rejection region in the left tail. Therefore, reject Ho and conclude that the average lifetime of the lightbulbs is not equal to 36 months.
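As a check on the arithmetic in Example 1, the sketch below recomputes the test statistic from the values stated in the problem (claimed mean 36, standard deviation 8, n = 50, sample mean 32, α = 0.01). The variable names are illustrative, and the critical value is taken from `statistics.NormalDist` rather than Appendix B, so it comes out as 2.576 instead of the table's 2.575.

```python
import math
from statistics import NormalDist

# Values stated in Example 1.
mu0, sigma, n, x_bar, alpha = 36, 8, 50, 32, 0.01

z_c = (x_bar - mu0) / (sigma / math.sqrt(n))    # test statistic
critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
reject = abs(z_c) > critical                    # decision rule

print(round(z_c, 2), round(critical, 3), reject)  # -3.54 2.576 True
```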
Example 2 A test of car braking reaction times for men between 18 and 36 years old has produced a mean and standard deviation of 0.610 second and 0.123 second, respectively. When 40 male drivers of this age group were randomly selected and tested for their braking reaction times, a mean of 0.587 second came out. At α = 0.10, test the claim of the driving instructor that his graduates had faster reaction times.
Example 2 Solution:
Step 1. State the hypotheses: Ho: µ = 0.610 second, Ha: µ < 0.610 second
Step 2. Level of significance: α = 0.10
Step 3. Determine the critical value and rejection region: Z = -1.28 (from Appendix B of z values)
Step 4. State the decision rule: reject the null hypothesis if Zc < -1.28
Step 5. Compute the test statistic: $z_c = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}} = \frac{0.587 - 0.610}{0.123 / \sqrt{40}} = -1.18$
Step 6. Make a decision. Zc = -1.18 is greater than Z = -1.28, so the test statistic falls within the non-critical region. Do not reject Ho. There is not enough evidence to support the instructor’s claim that his graduates had faster reaction times.
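The same kind of check for Example 2, this time as a left-tailed test, using the values stated in the problem (population mean 0.610, standard deviation 0.123, n = 40, sample mean 0.587, α = 0.10). Variable names are illustrative, and the critical value is computed with `statistics.NormalDist` instead of being read from Appendix B.

```python
import math
from statistics import NormalDist

# Values stated in Example 2.
mu0, sigma, n, x_bar, alpha = 0.610, 0.123, 40, 0.587, 0.10

z_c = (x_bar - mu0) / (sigma / math.sqrt(n))  # test statistic
critical = NormalDist().inv_cdf(alpha)        # left-tailed critical z
reject = z_c < critical                       # decision rule

print(round(z_c, 2), round(critical, 3), reject)  # -1.18 -1.282 False
```

Because the test statistic does not fall below the critical value, the null hypothesis is not rejected, so the sample does not support the claim of faster reaction times.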
Test on Small Sample Mean The t-test is a statistical test for the mean of a population and is used when the population is normally distributed, σ is unknown, and n < 30. The formula for the t-test, with degrees of freedom d.f. = n - 1, is $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
Example 3 In order to increase customer service, a muffler repair shop claims its mechanics can replace a muffler in 12 minutes. A time management specialist selected 6 repair jobs and found their mean time to be 11.6 minutes. The standard deviation of the sample was 2.1 minutes. At α = 0.025, is there enough evidence to conclude that the mean time in changing a muffler is less than 12 minutes?
Example 3 Solution:
Step 1. State the hypotheses: Ho: µ = 12, Ha: µ < 12
Step 2. Level of significance: α = 0.025
Step 3. Determine the critical value: since d.f. = 6 - 1 = 5, at α = 0.025 Appendix C gives the t-value -2.571
Step 4. State the decision rule: reject Ho if tc < -2.571
Step 5. Compute the test statistic: $t_c = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{11.6 - 12}{2.1 / \sqrt{6}} = -0.47$
Step 6. Make a decision. Since the test statistic falls within the non-critical region, do not reject Ho. There is not enough evidence to conclude that the mean time in changing a muffler is less than 12 minutes.
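A short check of the Example 3 arithmetic, using the values stated in the problem (claimed mean 12, sample mean 11.6, s = 2.1, n = 6). The Python standard library has no t distribution, so this sketch simply reuses the critical value -2.571 quoted from Appendix C; the variable names are illustrative.

```python
import math

# Values stated in Example 3.
mu0, x_bar, s, n = 12, 11.6, 2.1, 6
t_critical = -2.571  # from Appendix C: d.f. = 5, alpha = 0.025, left tail

t_c = (x_bar - mu0) / (s / math.sqrt(n))  # test statistic
reject = t_c < t_critical                 # decision rule

print(round(t_c, 2), reject)  # -0.47 False
```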
Seat Work (Submit after class) A diet clinic states that there is an average loss of 24 pounds for those who stay on the program for 20 weeks. The standard deviation is 5 pounds. The clinic tries a new diet, reducing salt intake to see whether that strategy will produce a greater weight loss. A group of 40 volunteers loses an average of 16.3 pounds each over 20 weeks. Should the clinic change to the new diet? Use α = 0.05.
Assignment (Submit next meeting) 2. A recent survey stated that households received an average of 37 telephone calls per month. To test the claim, a researcher surveyed 29 households and found that the average number of calls was 34.9. The standard deviation of the sample was 6. At α = 0.05, can the claim be substantiated?