1 STATISTICS Part IIIA. Hypothesis Testing

2 3A.2 Hypothesis Testing
- Hypothesis testing uses techniques similar to those in the last block of study on estimation procedures. There, we made interval estimates of parameters without preconceptions.
- In hypothesis testing, someone is making a claim or challenging a proposition about a parameter’s value.
- Examples:
  - What is the mean weight of a medium-sized box of Cheerios? Build a C.I.!
  - General Mills claims a medium-sized box of Cheerios has a mean weight of 16 ounces. You wish to challenge this claim. Do hypothesis testing!

3 3A.3 Two Opposing Hypotheses
- The null hypothesis, symbolized by H0, is a description of the status quo, or what people have long believed to be true. If the null hypothesis is not rejected, no action needs to be taken.
- The alternative hypothesis, symbolized by HA, is a vehicle for challenging or contradicting the conventional wisdom. Accepting the alternative hypothesis calls for action.

4 3A.4 Examples
- H0: Autonetics meets our quality specifications
  HA: Autonetics doesn’t meet our quality specifications
- H0: Cancer rates for people who live close to high voltage power lines are not greater than for people who don’t.
  HA: Cancer rates for people who live close to high voltage power lines are greater than for people who don’t.

5 3A.5 The key to setting up hypotheses:
- That which is contrary to what is commonly believed or accepted goes into the Alternative.
- That which calls for action goes into the Alternative.
- That which you are trying to demonstrate or that which you, the researcher, believe goes into the Alternative.*
- “Everything else” goes into the Null.
- The “equal” sign always goes with the Null.

6 3A.6 One-sided versus two-sided
- H0: μ = 63.25   HA: μ ≠ 63.25 (a two-sided or two-tailed test)
- H0: μ ≥ 30.0   HA: μ < 30.0 (a one-sided or one-tailed test, also called a left-tailed test)
- H0: μ ≤ 4.0   HA: μ > 4.0 (a one-sided or one-tailed test, also called a right-tailed test)

7 3A.7 Examples
- One of your suppliers claims that he fills and ships orders, on average, within six days of receipt. You believe his average time is longer than six days.
  H0: μ ≤ 6.0   HA: μ > 6.0
- A manufacturer claims that a bottling machine fills each bottle with precisely 2000 cubic centimeters of cola. You want to test his claim.
  H0: μ = 2000   HA: μ ≠ 2000
- Research shows that Web surfers will lose interest in a Web page if downloading takes more than 6 seconds at 56K. You wish to test the effectiveness of a newly designed Web page in regard to its download time.
  H0: μ ≤ 6   HA: μ > 6

8 3A.8 The Nature of Hypothesis Testing
- We formulate two opposing hypotheses.
- We develop sample information.
- We assess whether sample information is sufficiently strong to allow us to reject the null.
- This process is very similar to the American system of jurisprudence...

9 3A.9 Citizens are presumed innocent
- H0: The defendant is innocent
  HA: The defendant is guilty
If, based upon evidence during the trial, the jury cannot find the defendant guilty beyond a reasonable doubt, what do they declare? “We find the defendant not guilty.”
Does the jury accept the null hypothesis?

10 3A.10
- In hypothesis testing, we never accept the null hypothesis! We may fail to reject it but we never accept it.
- We simply state, “Based upon sample information I cannot reject the null hypothesis beyond a reasonable doubt.”
Note: When we reject the Null, we do accept the alternative.

11 3A.11 ‘The Reasonable Doubt’
- Let H0: μ = 400.0
- Let’s assume that this null really is true.
- Let’s take a sample of size 36 from this population.
- What’s the expected value of X̄?
- What if X̄ turns out to be 396.9? 401.4? 407.0? 382.1?
- The crux of hypothesis testing: How far must a sample statistic be from the null hypothesis before we go beyond ‘reasonable doubt’ and reject the null?
- We define beyond ‘reasonable doubt’ by setting up “tails of rejection.”

12 3A.12 “Alpha” (α) defines tails of rejection
[Figure: sampling distribution of X̄ centered at μ = 400 with .025 in each tail of rejection; X̄ = 382.1 and X̄ = 407 marked]
Let α = .05
- If X̄ = 407 we would not reject the null
- If X̄ = 382.1 we would reject the null

13 3A.13 “Alpha” (α) defines tails of rejection
[Figure: same sampling distribution as slide 3A.12]
Let α = .05
What is the probability of rejecting a true null?

14 3A.14 “Alpha” (α) defines tails of rejection
[Figure: same sampling distribution as slide 3A.12]
Let α = .05
Larger values of α make it easier to reject the null
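
To make the link between α and the tails of rejection concrete, here is a minimal sketch that converts α into a Z cutoff using Python's standard library. The helper name `critical_z` is made up for illustration and is not part of the original slides.

```python
from statistics import NormalDist


def critical_z(alpha: float, tails: str = "two") -> float:
    """Return the positive Z cutoff that leaves `alpha` in the tail(s).

    tails="two" splits alpha evenly across both tails;
    tails="one" puts all of alpha in a single tail.
    (Hypothetical helper, for illustration only.)
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    tail_area = alpha / 2 if tails == "two" else alpha
    return std_normal.inv_cdf(1 - tail_area)


# Larger alpha -> smaller cutoff -> easier to reject the null.
for alpha in (0.01, 0.05, 0.10):
    print(alpha, round(critical_z(alpha), 3), round(critical_z(alpha, "one"), 3))
```

For α = .05 this gives the familiar cutoffs of about 1.96 (two-tailed) and 1.645 (one-tailed) used on later slides.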

15 3A.15 TYPE I and TYPE II Errors

16 3A.16 Review of the four possible outcomes of hypothesis testing:
- The Null Hypothesis is true but we reject it. What is the type of error and what is the probability of this occurrence?
  - Type I error. The probability is α and is called the level of significance.
- The Null Hypothesis is true and we fail to reject it. What is the type of error and what is the probability of this occurrence?
  - No error! The probability of failing to reject a true null is 1 - α and is called the level of confidence.
- The Null Hypothesis is false but we fail to reject it. What is the type of error and what is the probability of this occurrence?
  - Type II error. The probability is β.
- The Null Hypothesis is false and we reject it. What is the type of error and what is the probability of this occurrence?
  - No error! The probability of rejecting a false null is 1 - β and is called “the power of the test.”

17 3A.17 The Trade-off Between Type I & Type II Error
[Figure: only the region label “Fail to reject H0” survives]

18 3A.18 Relationship between α and β
- The “teeter totter”: when α goes down, β goes up, and vice versa.
This relationship holds for a given sample size.

19 3A.19 Relationship between α and β
- Why not set α at zero?
  - We would never reject the null whether it be true or false!
  - In this case only, α and β add up to 1.0
- Setting α influences β but knowing α does not reveal β (except when α is 0.0).
- We can reduce the risks of both Type I and Type II errors at the same time by increasing sample size.
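
The teeter-totter can be illustrated numerically. The sketch below computes β for a two-tailed Z test of H0: μ = 400. The true mean of 410 and the standard deviation of 42.5 are assumptions chosen only for illustration (β always depends on a specific alternative value, which the slides do not give), and `type_ii_risk` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_risk(alpha, n, mu0=400.0, mu_true=410.0, sigma=42.5):
    """Beta for a two-tailed Z test of H0: mu = mu0 when the mean is really mu_true.

    mu_true and sigma are illustrative assumptions, not values from the slides.
    """
    se = sigma / sqrt(n)                      # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu0 - z_crit * se                 # sample means between lower and upper
    upper = mu0 + z_crit * se                 # do NOT lead to rejection
    sampling = NormalDist(mu_true, se)        # where the sample mean really comes from
    return sampling.cdf(upper) - sampling.cdf(lower)


# For a fixed n, raising alpha lowers beta.
for alpha in (0.01, 0.05, 0.10):
    print(f"n=36  alpha={alpha:.2f}  beta={type_ii_risk(alpha, 36):.3f}")

# For a fixed alpha, raising n lowers beta.
for n in (36, 100, 400):
    print(f"n={n}  alpha=0.05  beta={type_ii_risk(0.05, n):.3f}")
```

Under these assumptions, β falls as α rises for a fixed n, and β falls as n grows for a fixed α, which is the point of the last bullet above.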

20 3A.20 Where should we set α?
- If the risk of a Type I error is greater than the risk of a Type II error, set α small, .01 for example.
- If the risk of a Type II error is greater, make α relatively large, .10 for example.
- If the risks of the two types of errors are equal, set α at .05 (most common).
S’pose we are testing the average tensile strength of a batch of bolts to see if it meets or exceeds the minimum specified. It is very expensive to adjust the machine if the manufactured strength of these bolts does not meet standard. What level of α would we use if these bolts are used to secure lids on trash cans? If these bolts are used on the space shuttle?

21 3A.21
- The null would be that the bolts meet standard.
- The alternative would be that they don’t meet standard.
- A Type I error would occur if they meet standard and we conclude they don’t. Then we’d go through an expensive machine adjustment process that really wasn’t necessary.
- If these bolts are used on trash cans, we’d want a small alpha to reduce the risk of a Type I error!
- If these bolts are used on the Shuttle, we’d want a small beta, which we would get with a larger alpha!

22 3A.22 Hypothesis Testing
- Hypothesis testing involves five major steps:
  1. Formulating two opposing hypotheses
  2. Setting α (and thereby defining the regions of rejection)
  3. Collecting sample information
  4. Using sample data to compute a test statistic
  5. Making a decision based upon whether the test statistic is in a tail of rejection (defined by α)

23 3A.23 Computing a Test Statistic
- Using z or t values
- Remember that we can convert any normally distributed random variable to z or t values.

24 3A.24 Two-tailed vs. One-tailed Hypothesis Test

25 3A.25 Example: Large sample
- Ever-Glo advertises that their 100-watt bulbs last an average of 400 hours. You wish to challenge that claim. You test the lifetimes of 100 bulbs and get a sample mean of 411 hours and a sample standard deviation of 42.5 hours. What conclusion would you reach using a level of significance of .05?
- Use the five step process to work this problem.

26 3A.26 Five step process:
1. H0: μ = 400.0   HA: μ ≠ 400.0 (Note: a two-tailed test)
2. α = .05 (which defines the rejection region as beyond ±1.96)
3. n = 100, X̄ = 411.0, s = 42.5
4. $z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} = \dfrac{411.0 - 400.0}{42.5/\sqrt{100}} = \dfrac{11.0}{4.25} \approx 2.59$
5. 2.59 > 1.96. Therefore we are in the right tail. We reject the null!
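
The five steps for the Ever-Glo problem can be traced in a few lines of Python. This is a sketch using only the numbers on the slide; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 400, HA: mu != 400 (two-tailed)
mu0 = 400.0
# Step 2: set alpha, which fixes the rejection cutoffs
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
# Step 3: sample information from the slide
n, x_bar, s = 100, 411.0, 42.5
# Step 4: test statistic
standard_error = s / sqrt(n)
z = (x_bar - mu0) / standard_error
# Step 5: decision
reject = abs(z) > z_crit

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}, cutoffs = +/-{z_crit:.2f}, reject H0: {reject}")
```

The standard error of 4.25 is the "standard deviation" referred to on the slides that follow, and the test statistic matches the 2.59 in step 4.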

27 3A.27 “Alpha” (α) defines tails of rejection
[Figure: sampling distribution of X̄ centered at μ0 = 400 with .025 in each tail beyond -1.96 and +1.96; test statistic 2.59 marked in the right tail]

28 3A.28 “Alpha” (α) defines tails of rejection
[Figure: same curve as slide 3A.27]
How do we interpret the 2.59?

29 3A.29 “Alpha” (α) defines tails of rejection
[Figure: same curve as slide 3A.27]
For a normal distribution with a mean of 400 and a standard deviation of 4.25, 411 is 2.59 standard deviations to the right.

30 3A.30 “Alpha” (α) defines tails of rejection
[Figure: same curve as slide 3A.27]
For a normal distribution with a mean of 400 and a standard deviation of 4.25, 411 is 2.59 standard deviations to the right.
Our decision rule: Reject the Null for any test statistic beyond ±1.96 standard deviations.

31 3A.31 “Alpha” (α) defines tails of rejection
[Figure: same curve as slide 3A.27]
We are at risk for committing what kind of error? What is that risk?

32 3A.32 Another Large Sample Problem:
- Honda claims its Pilot SUV obtains an average of at least 22 mpg for highway driving. I’m skeptical. I had an independent research firm drive 50 identical models of the Pilot under identical and controlled highway conditions. The sample mean was 20.9 mpg. The sample standard deviation was 5.1 mpg. What conclusion can be drawn at the .05 level of significance?

33 3A.33 Is Honda likely telling the truth?
H0: μ ≥ 22.0   HA: μ < 22.0 (Note: a left-tailed test)
The tail of rejection (the Z value that puts .05 in the left tail) is defined by -1.645.
The test statistic is $z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} = \dfrac{20.9 - 22.0}{5.1/\sqrt{50}} \approx -1.525$

34 3A.34 A left-tailed test:
[Figure: sampling distribution of X̄ centered at μ0 = 22 with .05 in the left tail beyond -1.645; test statistic -1.525 marked outside the tail]
We fail to reject the null. These results are not statistically significant. We may, however, be committing a ________ error.

35 3A.35 A left-tailed test:
[Figure: same curve as slide 3A.34]
We fail to reject the null. These results are not statistically significant. We may, however, be committing a TYPE II error.
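
The Honda result can be checked the same way. This sketch uses only the figures given on the slides; the one difference from a two-tailed test is that all of α sits in the left tail, so the decision compares the test statistic with a single negative cutoff.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, x_bar, s, alpha = 22.0, 50, 20.9, 5.1, 0.05

z_crit = NormalDist().inv_cdf(alpha)   # left-tail cutoff (negative)
z = (x_bar - mu0) / (s / sqrt(n))      # test statistic
reject = z < z_crit                    # reject only if z falls in the left tail

print(f"z = {z:.3f}, cutoff = {z_crit:.3f}, reject H0: {reject}")
```

The test statistic lands just short of the cutoff, so the null is not rejected, in line with slide 3A.35.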

36 3A.36 Try Problem Set 3

37 3A.37 Another Large Sample Problem:
- A venture capitalist believes that funding the development of a new remote control lawn mower will be a profitable investment if the average price paid for a standard, self-propelled lawn mower is more than $450.00. A random sample of 46 people who recently purchased high-end mowers was obtained. The mean was $465.70. The standard deviation was $76.00.
- Based upon a significance level of .05, should the venture capitalist invest in developing the new lawn mower? (Use MINITAB)
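
For readers without MINITAB, the problem can be approximated in Python. This is a sketch, not MINITAB output: it assumes the hypotheses H0: μ ≤ 450, HA: μ > 450 (the slide does not state them explicitly) and uses the large-sample Z approximation, whereas a t-based procedure with 45 degrees of freedom would give a slightly larger p-value.

```python
from math import sqrt
from statistics import NormalDist

# Assumed hypotheses: H0: mu <= 450, HA: mu > 450 (right-tailed)
mu0, n, x_bar, s, alpha = 450.00, 46, 465.70, 76.00, 0.05

z = (x_bar - mu0) / (s / sqrt(n))      # large-sample test statistic
p_value = 1 - NormalDist().cdf(z)      # area to the right of z
reject = p_value <= alpha

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

Under these assumptions the test statistic is about 1.40, short of the +1.645 cutoff, so the sample does not give strong enough evidence at the .05 level that the mean price exceeds $450.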

38 3A.38 Small sample problems
- Remember for small samples, the parent population must be normally distributed.
- Use the t statistic instead of Z

39 3A.39 Small sample problems
- Remember for small samples, the parent population must be normally distributed.
- Use the t statistic instead of Z
Go into the t table at n-1 degrees of freedom to get the tail of rejection. (Split the alpha for a two-tailed test.)

40 3A.40 Small sample problems
H0: μ ≥ 30.0   HA: μ < 30.0
Suppose n = 14 and the “t” computed from sample data is -1.84. What is the tail of rejection at .05 according to the t table and what is your decision?
With 13 degrees of freedom, -1.771 defines the tail of rejection. We are in the tail. We would reject the null.

41 3A.41 Small sample problems
H0: μ = 32 inches   HA: μ ≠ 32 inches
Suppose n = 26 and the “t” computed from sample data is 1.44. What is the tail of rejection at .05 according to the t table and what is your decision?
With 25 degrees of freedom, ±2.060 defines the tails of rejection. We are not in a tail. We would fail to reject the null.

42 3A.42 The p-Value concept
- S’pose, in a two-tailed test of hypotheses, I obtained a test statistic (Z value) of 1.85.
- Would I reject the Null at a .10 level of significance? The tail of rejection would be defined by 1.645. The Z value of 1.85 would be in a tail of rejection.
- Would I reject the Null at a .05 level of significance? The tail of rejection would now be defined by 1.96. I would not be in a tail of rejection with a Z = 1.85.
- Would I reject the Null at a .0644 level of significance? The tail of rejection would now be defined by 1.85, the Z value itself!

44 3A.44 The p-value is the level of alpha at which the decision changes
[Figure: standard normal curve with .025 in each tail beyond -1.96 and +1.96; 1.645 and the test statistic 1.85 marked]
If Z = 1.85, we would reject the null at .10. We would not reject the null at .05. The decision changes at an alpha of .0644.
As alpha gets smaller and smaller, it becomes harder and harder to reject the null. What is the lowest level of alpha at which we can still reject the null in this example?

45 3A.45 The p-value is the level of alpha at which the decision changes
[Figure: same curve as slide 3A.44]
If Z = 1.85, we would reject the null at .10. We would not reject the null at .05. The decision changes at an alpha of .0644.
As alpha gets smaller and smaller, it becomes harder and harder to reject the null. What is the lowest level of alpha at which we can still reject the null in this example? .0644, and that is the p-value!

46 3A.46 p-Value
- The p-value is the value of α at which the hypothesis test procedure changes conclusions based on a given set of data.
- P-value conceptualization I: It is the lowest value of α for which you will be able to reject H0.
- P-value conceptualization II: It is the probability of getting the sample result that you obtained or something more extreme, given that the null is true.

47 3A.47 Procedure for Finding the p-Value
Obtain the area in the tail(s) at Z
- For a two-tailed test (HA: μ ≠ μ0), p-value = 2 × (area in the tail beyond |Z|)
- For a right-tailed test (HA: μ > μ0), p-value = area to the right of Z
- For a left-tailed test (HA: μ < μ0), p-value = area to the left of Z

48 3A.48 Examples of computing p-value:
1. For H0: μ ≥ 22.0   HA: μ < 22.0 and Z = -1.5, what is the p-value? .0668
2. For H0: μ ≤ 6.0   HA: μ > 6.0 and Z = 4.9, what is the p-value? 0.00

49 3A.49 Examples of computing p-value:
3. For H0: μ = 2000   HA: μ ≠ 2000 and Z = 1.85, what is the p-value? .0644
4. For H0: μ ≤ 34,000   HA: μ > 34,000 and Z = .84, what is the p-value? 0.20
What does the .20 mean in plain English?
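
The procedure on slide 3A.47 and the four examples above can be checked with a short function. This is a sketch; the function name `p_value` and its `tail` argument are illustrative choices, not anything defined in the slides.

```python
from statistics import NormalDist


def p_value(z: float, tail: str) -> float:
    """p-value for a Z test statistic.

    tail: "left"  for HA: mu < mu0,
          "right" for HA: mu > mu0,
          "two"   for HA: mu != mu0.
    """
    phi = NormalDist().cdf             # cumulative area to the left of z
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1 - phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))   # double the area beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")


examples = [(-1.5, "left"), (4.9, "right"), (1.85, "two"), (0.84, "right")]
for z, tail in examples:
    print(f"Z = {z:>5}, {tail:>5}-tailed: p = {p_value(z, tail):.4f}")
```

The computed values agree with the slides to within table rounding. For Z = 1.85 the exact two-tailed area is about .0643, while the slides' .0644 comes from doubling the four-decimal table value .0322.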

50 3A.50 Work this problem:
- Consumer Reports magazine claims that a good negotiator can get at least 7% off the sticker price of a new car. I send out skilled negotiators to 36 randomly selected new car dealers to test this claim. I find they average a negotiated discount of 6.32% with a sample standard deviation of 1.7%.
- What are the hypotheses?
- What is the lowest level of alpha at which I can reject the null?

51 3A.51 Solution
- H0: μ ≥ 7%   HA: μ < 7%
The test statistic is $z = \dfrac{6.32 - 7}{1.7/\sqrt{36}} = -2.40$. The p-value is the area to the left of -2.40, which is .0082
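
A quick check of this solution, as a sketch using only the numbers in the problem statement:

```python
from math import sqrt
from statistics import NormalDist

mu0, n, x_bar, s = 7.0, 36, 6.32, 1.7    # all in percentage points

z = (x_bar - mu0) / (s / sqrt(n))        # left-tailed test statistic
p_value = NormalDist().cdf(z)            # area to the left of z

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The p-value is the lowest α at which the null can be rejected, so the claim would be rejected at .05 and at .01.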

52 3A.52 Interpreting the p-Value
- Classical Approach
  - Reject H0 if p-value ≤ α
  - Fail to reject H0 if p-value > α
- Fisher’s rule of thumb
  - Strongly reject H0 if p-value is small (p < .01)
  - Tend to reject H0 if p is between .01 and .05
  - Tend not to reject H0 if p is between .05 and .1
  - Strongly fail to reject H0 if p is large (p > .1)

53 3A.53 Looking at the p-value in MINITAB
- A federal bankruptcy court wants to see if statistical evidence indicates the average salary of store managers at K-Mart is more than $100,000 per year. The analyst generally uses a 5% level of significance but also wants to look at the p-value.
- Using MINITAB, assess these hypotheses: H0: μ ≤ 100,000   HA: μ > 100,000
What Z value defines the tail of rejection for .05 in the right tail? +1.645
But this is a t problem! The tail of rejection is +1.753

54 3A.54 The K-Mart problem
- The sample mean salary is $102,159. Certainly the point estimate is greater than $100,000.
- But this sample result is not statistically significant.
- The probability of getting this sample result or something more extreme if the true mean salary is $100,000 or less is 22% (the p-value in this particular problem).
- The lowest level of alpha at which we could reject the null is .22
- If we use an alpha of .05, we would fail to reject the null in this problem.
- We don’t “accept” that K-Mart managers on average are paid $100,000 or less, but this sample evidence is not strong enough to reject that proposition.

55 3A.55 Practice comparing α to the p-Value
- For a right-tailed test, alpha is .05 and the p-value is .11
  - Fail to reject null
- For a two-tailed test, alpha is .10 and the p-value is .04
  - Reject null and accept alternative
- For a left-tailed test, alpha is .01 and the p-value is .008
  - Reject the null and accept alternative
- H0: μ ≤ 150   HA: μ > 150 and the Z value has been calculated as 2.2. What is the p-value and what is your decision on these hypotheses?
  The p-value is .0139. If my company uses an alpha of .05, I would reject the null. If my company uses an alpha of .01, I would not reject the null. According to Fisher’s rule of thumb, I would tend to reject the null but not strongly.
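
The two ways of reading a p-value can be written as small decision helpers. This is a sketch: the function names are made up, and how the boundaries .01, .05 and .10 are assigned to Fisher's bands is an assumption, since the slides do not say which band an exact boundary value belongs to.

```python
def classical_decision(p_value: float, alpha: float) -> str:
    """Classical approach: reject H0 exactly when p-value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


def fisher_rule(p_value: float) -> str:
    """Fisher's rule of thumb as listed on slide 3A.52.

    Boundary handling (which band .01, .05 and .10 fall into) is an assumption.
    """
    if p_value < 0.01:
        return "strongly reject H0"
    if p_value <= 0.05:
        return "tend to reject H0"
    if p_value <= 0.10:
        return "tend not to reject H0"
    return "strongly fail to reject H0"


# The practice cases above, as (p-value, alpha) pairs.
for p, a in [(0.11, 0.05), (0.04, 0.10), (0.008, 0.01), (0.0139, 0.05), (0.0139, 0.01)]:
    print(f"p = {p}, alpha = {a}: {classical_decision(p, a)}; Fisher: {fisher_rule(p)}")
```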

56 3A.56 Hypothesis tests on proportions
- Example: H0: π ≤ .40   HA: π > .40
S’pose n = 500 and x = 229; hence the sample proportion p = .458
The p-value is .0041. We can assuredly reject the null.
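
The slide gives only the p-value, so the sketch below fills in the intermediate step under a stated assumption: the usual large-sample Z test for a proportion, with the standard error computed from the hypothesized proportion. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi0, n, x = 0.40, 500, 229

p_hat = x / n                                  # sample proportion
standard_error = sqrt(pi0 * (1 - pi0) / n)     # standard error under H0 (assumed method)
z = (p_hat - pi0) / standard_error             # right-tailed test statistic
p_value = 1 - NormalDist().cdf(z)              # area to the right of z

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

With these inputs the test statistic is about 2.65 and the p-value agrees with the slide's .0041.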

57 3A.57 Summary
- Hypothesis testing is a systematic approach to assessing beliefs.
- That which requires action, or that which we are trying to demonstrate, goes into the Alternative Hypothesis.
- Hypotheses can be one-tailed or two-tailed.
- We can assess sample information and the associated test statistic with one of two approaches: a pre-determined level for α or a p-value.
- Judgments on hypotheses can lead to either a Type I or a Type II error.

