Statistical Hypothesis Testing Review

A statistical hypothesis is an assertion concerning one or more populations. In statistics, a hypothesis test is conducted on a pair of mutually exclusive statements:
  H0: null hypothesis
  H1: alternative hypothesis
Example:
  H0: μ = 17
  H1: μ ≠ 17
We sometimes refer to the null hypothesis as the “equals” hypothesis.
Draw the critical region and mark the critical value.
EGR Ch10 Lec2and3 9th edition rev3 EGR 252 F06 Ch. 10 8th edition

Potential errors in decision-making
α = probability of committing a Type I error, i.e., rejecting the null hypothesis given that the null hypothesis is true:
  α = P(reject H0 | H0 is true)
β = probability of committing a Type II error, i.e., failing to reject the null hypothesis given that the alternative is true:
  β = P(fail to reject H0 | H1 is true)
Power of the test = 1 − β, the probability of rejecting the null hypothesis given that the alternative is true:
  Power = P(reject H0 | H1 is true) = 1 − β
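The relationship between α, β, and power can be sketched numerically. The example below is a minimal sketch for a lower-tail z-test; the function name and all numbers (σ = 0.5, n = 25, a true mean of 7.8) are assumptions chosen for illustration and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_power(mu0, mu_true, sigma, n, alpha):
    """P(reject H0) for H0: mu = mu0 vs H1: mu < mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean falls below this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se
    # Probability that the sample mean lands in the rejection region.
    return NormalDist(mu_true, se).cdf(cutoff)

# Hypothetical inputs, for illustration only.
alpha = 0.05
type_i = lower_tail_z_power(8.0, 8.0, 0.5, 25, alpha)  # H0 true: rejection rate equals alpha
power = lower_tail_z_power(8.0, 7.8, 0.5, 25, alpha)   # H1 true: this is the power
beta = 1 - power                                       # Type II error probability
```

When the true mean equals the hypothesized mean, the rejection probability is α by construction; moving the true mean away from μ0 raises the power and lowers β.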

Hypothesis Testing – Approach 1
Approach 1: fixed probability of a Type I error.
1. State the null and alternative hypotheses.
2. Choose a fixed significance level α.
3. Specify the appropriate test statistic and establish the critical region based on α. Draw a graphic representation.
4. Calculate the value of the test statistic based on the sample data.
5. Make a decision to reject H0 or fail to reject H0, based on the location of the test statistic.
6. Make an engineering or scientific conclusion.
Example: recall our question about the amount of coffee in the cup.
1. H0: μ = 8 oz., H1: μ < 8 oz.
2. α = 0.05
3. zα = z0.05 = 1.645, so for this lower-tail test zcrit = −1.645.
5. If zcalc < −1.645, reject H0; if zcalc > −1.645, fail to reject H0.
6. E.g., “coffee in the cup is significantly less than 8 oz.” or “coffee in the cup is not significantly less than 8 oz.”
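The fixed-α steps for the coffee question can be sketched as follows. This is only an illustration: the sample mean, σ, and n are hypothetical values (the slides do not give them), and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Fixed-alpha lower-tail z-test: returns (z_calc, z_crit, reject_H0)."""
    z_crit = NormalDist().inv_cdf(alpha)       # about -1.645 for alpha = 0.05
    z_calc = (xbar - mu0) / (sigma / sqrt(n))  # test statistic from the sample
    return z_calc, z_crit, z_calc < z_crit     # reject only inside the lower tail

# Hypothetical sample: mean 7.85 oz from n = 25 cups, sigma assumed known at 0.5 oz.
z_calc, z_crit, reject = lower_tail_z_test(xbar=7.85, mu0=8.0, sigma=0.5, n=25)
```

With these made-up numbers zcalc is about −1.5, which lies above the critical value, so H0 would not be rejected.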

Hypothesis Testing – Approach 2
Approach 2: significance testing based on the calculated P-value.
1. State the null and alternative hypotheses.
2. Choose an appropriate test statistic.
3. Calculate the value of the test statistic and determine the P-value. Draw a graphic representation.
4. Make a decision to reject H0 or fail to reject H0, based on the P-value.
5. Make an engineering or scientific conclusion.
Example: recall our question about the amount of coffee in the cup.
1. H0: μ = 8 oz., H1: μ < 8 oz.
2. If the variance is known or n is large, use a z-test (assume z in this case).
3. From zcalc, determine the P-value from Table A.3 or as given by statistical software packages.
4. P near 0: H0 is rejected / not plausible (e.g., coffee in the cup is significantly less than 8 oz.). P near 1: H0 is not rejected (coffee in the cup is not significantly less than 8 oz.).
Note: Approach 1 is the classical method. Approach 2 is gaining acceptance, partly because of the increasing availability of statistical software packages. The conclusion based on a P-value requires judgment: the smaller the P-value, the less plausible the null hypothesis.
[Figure: P-value scale running from 0 to 1.00, with ticks at 0.25, 0.50, and 0.75 and the calculated p marked on it]
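For the z-test case, the P-value in step 3 is just the lower-tail area under the standard normal curve. A minimal sketch, assuming a hypothetical zcalc of −1.5 (not a value from the slides):

```python
from statistics import NormalDist

def lower_tail_p_value(z_calc):
    """P-value for H1: mu < mu0, i.e., the area to the left of z_calc."""
    return NormalDist().cdf(z_calc)

# Hypothetical test statistic, for illustration only.
p = lower_tail_p_value(-1.5)
```

Here p is roughly 0.067. Whether that is small enough to reject H0 is the judgment call the note above refers to.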

Example: Single Sample Test of the Mean P-value Approach
A sample of 20 cars driven under varying highway conditions achieved fuel efficiencies as follows:
  Sample mean x̄ = ___ mpg
  Sample std dev s = 2.915 mpg
Test the hypothesis that the population mean equals 35.0 mpg vs. μ < 35.
Step 1: State the hypotheses.
  H0: μ = 35
  H1: μ < 35
Step 2: Determine the appropriate test statistic.
  σ unknown, n = 20. Therefore, use the t distribution.

Single Sample Example (cont.)
Approach 2:
  tcalc = (x̄ − 35)/(s/√n) = (x̄ − 35)/(2.915/√20) = −1.118
Find the probability from a chart or use Excel’s TDIST function:
  P(t ≤ −1.118) = TDIST(1.118, 19, 1) = 0.1386, so p ≈ 0.14
Decision: Fail to reject the null hypothesis.
Conclusion: The mean is not significantly less than 35 mpg.
Draw the graphs.
Note: in Table A.4, pg. 672, the α value associated with t = 1.118 (by symmetry) falls between 0.15 and 0.10. About 13.86% of the area under the curve lies to the left of t = −1.118. Judge that H0 is plausible (fail to reject) and conclude that μ is not significantly less than 35.
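The one-tailed probability that Excel’s TDIST(1.118, 19, 1) reports can be approximated without a table. The sketch below integrates the Student t density numerically with Simpson’s rule; the function names and the step count are assumptions, and a statistics library would normally be used instead.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the area lies right of 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

# By symmetry, P(T <= -1.118) equals P(T >= 1.118).
p_value = t_upper_tail(1.118, 19)
```

The result should agree with the 13.86% quoted above to within rounding.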

Example (concl.) Approach 1: Predetermined significance level (alpha)
Step 1: Use the same hypotheses.
Step 2: Set α = 0.05.
Step 3: Determine the critical value of t that separates the “reject H0” region from the “do not reject H0” region: tα, n−1 = t0.05,19 = 1.729. Since H1 specifies “<”, we declare tcrit = −1.729.
Step 4: Using the equation, we calculate tcalc = −1.118.
Step 5: Decision: tcalc lies above tcrit, so fail to reject H0.
Step 6: Conclusion: The mean is not significantly less than 35 mpg.
This is a one-sided (one-tailed) test. Draw the picture.
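The decision rule in Step 5 depends on which tail H1 points to. A small sketch of that rule, assuming the positive table value is supplied by the reader (the function name and the `tail` labels are illustrative choices):

```python
def reject_h0(t_calc, t_table, tail):
    """Fixed-alpha decision given the positive table value.

    t_table is t_{alpha, df} for a one-tailed test or t_{alpha/2, df} for a two-tailed test.
    """
    if tail == "lower":
        return t_calc < -t_table
    if tail == "upper":
        return t_calc > t_table
    if tail == "two":
        return abs(t_calc) > t_table
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# Fuel-efficiency example: t_calc = -1.118, t_{0.05,19} = 1.729, lower-tail test.
decision = reject_h0(-1.118, 1.729, "lower")
```

For the fuel-efficiency example the function returns False, matching the “fail to reject H0” decision.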

Your turn … same data, different hypotheses
A sample of 20 cars driven under varying highway conditions achieved fuel efficiencies as follows:
  Sample mean x̄ = ___ mpg
  Sample std dev s = 2.915 mpg
Test the hypothesis that the population mean equals 35.0 mpg vs. μ ≠ 35 at an α level of 0.05. Be sure to draw the picture.
  Step 1
  Step 2
  Step 3
  Step 4
  Step 5
  Step 6 (Conclusion will be different.)
Note: t0.025,19 = 2.093

Two-Sample Hypothesis Testing
A professor has designed an experiment to test the effect of reading the textbook before attempting to complete a homework assignment.
Four students who read the textbook before attempting the homework recorded the following times (in hours) to complete the assignment:
  3.1, 2.8, 0.5, ___ hours
Five students who did not read the textbook before attempting the homework recorded the following times to complete the assignment:
  0.9, 1.4, 2.1, ___, ___ hours

Two-Sample Hypothesis Testing
Define the difference in the two means as μ1 − μ2 = d0, where d0 is the hypothesized value of the difference.
What are the hypotheses?
  H0: _______________
  H1: _______________
Note: d0 is often 0 (there is, statistically speaking, no difference in the means):
  H0: μ1 − μ2 = 0
  H1: μ1 − μ2 < 0 (lower-tail test; compare lower to higher)
  or H1: μ1 − μ2 ≠ 0 (two-tailed test)
  or H1: μ1 − μ2 > 0 (upper-tail test; compare higher to lower)

Our Example Using Excel
Reading:    n1 = 4, mean x̄1 = 2.075, s1² = 1.363
No reading: n2 = 5, mean x̄2 = 2.860, s2² = 3.883
If we have reason to believe the population variances are “equal”, we can conduct a t-test assuming equal variances in Minitab or Excel.

t-Test: Two-Sample Assuming Equal Variances
                                Read      DoNotRead
  Mean                          2.075     2.860
  Variance                      1.3625    3.883
  Observations                  4         5
  Pooled Variance               ___
  Hypothesized Mean Difference  ___
  df                            7
  t Stat                        ___
  P(T<=t) one-tail              ___
  t Critical one-tail           ___
  P(T<=t) two-tail              ___
  t Critical two-tail           ___

Note: sp² = (3(1.363) + 4(3.883))/(4 + 5 − 2) ≈ 2.803, sp = 1.674

Your turn …
Recall the three forms of the test:
Lower-tail test (μ1 − μ2 < 0)
  “Fixed α” approach (“Approach 1”) at the α = 0.05 level.
  “p-value” approach (“Approach 2”).
Upper-tail test (μ2 − μ1 > 0)
  “Fixed α” approach at the α = 0.05 level.
  “p-value” approach.
Two-tailed test (μ1 − μ2 ≠ 0)
Note that we have to compare the higher mean minus the lower mean to conduct an upper-tail test.

Our Example – Hand Calculation
Reading:    n1 = 4, mean x̄1 = 2.075, s1² = 1.363
No reading: n2 = 5, mean x̄2 = 2.860, s2² = 3.883
To conduct the test by hand, we must calculate the pooled variance sp²:
  sp² = ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2) = (3(1.363) + 4(3.883))/(4 + 5 − 2) ≈ 2.803
  sp = 1.674
and tcalc = ????
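The pooled-variance arithmetic can be checked with a few lines. This is a minimal sketch using the sample variances from the Excel output (1.3625 and 3.883); the function name is an assumption.

```python
from math import sqrt

def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled variance: each sample variance weighted by its degrees of freedom."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

sp_sq = pooled_variance(1.3625, 4, 3.883, 5)  # pooled variance
sp = sqrt(sp_sq)                              # pooled standard deviation, about 1.674
```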

Lower-tail test (μ1 - μ2 < 0) Why?
Draw the picture:
Approach 1: df = 7, t0.05,7 = 1.895, so for the lower-tail test tcrit = −1.895.
Calculation: tcalc = ((2.075 − 2.860) − 0)/(1.674·√(1/4 + 1/5)) = −0.70
Graphic:
Decision:
Conclusion:
Approach 2: =TDIST(0.7,7,1) = ___
Decision: fail to reject H0
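The lower-tail calculation can be reproduced as follows. This is a sketch using the slide values (means 2.075 and 2.860, sp = 1.674, table value t0.05,7 = 1.895); the function name is illustrative.

```python
from math import sqrt

def pooled_t(mean1, mean2, sp, n1, n2, d0=0.0):
    """Two-sample t statistic assuming equal variances (pooled standard deviation sp)."""
    return ((mean1 - mean2) - d0) / (sp * sqrt(1 / n1 + 1 / n2))

t_calc = pooled_t(2.075, 2.860, 1.674, 4, 5)  # readers minus non-readers
reject = t_calc < -1.895                      # lower-tail test at alpha = 0.05, df = 7
```

Swapping the two means gives the upper-tail statistic with the opposite sign, which is then compared with +1.895.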

Upper-tail test (μ2 – μ1 > 0) Conclusions
Lower-tail test: The data do not support the hypothesis that the mean time to complete the homework is less for students who read the textbook than for those who did not.
or
There is no statistically significant difference in the time required to complete the homework between the people who read the text ahead of time and those who did not.
Upper-tail test (μ2 − μ1 > 0):
  tcalc = ((2.860 − 2.075) − 0)/(1.674·√(1/4 + 1/5)) = 0.70
  Approach 1: df = 7, t0.05,7 = 1.895, so tcrit = 1.895
  Approach 2: =TDIST(0.7,7,1) = ___
  Decision: fail to reject H0
  Conclusion: the data do not support the hypothesis that the mean time to complete the homework is greater for students who do not read the textbook.

Our Example Using Excel
Reading:    n1 = 4, mean x̄1 = 2.075, s1² = 1.363
No reading: n2 = 5, mean x̄2 = 2.860, s2² = 3.883
What if we do not have reason to believe the population variances are “equal”? We can conduct a t-test assuming unequal variances in Minitab or Excel.

t-Test: Two-Sample Assuming Unequal Variances
                                Read      DoNotRead
  Mean                          2.075     2.86
  Variance                      1.3625    3.883
  Observations                  4         5
  Hypothesized Mean Difference  ___
  df                            7
  t Stat                        ___
  P(T<=t) one-tail              ___
  t Critical one-tail           ___
  P(T<=t) two-tail              ___
  t Critical two-tail           ___

Another Example: Low Carb Meals
Suppose we want to test the difference in carbohydrate content between two “low-carb” meals. Random samples of the two meals are tested in the lab and the carbohydrate content per serving (in grams) is recorded, with the following results:
  n1 = 15, x̄1 = ___, s1² = 11
  n2 = 10, x̄2 = ___, s2² = 23
  tcalc = ______________________
  ν = ______________ (using the equation in Table 10.3)
Notes:
  tcalc = (x̄1 − x̄2)/√(11/15 + 23/10) = ___
  df from the equation in Table 10.2, pg. 313:
  ν = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ] ≈ 15
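The degrees-of-freedom value ν ≈ 15 can be verified from the two sample variances and sample sizes alone. The sketch below uses the standard unequal-variance (Welch-Satterthwaite) formula; the function names are assumptions, and `welch_t` is included only to show the matching test statistic, since the sample means are not available here.

```python
from math import sqrt

def welch_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the unequal-variance t-test."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def welch_t(mean1, mean2, s1_sq, n1, s2_sq, n2, d0=0.0):
    """t statistic for two samples without assuming equal variances."""
    return ((mean1 - mean2) - d0) / sqrt(s1_sq / n1 + s2_sq / n2)

nu = welch_df(11, 15, 23, 10)  # about 14.7, which rounds to 15
```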

Example (cont.) What are our options for hypotheses?
Option A (one-tailed):  H0: μ1 − μ2 = 0   H1: μ1 − μ2 > 0
Option B (two-tailed):  H0: μ1 − μ2 = 0   H1: μ1 − μ2 ≠ 0
At an α level of 0.05:
  One-tailed test: t0.05,15 = 1.753
  Two-tailed test: t0.025,15 = 2.131
How are our conclusions affected?
  Our data don’t support a conclusion that the mean carb contents of the two meals are different at an alpha level of 0.05. (What is H1?)
  Our data do support a conclusion that Meal 1 has more carbs, on average, than Meal 2 at an alpha level of 0.05. (What is H1?)
Note:
  1-sided p-value: =TDIST(1.895,15,1) = ___
  2-sided p-value: =TDIST(1.895,15,2) = ___

Special Case: Paired Sample T-Test
Which designs are paired-sample?

  Car   Radial  Belted
  1     **      **
  2     **      **
  3     **      **
  4     **      **
  Radial and belted tires placed on each car. Answer: paired-sample.

  Person  Pre  Post
  1       **   **
  2       **   **
  3       **   **
  Pre- and post-test administered to each person. Answer: paired-sample.

  Student  Test1  Test2
  1        **     **
  2        **     **
  4 scores from test 1, 4 scores from test 2. Answer: maybe. If we have information that the test 1 and test 2 scores can be matched to a particular individual for every subject in the study, it is a paired-sample experiment; otherwise it is a 2-sample experiment.

Shear Strength Example*
An article in the Journal of Strain Analysis compares several methods for predicting the shear strength of steel plate girders. Data for two of these methods, when applied to nine specific girders, are shown in the table on the next slide. We would like to determine if there is any difference, on average, between the two methods.
Procedure: We will conduct a paired-sample t-test at the 0.05 significance level to determine if there is a difference between the two methods.
* Adapted from Montgomery & Runger, Applied Statistics and Probability for Engineers.
Note: difference scores d = 0.125, 0.159, etc.

Shear Strength Example Data
  Girder  Karlsruhe Method  Lehigh Method  Difference (d)
  1       1.186             1.061          0.125
  2       1.151             0.992          0.159
  3       1.322             1.063          0.259
  4       1.339             1.062          0.277
  5       1.200             1.065          0.135
  6       1.402             1.178          0.224
  7       1.365             1.037          0.328
  8       1.537             1.086          0.451
  9       1.559             1.052          0.507

Shear Strength Example Calculations
Hypotheses:
  H0: μD = 0
  H1: μD ≠ 0
t0.025,8 = 2.306. Why 8? (ν = n − 1 = 9 − 1 = 8)
Calculation of difference scores (d), mean and standard deviation, and tcalc:
  d̄ = ___
  sd = ___
  tcalc = (d̄ − d0)/(sd/√n) = (___ − 0)/(___/3) = ___
Note: the difference scores are on the previous page.

What does this mean?
Draw the graphic: t distribution with tcrit = −2.306 (lower boundary) and 2.306 (upper boundary), and tcalc = 6.05.
Decision: reject H0.
Conclusion: The two methods produce different results, on average. Since we subtracted Lehigh scores from Karlsruhe scores, and the resulting difference scores were positive, we have evidence that the Karlsruhe method yields larger strength predictions.
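The paired-sample calculation can be sketched directly from the girder table. This is an illustrative check, not the original worksheet; variable names are assumptions. Computed from the tabulated values, tcalc comes out near 6.1, close to the 6.05 quoted above (the small gap is presumably rounding or a data-entry difference in the source).

```python
from math import sqrt
from statistics import mean, stdev

# Predictions for the nine girders, taken from the data table.
karlsruhe = [1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559]
lehigh = [1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052]

# Paired design: work with the per-girder differences (Karlsruhe minus Lehigh).
d = [k - l for k, l in zip(karlsruhe, lehigh)]
d_bar = mean(d)   # mean difference
s_d = stdev(d)    # sample standard deviation of the differences
t_calc = (d_bar - 0) / (s_d / sqrt(len(d)))

# Two-tailed test at alpha = 0.05 with df = 8: compare |t_calc| with t_{0.025,8} = 2.306.
reject = abs(t_calc) > 2.306
```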

Goodness-of-Fit Tests
Procedures for confirming or refuting hypotheses about the distributions of random variables.
Hypotheses:
  H0: The population follows a particular distribution.
  H1: The population does not follow the distribution.
Example:
  H0: The data come from a normal distribution.
  H1: The data do not come from a normal distribution.

Goodness of Fit Tests: Basic Method
The test statistic is χ².
1. Draw the picture (χ² graph).
2. Determine the critical value χ² with parameters α and ν = k − 1, where k = number of “cells” (Table A.5).
3. Calculate χ² from the sample.
4. Compare χ²calc to χ²crit; show both on the drawing.
5. Make a decision about H0.
6. State your conclusion.
Note: some texts use ν = k − 1 − h, where h is the number of parameters in the distribution being tested (e.g., 1 for Poisson, 2 for normal).
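The statistic in step 3 is the sum over cells of (observed − expected)²/expected. A minimal sketch, assuming hypothetical counts from 120 rolls of a die tested against a uniform distribution (these numbers are invented for illustration and are not from the slides):

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 120 die rolls, H0: each face is equally likely.
observed = [22, 17, 20, 26, 22, 13]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 per cell under H0

chi_sq = chi_square_statistic(observed, expected)
df = len(observed) - 1  # nu = k - 1; no parameters estimated from the data
```

The computed value would then be compared with the table value for the chosen α and ν = 5.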

Tests of Independence
Example: 500 employees were surveyed with respect to pension plan preferences.
Hypotheses:
  H0: Worker Type and Pension Plan are independent.
  H1: Worker Type and Pension Plan are not independent.
1. Develop a contingency table showing the observed values for the 500 people surveyed.

                 Pension Plan
  Worker Type    #1     #2     #3     Total
  Salaried       160    140    40     340
  Hourly         40     60     60     160
  Total          200    200    100    500

Calculation of Expected Values
                 Pension Plan
  Worker Type    #1     #2     #3     Total
  Salaried       160    140    40     340
  Hourly         40     60     60     160
  Total          200    200    100    500

2. Calculate the expected probabilities and expected counts under independence:
  P(#1 ∩ S) = P(#1)·P(S) = (200/500)·(340/500) = 0.272
  E(#1 ∩ S) = 0.272 · 500 = 136
  P(#1 ∩ H) = P(#1)·P(H) = (200/500)·(160/500) = 0.128
  E(#1 ∩ H) = 0.128 · 500 = 64

Expected counts:
             #1     #2     #3
  S (exp.)   136    136    68
  H (exp.)   64     64     32

Calculate the Sample-based Statistic
Calculation of the sample-based statistic:
  χ²calc = Σ (observed − expected)²/expected
         = (160 − 136)²/136 + (140 − 136)²/136 + (40 − 68)²/68 + (40 − 64)²/64 + (60 − 64)²/64 + (60 − 32)²/32
         = 49.63
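The expected counts and the statistic can be reproduced from the observed table. This is a sketch with illustrative function names; the hourly row (40, 60, 60) is taken from the observed values that appear in the statistic calculation.

```python
observed = [
    [160, 140, 40],  # salaried: plans #1, #2, #3
    [40, 60, 60],    # hourly:   plans #1, #2, #3
]

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Chi-squared statistic summed over every cell of the table."""
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )

expected = expected_counts(observed)
chi_sq = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
```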

The Chi-Squared Test of Independence
5. Compare to the critical statistic χ²α,ν, where ν = (r − 1)(c − 1). Note: ν is the symbol for degrees of freedom.
For our example, suppose α = 0.01:
  χ²0.01,2 = ___________ (from Table A.5, pp. 740)
  χ²calc = ___________
Decision:
Conclusion:
Note: χ²calc > χ²crit, so reject the null hypothesis and conclude that worker type and pension plan are not independent.
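For this particular example the table lookup can be checked in closed form, because a chi-squared distribution with exactly 2 degrees of freedom has upper-tail probability exp(−x/2). The sketch below relies on that special case only; it does not generalize to other ν, and the function names are assumptions.

```python
from math import exp, log

def chi2_critical_df2(alpha):
    """Upper-tail critical value for nu = 2 only: solve exp(-x / 2) = alpha."""
    return -2.0 * log(alpha)

def chi2_p_value_df2(x):
    """Upper-tail probability P(chi-squared >= x) for nu = 2 only."""
    return exp(-x / 2.0)

crit = chi2_critical_df2(0.01)      # about 9.21
p_value = chi2_p_value_df2(49.63)   # far below 0.01
reject = 49.63 > crit               # statistic from the worker/pension example
```

For any other number of degrees of freedom a table or a statistics package is needed.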

The Chi-Squared Test in Minitab 15
Chi-Square Test: pp1, pp2, pp3
Expected counts are printed below observed counts.
Chi-Square contributions are printed below expected counts.
[Minitab output table: columns pp1, pp2, pp3, Total; the cell values did not survive extraction]
Test statistic: Chi-Sq calc = ___, DF = 2, P-Value = ___
Reject H0. Conclude that worker type and pension plan are not independent.
