

Regarding a Parameter – Single Mean & Single Proportion


Presentation transcript:

Overview
This is the other part of inferential statistics: hypothesis testing. Hypothesis testing and estimation are two different approaches to two similar problems.
- Estimation is the process of using sample data to estimate the value of a population parameter
- Hypothesis testing is the process of using sample data to test a claim about the value of a population parameter

What is Hypothesis Testing?
We want to test whether a particular claim is believable or not. Hypothesis testing involves two steps:
- Step 1: state what we think is true
- Step 2: quantify how confident we are in our claim

My Plan
- Finding the appropriate hypothesis to test
- Complete hypothesis testing procedure
- Result of the hypothesis test

An example of what we want to quantify
- A car manufacturer claims that a certain model of car achieves 29 miles per gallon
- We test some number of cars
- We calculate the sample mean … it is 27
- Is 27 miles per gallon consistent with the manufacturer’s claim? How confident are we that the manufacturer has significantly overstated the miles per gallon achievable?

● How confident are we that the gas economy is definitely less than 29 miles per gallon?
● We would like to make one of two statements: “We’re pretty sure that the mileage is less than 29 mpg” or “It’s believable that the mileage is equal to 29 mpg”

Definition
● A hypothesis test for an unknown parameter is a test of a specific claim
  - Compare this to a confidence interval, which gives an interval of numbers, not a “believe it” or “don’t believe it” answer
● The level of significance α sets how strong the evidence must be before we reject the claim: it is the probability of rejecting the claim when it is actually true

Null Hypothesis
How do we state our claim? Our claim
- Is the statement to be tested
- Is called the null hypothesis
- Is written as H0 (and is read as “H-naught”)

A Useful Analogy
● In the judicial system, the defendant “is innocent until proven guilty”
  - Thus the defendant is presumed to be innocent
  - The null hypothesis is that the defendant is innocent
  - H0: the defendant is innocent

Alternative Hypothesis
How do we state our counter-claim? Our counter-claim
- Is the opposite of the statement to be tested
- Is called the alternative hypothesis
- Is written as H1 (and is read as “H-one”)

● If the defendant is not innocent, then
  - The defendant is guilty
  - The alternative hypothesis is that the defendant is guilty
  - H1: the defendant is guilty
● The summary of the set-up
  - H0: the defendant is innocent
  - H1: the defendant is guilty

● There are different types of null hypothesis / alternative hypothesis pairs, depending on the claim and the counter-claim
● One type of H0 / H1 pair, called a two-tailed test, tests whether the parameter is equal to, versus not equal to, some value
  - H0: parameter = some value
  - H1: parameter ≠ some value

● An example of a two-tailed test
● A bolt manufacturer claims that the diameter of the bolts averages 10 mm
  - H0: Diameter = 10
  - H1: Diameter ≠ 10
● An alternative hypothesis of “≠ 10” is appropriate since
  - A sample diameter that is too high is a problem
  - A sample diameter that is too low is also a problem
● Thus this is a two-tailed test

● Another type of pair, called a left-tailed test, tests whether the parameter is equal to, versus less than, some value
  - H0: parameter = some value
  - H1: parameter < some value

● An example of a left-tailed test
● A car manufacturer claims that the mpg of a certain model car is at least 29.0
  - H0: MPG = 29.0
  - H1: MPG < 29.0
● An alternative hypothesis of “< 29” is appropriate since
  - An mpg that is too low is a problem
  - An mpg that is too high is not a problem
● Thus this is a left-tailed test

● A third type of pair, called a right-tailed test, tests whether the parameter is equal to, versus greater than, some value
  - H0: parameter = some value
  - H1: parameter > some value

● An example of a right-tailed test
● A bolt manufacturer claims that the defective rate of their product is at most 1 part in 1,000
  - H0: Defect Rate = 0.001
  - H1: Defect Rate > 0.001
● An alternative hypothesis of “> 0.001” is appropriate since
  - A defect rate that is too low is not a problem
  - A defect rate that is too high is a problem
● Thus this is a right-tailed test

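● A comparison of the three types of tests

Test type      H0                       H1
Two-tailed     parameter = some value   parameter ≠ some value
Left-tailed    parameter = some value   parameter < some value
Right-tailed   parameter = some value   parameter > some value

● The null hypothesis
  - We believe that this is true
● The alternative hypothesis
  - Is the opposite of the statement to be tested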

● A manufacturer claims that there are at least two scoops of cranberries in each box of cereal
● What would be a problem?
  - The parameter to be tested is the number of scoops of cranberries in each box of cereal
  - If the sample mean is too low, that is a problem
  - If the sample mean is too high, that is not a problem
● This is a left-tailed test
  - The “bad case” is when there are too few

● A manufacturer claims that there are exactly 500 mg of a medication in each tablet
● What would be a problem?
  - The parameter to be tested is the amount of medication in each tablet
  - If the sample mean is too low, that is a problem
  - If the sample mean is too high, that is a problem too
● This is a two-tailed test
  - A “bad case” is when there is too little
  - A “bad case” is also when there is too much

● A manufacturer claims that there are at most 8 grams of fat per serving
● What would be a problem?
  - The parameter to be tested is the number of grams of fat in each serving
  - If the sample mean is too low, that is not a problem
  - If the sample mean is too high, that is a problem
● This is a right-tailed test
  - The “bad case” is when there are too many

● There are two possible results for a hypothesis test
● If we believe that the null hypothesis could be true, this is called not rejecting the null hypothesis
  - Note that this is only “we believe … could be”
● If we are pretty sure that the null hypothesis is not true, so that the alternative hypothesis is true, this is called rejecting the null hypothesis
  - Note that this is “we are pretty sure that … is”

In comparing our conclusion (do not reject or reject the null hypothesis) with reality, we could either be right or we could be wrong. We are wrong in two cases:
- When we reject (and state that the null hypothesis is false) but the null hypothesis is actually true
- When we do not reject (and state that the null hypothesis could be true) but the null hypothesis is actually false
These would be undesirable errors

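A summary of the errors is

                      H0 is actually true    H0 is actually false
Do not reject H0      Correct                Incorrect
Reject H0             Incorrect              Correct

We see that there are four possibilities … in two of which we are correct and in two of which we are incorrect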

● When we reject (and state that the null hypothesis is false) but the null hypothesis is actually true … this is called a Type I error
● When we do not reject (and state that the null hypothesis could be true) but the null hypothesis is actually false … this is called a Type II error
● In general, Type I errors are considered the more serious of the two

● We can illustrate Type I and Type II errors with our analogy to a criminal trial
● In the judicial system, the defendant “is innocent until proven guilty”
  - Thus the defendant is presumed to be innocent
  - The null hypothesis is that the defendant is innocent
  - H0: the defendant is innocent

● If the defendant is not innocent, then
  - The defendant is guilty
  - The alternative hypothesis is that the defendant is guilty
  - H1: the defendant is guilty
● The summary of the set-up
  - H0: the defendant is innocent
  - H1: the defendant is guilty

● Our possible conclusions
● Reject the null hypothesis
  - Go with the alternative hypothesis
  - H1: the defendant is guilty
  - We vote “guilty”
● Do not reject the null hypothesis
  - Go with the null hypothesis
  - H0: the defendant is innocent
  - We vote “not guilty” (which is not the same as voting innocent!)

● A Type I error
  - Reject the null hypothesis
  - The null hypothesis was actually true
  - We voted “guilty” for an innocent defendant
● A Type II error
  - Do not reject the null hypothesis
  - The alternative hypothesis was actually true
  - We voted “not guilty” for a guilty defendant

● Which error do we try to control?
● A Type I error (sending an innocent person to jail)
  - The evidence was “beyond a reasonable doubt”
  - We must be pretty sure
  - Very bad! We want to minimize this type of error
● A Type II error (letting a guilty person go)
  - The evidence wasn’t “beyond a reasonable doubt”
  - We weren’t sure enough
  - If this happens … well … it’s not as bad as a Type I error (according to the legal system)

“Innocent” versus “Not Guilty”
This is an important concept: innocent is not the same as not guilty
- Innocent: the person did not commit the crime
- Not guilty: there is not enough evidence to convict … the reality is unclear
To not reject the null hypothesis doesn’t mean that the null hypothesis is true, just that there isn’t enough evidence to reject it

Summary so far…
- A hypothesis test tests whether a claim is believable or not, compared to the alternative
- We test the null hypothesis H0 versus the alternative hypothesis H1
- If there is sufficient evidence to conclude that H0 is false, we reject the null hypothesis
- If there is insufficient evidence to conclude that H0 is false, we do not reject the null hypothesis

My Plan
- Finding the appropriate hypothesis to test
- Complete hypothesis testing procedure
- Result of the hypothesis test

We have the outline of a hypothesis test, just not the detailed implementation
- What is the exact procedure to get to a do not reject / reject conclusion?
- How do we calculate Type I and Type II errors?

Our aim is to conduct a hypothesis test about a population parameter. For example:
- A car manufacturer claims that a certain model of car achieves 29 miles per gallon
- We test some number of cars
- We calculate the sample mean … it is 27
- Is 27 miles per gallon consistent with the manufacturer’s claim? How confident are we that the manufacturer has significantly overstated the miles per gallon achievable?

● STEP 1
  - We have a null hypothesis, that the actual mean is equal to a value μ0
  - We have an alternative hypothesis
● STEP 2
  - A criterion that quantifies “unlikely”
  - That is, that the actual mean is unlikely to be equal to μ0
  - A criterion that determines what would be a do not reject and what would be a reject

● STEP 3
  - We run an experiment
  - We collect the data
  - We calculate the sample mean
● MID-STEP: Our Assumptions
  - That the sample is a simple random sample
  - That the sample mean has a normal distribution

● We compare the sample mean x̄ to the hypothesized population mean μ0
● For two-tailed tests with α = 0.05
  - The least likely 5% is the lowest 2.5% and the highest 2.5% (below –1.96 and above +1.96 standard deviations) … –1.96 and +1.96 are the critical values
  - The region outside these values is the rejection region
[Figure: normal curve with critical values ±1.96; the shaded tails are the rejection region]

● For left-tailed tests with α = 0.05
  - The least likely 5% is the lowest 5% (below –1.645 standard deviations) … –1.645 is the critical value
  - The region less than this is the rejection region

● For right-tailed tests with α = 0.05
  - The least likely 5% is the highest 5% (above +1.645 standard deviations) … +1.645 is the critical value
  - The region greater than this is the rejection region

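The difference between the sample mean and the hypothesized mean is x̄ – μ0
We standardize it by dividing by the standard error:
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$
This is called the test statistic
If the test statistic is in the rejection region, we reject the null hypothesis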

● An example of a two-tailed test
● A bolt manufacturer claims that the diameter of the bolts averages 10.0 mm
  - H0: Diameter = 10.0
  - H1: Diameter ≠ 10.0
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of the population is 0.3 mm
  - The sample mean is 10.12 mm
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 10.12 is 0.12 higher than 10.0
  - The standard error is (0.3 / √40) ≈ 0.047
  - The test statistic is 0.12 / 0.047 ≈ 2.53
  - The critical normal value, for α/2 = 0.025, is 1.96
  - 2.53 is more than 1.96
● Our conclusion
  - We reject the null hypothesis
  - We have sufficient evidence that the population mean diameter is not 10.0
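
The arithmetic in the bolt example can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `z_statistic` is a hypothetical helper, and the sample mean of 10.12 mm is taken from the statement that it is 0.12 higher than 10.0.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized distance between the sample mean and the hypothesized mean."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

alpha = 0.05
z = z_statistic(sample_mean=10.12, mu0=10.0, sigma=0.3, n=40)

# Two-tailed test: split alpha between both tails of the standard normal.
critical = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > critical

print(f"z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

The comparison `abs(z) > critical` is the two-tailed rule; the one-tailed examples compare the signed statistic against a single critical value instead.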

● An example of a left-tailed test
● A car manufacturer claims that the mpg of a certain model car is at least 29.0
  - H0: MPG = 29.0
  - H1: MPG < 29.0
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of the population is 0.5
  - The sample mean mpg is 28.89
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 28.89 is 0.11 lower than 29.0
  - The standard error is (0.5 / √40) ≈ 0.079
  - The test statistic is –0.11 / 0.079 ≈ –1.39
  - –1.39 is greater than –1.645, the left-tailed critical value for α = 0.05
● Our conclusion
  - We do not reject the null hypothesis
  - We have insufficient evidence that the population mean mpg is less than 29.0

● An example of a right-tailed test
● A bolt manufacturer claims that the defective rate of their product is at most 1.70 per 1,000
  - H0: Defect Rate = 1.70
  - H1: Defect Rate > 1.70
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of the population is 0.06
  - The sample defect rate is 1.78
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 1.78 is 0.08 higher than 1.70
  - The standard error is (0.06 / √40) ≈ 0.0095
  - The test statistic is 0.08 / 0.0095 ≈ 8.43
  - 8.43 is more than 1.645, the right-tailed critical value for α = 0.05
● Our conclusion
  - We reject the null hypothesis
  - We have sufficient evidence that the population mean rate is more than 1.70

● Two-tailed test
  - The critical values are z α/2 and –z α/2
  - The rejection region is {less than –z α/2} and {greater than z α/2}
● Left-tailed test
  - The critical value is –z α
  - The rejection region is {less than –z α}
● Right-tailed test
  - The critical value is z α
  - The rejection region is {greater than z α}
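
The three rejection regions above can be written as one small function. This is a sketch, not a standard API: the name `in_rejection_region` and the `tail` labels are assumptions made for illustration, and the critical values come from the standard normal distribution in Python's `statistics` module.

```python
from statistics import NormalDist

def in_rejection_region(z, alpha, tail):
    """Return True if the test statistic z falls in the rejection region.

    tail is "two", "left", or "right" (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()
    if tail == "two":
        # Reject below -z_(alpha/2) or above +z_(alpha/2).
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "left":
        # Reject below -z_alpha.
        return z < std_normal.inv_cdf(alpha)
    if tail == "right":
        # Reject above +z_alpha.
        return z > std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")

# The three worked examples, all at alpha = 0.05.
print(in_rejection_region(2.53, 0.05, "two"))    # bolt diameter
print(in_rejection_region(-1.39, 0.05, "left"))  # car mpg
print(in_rejection_region(8.43, 0.05, "right"))  # defect rate
```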


The general picture for a level of significance α
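
One way to see what the level of significance α means in practice is a small simulation: if the null hypothesis really is true and we repeat the two-tailed test many times, we should wrongly reject in roughly a fraction α of the repetitions (a Type I error). The sketch below assumes normally distributed data and reuses the bolt numbers (μ0 = 10.0, σ = 0.3, n = 40) purely for illustration; the function name and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_one_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 holds exactly.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

rate = type_one_error_rate(mu0=10.0, sigma=0.3, n=40, alpha=0.05, trials=5000)
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimate should land near 0.05; it will not match exactly because it is based on a finite number of simulated samples.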

- The P-value is the probability of observing a sample mean that is as extreme as, or more extreme than, the one observed
- The probability is calculated assuming that the null hypothesis is true
- We use the P-value to quantify how unlikely the sample mean is

● Just like in the classical approach, we calculate the test statistic
● We then calculate the P-value, the probability that the sample mean would be this, or more extreme, if the null hypothesis were true
● The two-tailed, left-tailed, and right-tailed calculations are slightly different

● For the two-tailed test, the “unlikely” region consists of values that are too high or too low
● Small P-values correspond to situations where it is unlikely to be this far away

● For the left-tailed test, the “unlikely” region consists of values that are too low
● Small P-values correspond to situations where it is unlikely to be this low

● For the right-tailed test, the “unlikely” region consists of values that are too high
● Small P-values correspond to situations where it is unlikely to be this high

For all three models (two-tailed, left-tailed, right-tailed)
- The larger P-values mean that the difference is not relatively large … that it’s not an unlikely event
- The smaller P-values mean that the difference is relatively large … that it’s an unlikely event

● Larger P-values
  - A P-value of 0.30, for example, means that this value, or more extreme, could happen 30% of the time
  - 30% of the time is not unusual
● Smaller P-values
  - A P-value of 0.01, for example, means that this value, or more extreme, could happen only 1% of the time
  - 1% of the time is unusual

The decision rule, for a significance level α, is
- Do not reject the null hypothesis if the P-value is greater than α
- Reject the null hypothesis if the P-value is less than α
For example, if α = 0.05
- A P-value of 0.30 is likely enough, compared to a criterion of 0.05
- A P-value of 0.01 is unlikely, compared to a criterion of 0.05
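
The P-value calculation and the decision rule can be sketched as follows, assuming the test statistic has a standard normal distribution under the null hypothesis. The function names `p_value` and `decide` and the `tail` labels are hypothetical, chosen only for this illustration.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value of a z test statistic for a "left", "right", or "two" tailed test."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)                    # probability of being this low or lower
    if tail == "right":
        return 1 - cdf(z)                # probability of being this high or higher
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))     # this far away in either direction
    raise ValueError("tail must be 'left', 'right', or 'two'")

def decide(z, tail, alpha=0.05):
    """Apply the decision rule: reject H0 when the P-value is below alpha."""
    return "reject H0" if p_value(z, tail) < alpha else "do not reject H0"

print(round(p_value(2.53, "two"), 2), decide(2.53, "two"))
print(round(p_value(-1.39, "left"), 2), decide(-1.39, "left"))
```

With the test statistics 2.53 (two-tailed) and –1.39 (left-tailed), this gives P-values of about 0.01 and 0.08.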

● An example of a two-tailed test
● A bolt manufacturer claims that the diameter of the bolts averages 10.0 mm
  - H0: Diameter = 10.0
  - H1: Diameter ≠ 10.0
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of the population is 0.3 mm
  - The sample mean is 10.12 mm
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 10.12 is 0.12 higher than 10.0
  - The standard error is (0.3 / √40) ≈ 0.047
  - The test statistic is 2.53
  - The 2-sided P-value of 2.53 is 0.01 < 0.05 = α
● Our conclusion
  - We reject the null hypothesis
  - We have sufficient evidence that the population mean diameter is not 10.0

● An example of a left-tailed test
● A car manufacturer claims that the mpg of a certain model car is at least 29.0
  - H0: MPG = 29.0
  - H1: MPG < 29.0
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of the population is 0.5
  - The sample mean mpg is 28.89
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 28.89 is 0.11 lower than 29.0
  - The standard error is (0.5 / √40) ≈ 0.079
  - The test statistic is –1.39
  - The 1-sided P-value of –1.39 is 0.08 > 0.05 = α
● Our conclusion
  - We do not reject the null hypothesis
  - We have insufficient evidence that the population mean mpg is less than 29.0

● An example of a right-tailed test
● A bolt manufacturer claims that the defective rate of their product is at most 1.70 per 1,000
  - H0: Defect Rate = 1.70
  - H1: Defect Rate > 1.70
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of the population is 0.06
  - The sample defect rate is 1.78
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 1.78 is 0.08 higher than 1.70
  - The standard error is (0.06 / √40) ≈ 0.0095
  - The test statistic is 8.43
  - The 1-sided P-value of 8.43 is extremely small
● Our conclusion
  - We reject the null hypothesis
  - We have sufficient evidence that the population mean rate is more than 1.70

The hypothesis test calculation and the confidence interval calculation are very similar. For a two-tailed test at level α and the matching (1 – α) confidence interval:
- Not rejecting the hypothesis ⇔ μ0 is inside the confidence interval
- Rejecting the hypothesis ⇔ μ0 is outside the confidence interval

● An example of a two-tailed test
● A bolt manufacturer claims that the diameter of the bolts averages 10.0 mm
  - H0: Diameter = 10.0
  - H1: Diameter ≠ 10.0
● We take a sample of size 40
  - (Somehow) We know that the standard deviation of this measurement is 0.3 mm
  - The sample mean is 10.12 mm
  - We’ll use a level of significance α = 0.05

● Do we reject the null hypothesis?
  - 10.12 is 0.12 higher than 10.0
  - The standard error is (0.3 / √40) ≈ 0.047
  - The confidence interval is 10.12 ± 0.09, or 10.03 to 10.21
  - 10.0 is outside (10.03, 10.21)
● Our conclusion
  - We reject the null hypothesis
  - We have sufficient evidence that the population mean diameter is not 10.0
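
The confidence-interval version of the bolt test can be sketched in Python as well. The helper name `z_confidence_interval` is hypothetical, the sample mean of 10.12 mm follows from it being 0.12 above 10.0, and σ is assumed known as in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean when sigma is known."""
    margin = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

mu0 = 10.0
low, high = z_confidence_interval(sample_mean=10.12, sigma=0.3, n=40)

# Two-tailed test via the interval: reject H0 when mu0 falls outside it.
reject = not (low <= mu0 <= high)
print(f"interval = ({low:.2f}, {high:.2f}), reject H0: {reject}")
```

This reaches the same conclusion as comparing the test statistic 2.53 with the critical value 1.96, which is the point of the parallel between the two methods.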

- In the previous section, we assumed that the population standard deviation, σ, was known
- This is not a realistic assumption
- σ not being known is a much more practical assumption

● The parallel between Confidence Intervals and Hypothesis Tests carries over here too
● For Confidence Intervals
  - We estimate the population standard deviation σ by the sample standard deviation s
  - We use the Student’s t-distribution with n – 1 degrees of freedom
● For Hypothesis Tests, we do the same
  - Use s in place of σ
  - Use the Student’s t-distribution in place of the normal

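Thus instead of the test statistic that uses the known σ,
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$
we calculate a test statistic using s:
$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
This is the appropriate test statistic to use when σ is unknown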

We can perform hypothesis tests of a population mean when σ is unknown in the same way as when the population standard deviation is known

● The process for a hypothesis test of a mean, when σ is unknown, is
  - Set up the problem with null and alternative hypotheses
  - Collect the data and compute the sample mean
  - Compute the test statistic

Either the classical or the P-value approach can be applied to determine the significance

● A gasoline manufacturer wants to make sure that the octane in their gasoline is at least 87.0
  - The testing organization takes a sample of size 20
  - The sample standard deviation is 0.5
  - The sample mean octane is 86.94
● Our null and alternative hypotheses
  - H0: Mean octane = 87
  - H1: Mean octane < 87
  - α = 0.05

● Do we reject the null hypothesis?
  - 86.94 is 0.06 lower than 87.0
  - The standard error is (0.5 / √20) ≈ 0.11
  - 0.06 is about 0.54 standard errors below 87.0, so the test statistic is about –0.54
  - The left-tailed critical t value, with 19 degrees of freedom, is –1.729
  - –1.729 < –0.54, so the test statistic is not in the rejection region … it is not unusual
● Our conclusion
  - We do not reject the null hypothesis
  - We have insufficient evidence that the true population mean (mean octane) is less than 87.0
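
A short check of the octane example follows. Python's standard library has no Student's t-distribution, so this sketch takes the critical value –1.729 (19 degrees of freedom, α = 0.05) from the example itself rather than computing it; the sample mean 86.94 is inferred from it being 0.06 lower than 87.0, and `t_statistic` is a hypothetical helper name.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    """Test statistic for a mean when sigma is unknown and estimated by s."""
    return (sample_mean - mu0) / (s / sqrt(n))

# Left-tailed critical value for 19 degrees of freedom at alpha = 0.05,
# copied from the worked example (a t-table value, not computed here).
T_CRITICAL_LEFT = -1.729

t = t_statistic(sample_mean=86.94, mu0=87.0, s=0.5, n=20)
reject = t < T_CRITICAL_LEFT

print(f"t = {t:.2f}, reject H0: {reject}")
```

Using the unrounded standard error gives a statistic of roughly –0.54, far from the rejection region either way.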

● In a sample of size n, with x successes, the best estimate of the population proportion is
$$ \hat{p} = \frac{x}{n} $$
● Similar to tests for means, we have
  - Two-tailed tests
  - Left-tailed tests
  - Right-tailed tests

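If np ≥ 5 and n(1 – p) ≥ 5, then the sample proportion is approximately normally distributed. Just as for confidence intervals, the standard error of the sample proportion is
$$ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} $$

For the standard error of the sample proportion, we use
$$ \sqrt{\frac{p_0(1 - p_0)}{n}} \quad \text{and not} \quad \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} $$

Because we assume that the null hypothesis (p = p0) is true, we should use p0. The test statistic is thus
$$ z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0) / n}} $$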

We can set up hypothesis tests of a population proportion in the same way as hypothesis tests of a population mean

Two-tailed     Left-tailed    Right-tailed
H0: p = p0     H0: p = p0     H0: p = p0
H1: p ≠ p0     H1: p < p0     H1: p > p0

● The process for a hypothesis test of a proportion is
  - Set up the problem with null and alternative hypotheses
  - Collect the data and compute the sample proportion
  - Compute the test statistic

Either the classical or the P-value approach can be applied to determine the significance

● An example
● We believe that 60% of students prefer hamburgers over hot dogs
● A random sample of 200 students found that 102 of them preferred hamburgers
● At α = 0.05, does the data support our belief?
  - The sample size n = 200
  - The hypothesized proportion p0 = 0.60
  - The sample proportion is 102 / 200 = 0.51

● Our hypotheses
  - H0: p = 0.60
  - H1: p ≠ 0.60
● The standard error is √(0.60 × 0.40 / 200) ≈ 0.0346
● The test statistic is (0.51 – 0.60) / 0.0346 ≈ –2.60

- The critical values for α = 0.05 are ±1.96
- The test statistic –2.60 is outside the critical values, so we reject the null hypothesis
- There is significant evidence that the proportion of students who prefer hamburgers is not 60%
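
The hamburger example can be reproduced with a small function. This is a sketch under the normal approximation described above; the name `proportion_z_test` is hypothetical, and the standard error deliberately uses the hypothesized p0 rather than the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, alpha=0.05):
    """Two-tailed z-test of H0: p = p0, given x successes in n trials."""
    # The normal approximation needs enough expected successes and failures.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation is not appropriate")
    p_hat = x / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0, since H0 is assumed true
    z = (p_hat - p0) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, z, abs(z) > critical

p_hat, z, reject = proportion_z_test(x=102, n=200, p0=0.60)
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, reject H0: {reject}")
```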

- We can perform hypothesis tests of proportions in similar ways as hypothesis tests of means: two-tailed, left-tailed, and right-tailed tests
- The normal distribution or the binomial distribution should be used to compute the critical values for this test

1. Dependent samples 2. Independent samples

Introduction
● So far we have covered a variety of models dealing with one population
  - The mean parameter for one population
  - The proportion parameter for one population
  - The standard deviation parameter for one population
● However, there are many real-world applications that need techniques to compare two populations

Examples
● Examples of situations with two populations
  - We want to test whether a certain treatment helps or not … the measurements are the “before” measurement and the “after” measurement
  - We want to test the effectiveness of Drug A versus Drug B … we give 40 patients Drug A and 40 patients Drug B … the measurements are the Drug A and Drug B responses
  - Two precision manufacturers are bidding for our contract … they each have some precision (standard deviation) … are their precisions significantly different?

● In certain cases, the two samples are very closely tied to each other
● A dependent sample is one where each individual in the first sample is directly matched to one individual in the second
● Examples
  - Before and after measurements (a specific person’s before and the same person’s after)
  - Experiments on identical twins (twins matched with each other)

● At the other extreme, the two samples can be completely independent of each other
● An independent sample is one where the individuals selected for one sample have no relationship to the individuals selected for the other
● Examples
  - Fifty samples from one factory compared to fifty samples from another
  - Two hundred patients divided at random into two groups of one hundred

● Dependent samples are often called matched pairs
● Matched pairs is an appropriate term because each observation in sample 1 is matched to exactly one in sample 2
  - The person before ↔ the person after
  - One twin ↔ the other twin
  - An experiment done on a person’s left eye ↔ the same experiment done on that person’s right eye

● The method to analyze matched pairs is to combine the pair into one measurement
  - “Before” and “After” measurements: subtract the before from the after to get a single “change” measurement
  - “Twin 1” and “Twin 2” measurements: subtract the 1 from the 2 to get a single “difference between twins” measurement
  - “Left eye” and “Right eye” measurements: subtract the left from the right to get a single “difference between eyes” measurement

● Specifically, for the before and after example,
  - d1 = person 1’s after – person 1’s before
  - d2 = person 2’s after – person 2’s before
  - d3 = person 3’s after – person 3’s before
● This creates a new random variable d
● We would like to reformulate our problem into a problem involving d (just one variable)

● How do our hypotheses translate?  The two means are equal … the mean difference is zero … μ d = 0  The two means are unequal … the mean difference is non-zero … μ d ≠ 0 ● Thus our hypothesis test is  H 0 : μ d = 0  H 1 : μ d ≠ 0  The standard deviation σ d is unknown ● We know how to do this!

To solve  H 0 : μ d = 0  H 1 : μ d ≠ 0  The standard deviation σ d is unknown This is exactly the test of one mean with the standard deviation being unknown This is exactly the subject we have covered previously

An example … whether our treatment helps or not … “helps” meaning a higher measurement The “Before” and “After” results (table with columns Before, After, Difference)

● Hypotheses  H 0 : μ d = 0 … no difference  H 1 : μ d > 0 … helps  (We’re only interested in whether our treatment makes things better)  α = 0.01 ● Calculations  n = 5  d̄ = 0.88  s d = 0.83

● The test statistic is $t_0 = \dfrac{\bar{d}}{s_d/\sqrt{n}} = 2.36$, where n = 5, $\bar{d}$ = 0.88 and $s_d$ = 0.83 are rounded (the rounded values give about 2.37) ● This has a Student’s t-distribution with 4 degrees of freedom

● Use the Student’s t-distribution with 4 degrees of freedom ● The right-tailed α = 0.01 critical value is 3.75 ● 2.36 is less than 3.75 (the classical method) ● Thus we do not reject the null hypothesis ● There is insufficient evidence to conclude that our method significantly improves the situation ● We could also have used the P-Value method
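As a rough check of the classical method above, the statistic can be recomputed from the summary values. This is a minimal sketch using only the Python standard library; the function name `paired_t_statistic` is an illustrative assumption, and the critical value 3.75 is copied from the slide rather than computed. With the rounded summary values the statistic comes out near 2.37, slightly above the slide's 2.36, presumably because the slide worked from unrounded data.

```python
import math


def paired_t_statistic(d_bar, s_d, n):
    """t statistic for H0: mu_d = 0, from summary statistics of the differences."""
    return d_bar / (s_d / math.sqrt(n))


# Rounded summary values from the example: n = 5, mean difference 0.88, s_d = 0.83
t0 = paired_t_statistic(d_bar=0.88, s_d=0.83, n=5)

# Right-tailed critical value for alpha = 0.01 with 4 degrees of freedom.
# Assumption: taken from the slide (or a t table), not computed here.
t_crit = 3.75

# Classical method: reject H0 only if the statistic exceeds the critical value
reject_h0 = t0 > t_crit
print(f"t0 = {t0:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

The comparison reaches the same decision as the slide: the statistic is below the critical value, so the null hypothesis is not rejected.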

Matched-pairs tests come in the same versions as other hypothesis tests  Two-tailed tests  Left-tailed tests (the alternative hypothesis that the first mean is less than the second)  Right-tailed tests (the alternative hypothesis that the first mean is greater than the second)  Different values of α Each can be solved using the Student’s t-distribution

● A summary of the method  For each matched pair, subtract the first observation from the second  This results in one data item per subject with the data items independent of each other  Test that the mean of these differences is equal to 0 ● Conclusions  Do not reject that μ d = 0  Reject that μ d = 0... Reject that the two populations have the same mean
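The summary above can be sketched end to end starting from raw pairs. This is only an illustration under stated assumptions: the function name `matched_pairs_t` and the before/after values are hypothetical (they are not the data from the example slide), and the sketch stops at the test statistic and degrees of freedom, leaving the critical value or P-value to a t table.

```python
import math
import statistics


def matched_pairs_t(first, second):
    """Reduce matched pairs to differences and test H0: mu_d = 0.

    Returns the differences (second - first), the t statistic,
    and the degrees of freedom (n - 1).
    """
    if len(first) != len(second):
        raise ValueError("matched pairs need equally long samples")
    # Step 1: subtract the first observation from the second, pair by pair
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Step 2: one-sample statistics on the differences
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    # Step 3: one-mean t statistic with hypothesized mean difference 0
    t0 = d_bar / (s_d / math.sqrt(n))
    return diffs, t0, n - 1


# Hypothetical measurements, for illustration only
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [11.0, 12.5, 9.0, 13.0, 14.0]

diffs, t0, df = matched_pairs_t(before, after)
print(diffs, round(t0, 3), df)
```

Note that the sketch would divide by zero if every difference were identical; real data handling would need to guard against that case.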

Independent Samples ● Two samples are independent if the values in one have no relation to the values in the other ● Examples of samples that are not independent  Data from male students versus data from business majors (an overlap in populations)  The mean amount of rain, per day, reported in two weather stations in neighboring towns (likely to rain in both places)

● A typical example of an independent samples test is to test whether a new drug, Drug N, lowers cholesterol levels more than the current drug, Drug C ● A group of 100 patients could be chosen  The group could be divided into two groups of 50 using a random method  If we use a random method (such as a simple random sample of 50 out of the 100 patients), then the two groups would be independent
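The random division described above could be carried out along these lines. This is a minimal sketch: the patient identifiers, the function name `split_at_random`, and the seed are all hypothetical, and `random.sample` stands in for whatever simple-random-sampling procedure is actually used.

```python
import random


def split_at_random(subjects, group_size, seed=None):
    """Draw a simple random sample of `group_size` subjects; the rest form the second group."""
    rng = random.Random(seed)
    # Sample positions without replacement so each subject lands in exactly one group
    chosen = set(rng.sample(range(len(subjects)), group_size))
    group_a = [s for i, s in enumerate(subjects) if i in chosen]
    group_b = [s for i, s in enumerate(subjects) if i not in chosen]
    return group_a, group_b


# Hypothetical identifiers for the 100 patients
patients = [f"patient_{i:03d}" for i in range(1, 101)]

# Assumption: the sampled 50 receive Drug N, the remaining 50 receive Drug C
drug_n_group, drug_c_group = split_at_random(patients, 50, seed=42)
print(len(drug_n_group), len(drug_c_group))
```

Because group membership depends only on the random draw, nothing links an individual in one group to an individual in the other, which is what makes the two samples independent.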

The test of two independent samples is very similar, in process, to the test of a population mean The only major difference is that a different test statistic is used We will discuss the new test statistic through an analogy with the hypothesis test of one mean

For the test of one mean, we have the variables  The hypothesized mean (μ)  The sample size (n)  The sample mean (x̄)  The sample standard deviation (s) We expect that x̄ would be close to μ

In the test of two means, we have two values for each variable – one for each of the two samples  The two hypothesized means μ 1 and μ 2  The two sample sizes n 1 and n 2  The two sample means x̄ 1 and x̄ 2  The two sample standard deviations s 1 and s 2 We expect that x̄ 1 – x̄ 2 would be close to μ 1 – μ 2

For the test of one mean, to measure the deviation from the null hypothesis, it is logical to take $\bar{x} - \mu$, which has a standard deviation of approximately $\dfrac{s}{\sqrt{n}}$

For the test of two means, to measure the deviation from the null hypothesis, it is logical to take $(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)$, which has a standard deviation of approximately $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

For the test of one mean, under certain appropriate conditions, the difference $\bar{x} - \mu$ has mean 0, and the test statistic $t_0 = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ has a Student’s t-distribution with n – 1 degrees of freedom

Thus for the test of two means, under certain appropriate conditions, the difference $(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)$ has mean 0, and the test statistic $t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$ has an approximate Student’s t-distribution

This is Welch’s approximation, which has approximately a Student’s t-distribution The degrees of freedom is taken, conservatively, as the smaller of  n 1 – 1 and  n 2 – 1

● We have two independent samples  The first sample of n = 40 items has a sample mean of 7.8 and a sample standard deviation of 3.3  The second sample of n = 50 items has a sample mean of 11.6 and a sample standard deviation of 2.6  We believe that the mean of the second population is exactly 4.0 larger than the mean of the first population  We use a level of significance α =.05 ● We use an example with μ 1 ≠ μ 2 to better illustrate the test statistic

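● The test statistic is $t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \dfrac{(7.8 - 11.6) - (-4.0)}{\sqrt{3.3^2/40 + 2.6^2/50}} \approx 0.31$ ● This has an approximate Student’s t-distribution with 39 degrees of freedom ● The two-tailed critical value is 2.02, so we do not reject the null hypothesis ● We do not have sufficient evidence to state that the deviation from 4.0 is significant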
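The calculation for this example can be sketched as follows, using only the standard library. The function name `two_sample_t` and its parameter names are illustrative assumptions; the degrees of freedom follow the slide's rule (the smaller of n 1 – 1 and n 2 – 1), and the critical value 2.02 is copied from the slide rather than computed.

```python
import math


def two_sample_t(x1_bar, s1, n1, x2_bar, s2, n2, hypothesized_diff=0.0):
    """Welch-style t statistic for H0: mu1 - mu2 = hypothesized_diff.

    Degrees of freedom use the conservative rule min(n1 - 1, n2 - 1).
    """
    # Approximate standard deviation of (x1_bar - x2_bar)
    std_err = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t0 = ((x1_bar - x2_bar) - hypothesized_diff) / std_err
    df = min(n1 - 1, n2 - 1)
    return t0, df


# Example values: the second mean is hypothesized to be 4.0 larger,
# so mu1 - mu2 = -4.0 under the null hypothesis
t0, df = two_sample_t(7.8, 3.3, 40, 11.6, 2.6, 50, hypothesized_diff=-4.0)

# Two-tailed critical value for alpha = 0.05 with 39 degrees of freedom.
# Assumption: taken from the slide (or a t table), not computed here.
t_crit = 2.02

reject_h0 = abs(t0) > t_crit
print(f"t0 = {t0:.2f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is about 0.31, well inside ±2.02, which matches the slide's decision not to reject the null hypothesis.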

Testing Proportions ● We now compare two proportions, testing whether they are the same or not ● Examples  The proportion of women (population one) who have a certain trait versus the proportion of men (population two) who have that same trait  The proportion of white sheep (population one) who have a certain characteristic versus the proportion of black sheep (population two) who have that same characteristic

● The test of two population proportions is very similar, in process, to the test of one population proportion and the test of two population means ● The only major difference is that a different test statistic is used ● We will discuss the new test statistic through an analogy with the hypothesis test of one proportion

● For the test of one proportion, we had the variables of  The hypothesized population proportion (p 0 )  The sample size (n)  The number with the certain characteristic (x)  The sample proportion (p̂ = x / n) ● We expect that p̂ should be close to p 0

● In the test of two proportions, we have two values for each variable – one for each of the two samples  The two hypothesized proportions (p 1 and p 2 )  The two sample sizes (n 1 and n 2 )  The two numbers with the certain characteristic (x 1 and x 2 )  The two sample proportions (p̂ 1 and p̂ 2 ) ● We expect that p̂ 1 – p̂ 2 should be close to p 1 – p 2

For the test of one proportion, to measure the deviation from the null hypothesis, we took $\hat{p} - p_0$, which has a standard deviation of $\sqrt{\dfrac{p_0(1 - p_0)}{n}}$

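For the test of two proportions, to measure the deviation from the null hypothesis, it is logical to take $(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)$, which has a standard deviation of $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$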

For the test of one proportion, under certain appropriate conditions, the difference $\hat{p} - p_0$ is approximately normal with mean 0, and the test statistic $z_0 = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ has an approximate standard normal distribution

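Thus for the test of two proportions, under certain appropriate conditions, the difference $(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)$ is approximately normal with mean 0, and under the null hypothesis $p_1 = p_2$ the test statistic $z_0 = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, with pooled sample proportion $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$, has an approximate standard normal distribution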

● We have two independent samples  55 out of a random sample of 100 students at one university are commuters  80 out of a random sample of 200 students at another university are commuters  We wish to know of these two proportions are equal  We use a level of significance α =.05 ● When we calculate np & n(1-p) for each of the two samples, results are affirmative

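● With the pooled sample proportion $\hat{p} = 135/300 = 0.45$, the test statistic is $z_0 = \dfrac{0.55 - 0.40}{\sqrt{0.45(0.55)\left(\frac{1}{100} + \frac{1}{200}\right)}} \approx 2.46$ ● The critical values for a two-tailed test using the normal distribution are ± 1.96, thus we reject the null hypothesis ● We conclude that the two proportions are significantly different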
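The commuter example can be reproduced with a short sketch. It assumes the pooled-proportion form of the two-proportion z statistic, which is the usual choice when the null hypothesis is that the proportions are equal (the slide's own formula did not survive, so this is an assumption). The function name `two_proportion_z` is illustrative, and `statistics.NormalDist` from the standard library supplies the normal distribution for the critical value and a P-value.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-tailed P-value for H0: p1 = p2, using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled estimate of the common proportion under H0
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / std_err
    # Two-tailed P-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value


# 55 commuters out of 100 at one university, 80 out of 200 at the other
z0, p_value = two_proportion_z(55, 100, 80, 200)

alpha = 0.05
# Two-tailed critical value, approximately 1.96 for alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z0) > z_crit
print(f"z0 = {z0:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

Both the classical comparison against ±1.96 and the P-value (below 0.05) lead to rejecting the null hypothesis, in line with the slide's conclusion.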