Hypothesis Testing: One Sample Mean or Proportion


Hypothesis Testing: One Sample Mean or Proportion
Introduction, Sections 10.1-10.3
Testing a Mean, Section 10.5

Hypothesis: A Statement about a Population Parameter
The Labor Department makes a statement: mean annual earnings of white men in 1968 were $8,000 (quantitative variable).
A food company claims that its boxes contain 16 ounces of cereal (quantitative variable).
An advertising firm states that 70% of customers like the new package design it is proposing (qualitative variable).

1. Overview
In this section, we decide whether or not to reject a claim concerning a particular population parameter. We use the population mean as our specific example, but what you will learn applies to all hypothesis testing. Imagine that we are doing a labor market study of older men, aged 45-64, in the late 1960s. We postulate that the population mean annual earnings were $8,000. This statement is called the null hypothesis, \(H_0: \mu = 8000\). The alternative hypothesis is a second statement that contradicts the null hypothesis; in this case, \(H_1: \mu \neq 8000\). Notice that the null and alternative hypotheses together cover all possible values of \(\mu\).

We next draw a random sample of size \(n\) from the population. We obtain data from the National Longitudinal Studies that were collected in the late 1960s. These data represent a random sample of men. The sample size is \(n = 1{,}976\). The population standard deviation, \(\sigma = 4191.20\), is known. The standard error of the sampling distribution is \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 94.28\). The sample is so large that we can fall back on the central limit theorem (CLT) and feel confident that the sampling distribution of the mean is approximately normal. PP 2
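The standard error quoted on this slide can be checked directly. The following is a minimal sketch that uses only the two figures given in the slide (the known population standard deviation and the sample size); the variable names are illustrative. The computed value agrees with the slide's 94.28 to within rounding.

```python
import math

sigma = 4191.20  # known population standard deviation (from the slide)
n = 1976         # sample size (from the slide)

# Standard error of the sample mean: sigma divided by the square root of n
standard_error = sigma / math.sqrt(n)

print(f"standard error of the mean: {standard_error:.4f}")
```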

Null and Alternative Hypotheses
The claim being made is the null hypothesis:
\(H_0: \mu = \$8000\) (Labor Dept. statement)
\(H_0: \mu = 16.0\) ounces (desired weight if the machine is working correctly)
\(H_0: \pi = 0.70\) (ad firm)
This statement remains unless a challenger can refute it.
The alternative hypothesis is a second statement that contradicts the null hypothesis.

Null and Alternative Hypotheses
In this case:
\(H_1: \mu \neq \$8000\): annual earnings are not equal to $8000.
\(H_1: \mu \neq 16\) ounces: the process is not in control.
\(H_1: \pi \neq 0.70\): the proportion of customers who like the new design is different from 70%.
I will focus on the mean in developing hypothesis testing.

The Research Challenge
Use sample data to decide on the validity of the null hypothesis.
Consider \(H_0: \mu = \$8000\) (annual earnings of white men in 1968).
Obtain a random sample of men from 1968, using the National Longitudinal Studies.
Sample size: \(n = 1{,}976\).
Known population standard deviation: \(\sigma = 4191.20\).
Standard error of the sampling distribution: \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 94.28\).
The sample is so large that we can fall back on the CLT and be confident that the sampling distribution of the mean is approximately normal.

Rationale
We compare the mean of the sample with the hypothesized population mean, \(\mu = 8000\).
Is the difference between the sample mean and the hypothesized mean (sample mean - 8000) "small" or "large"?
A "small" difference suggests that the sample may be drawn from a population with a mean of 8000. The null hypothesis cannot be rejected.
A "large" difference suggests that it is unlikely that the sample came from a population with a mean of 8000. The sample appears to be drawn from a population with a mean other than 8000, so we reject the null hypothesis.
Such a test result is said to be statistically significant: a difference this large would be unlikely to occur by chance alone.

Rationale
How can we decide whether the difference between the sample mean and the hypothesized mean (sample mean - 8000) is "small" or "large"?
Consider how the distribution of sample means would look if the null hypothesis were true.
"Under the null hypothesis" means that we consider the claim to be correct for the moment.

Clearly we need to be more specific about when a sample mean is close enough to 8000 to support our hypothesis and when it is so far from 8000 that it does not. To do so, consider the sampling distribution of the sample mean under the null hypothesis.

Under the Null Hypothesis
We know what to expect if the null hypothesis \(H_0\) is true: 95.44% of the sample means (\(\bar{x}\)) lie within 2 standard errors, \(2\sigma_{\bar{x}}\), of the population mean.
[Figure: sampling distribution of sample means, normally distributed.]
Our knowledge of the sampling distribution of sample means allows us to say consistently when a difference is "small" or "large".
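The 95.44% figure can be reproduced from the standard normal distribution. This sketch assumes Python 3.8 or later for `statistics.NormalDist`, and it reuses the hypothesized mean of 8000 and the standard error of 94.28 from the earlier slides; the variable names are illustrative. The exact area is slightly above the table-based 95.44% because printed tables round each half to four decimals.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of sample means within 2 standard errors of the population mean
share_within_2 = std_normal.cdf(2) - std_normal.cdf(-2)

# The same band expressed in dollars, under H0: mu = 8000
mu_0 = 8000.0
std_error = 94.28
lower = mu_0 - 2 * std_error
upper = mu_0 + 2 * std_error

print(f"share within 2 standard errors: {share_within_2:.4%}")
print(f"band of sample means: {lower:.2f} to {upper:.2f}")
```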

Statistical Decision Approaches
Critical value: establish boundaries beyond which it is unlikely to observe the sample mean if the null hypothesis is true.
P-value: express the probability of observing our sample mean, or one more extreme, if the null hypothesis is true.
The two approaches yield the same results.
Another approach is confidence intervals.
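To illustrate why the critical-value and p-value approaches agree for a two-sided Z test, here is a small sketch. The function names and the list of trial Z statistics are hypothetical, chosen only for illustration; `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_sided_p_value(z_stat: float) -> float:
    """Probability of a Z value at least as extreme as z_stat, in either tail."""
    return 2 * (1 - STD_NORMAL.cdf(abs(z_stat)))

def reject_by_critical_value(z_stat: float, alpha: float) -> bool:
    """Reject when the statistic falls beyond the two-sided critical value."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return abs(z_stat) > z_crit

def reject_by_p_value(z_stat: float, alpha: float) -> bool:
    """Reject when the p-value is smaller than the significance level."""
    return two_sided_p_value(z_stat) < alpha

# Illustrative statistics and significance levels (not taken from the slides)
trial_z_values = [-3.0, -1.97, -0.5, 0.0, 1.5, 2.2, 2.6]
agreement = [
    reject_by_critical_value(z, a) == reject_by_p_value(z, a)
    for z in trial_z_values
    for a in (0.05, 0.01)
]
print("approaches agree on every case:", all(agreement))
```

Both rules compare the same tail area against the same threshold, so they can only differ in how the result is reported.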

Types of Error
A test of hypothesis can be compared to a criminal trial by jury in the US. The individual on trial is either innocent or guilty, but is assumed innocent by law. After evidence is presented, the jury finds the defendant either guilty or not guilty.

Two Types of Error

Verdict of jury    Defendant innocent    Defendant guilty
Guilty             Incorrect             Correct
Not guilty         Correct               Incorrect

Types of Error in Hypothesis Testing

Statistical decision     Hypothesis is true                Hypothesis is false
based on sample
Do not reject claim      Correct decision                  Type II error
                         Probability = \(1 - \alpha\)      Probability = \(\beta\)
Reject claim             Type I error                      Power of the test = \(1 - \beta\)
                         Probability = \(\alpha\)

Analogously, the true population mean is either 8000 or it is not 8000. We begin by assuming the null hypothesis is correct and we consider the evidence that is presented in the form of a sample of size \(n\). Let's assume that the population mean is indeed 8000. Because the decision is based on a sample, it could happen purely by chance that we draw a sample and observe a sample mean that is extreme. This error is a Type I error: we reject a true hypothesis. We are interested in the chance of our making this error, that is, the probability of a Type I error. Notice that the area in the tails, which is called \(\alpha\), gives us the probability of this error. A table is useful for understanding errors in hypothesis testing. It emphasizes the two possible truths (the hypothesis is right or the hypothesis is wrong) and the two possible statistical decisions based on the sample.
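The claim that the tail area \(\alpha\) is the probability of a Type I error can be illustrated by simulation. The sketch below assumes the null hypothesis is true and draws sample means directly from the normal sampling distribution described earlier (mean 8000, standard error 94.28) instead of simulating individual earnings. The number of trials and the random seed are arbitrary choices made for this illustration.

```python
import random

MU_0 = 8000.0        # hypothesized mean (from the slides)
STD_ERROR = 94.28    # standard error of the sample mean (from the slides)
Z_CRIT = 1.96        # two-sided critical value for alpha = 0.05
N_TRIALS = 20_000    # number of simulated samples (arbitrary)

rng = random.Random(0)  # fixed seed so the run is repeatable

rejections = 0
for _ in range(N_TRIALS):
    # Draw one sample mean from the sampling distribution under a true H0
    sample_mean = rng.gauss(MU_0, STD_ERROR)
    z_stat = (sample_mean - MU_0) / STD_ERROR
    if abs(z_stat) > Z_CRIT:
        rejections += 1  # a true null hypothesis was rejected: Type I error

type_i_rate = rejections / N_TRIALS
print(f"simulated Type I error rate: {type_i_rate:.4f}")
```

The simulated rate should land close to 0.05, the area in the two tails.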

Hypothesis Testing
All statistical hypotheses consist of a null hypothesis and an alternative hypothesis. These two parts are constructed to contain all possible outcomes of the experiment or study.
The null hypothesis states that the null condition exists: there is nothing new happening, the old theory is still true, the old standard is correct, and the system is in control.
The alternative hypothesis contains the research challenge, or what must be demonstrated: the new theory is true, there are new standards, the system is out of control, and/or something is happening.

Null and Alternative Hypotheses
The null and the alternative hypotheses refer to population parameters, never to sample estimates.
This is correct: \(H_0: \mu = 8000\)
This is WRONG: \(H_0: \bar{x} = 8000\)
Why? Because we have information about the sample, and so we already know what the sample mean is.

Null Hypothesis
The null hypothesis contains the equality sign. Never put the equality sign in the alternative hypothesis.
Null hypotheses usually do not contain what the researcher believes to be true. Typically, researchers wish to reject the null hypothesis.
Remember the term "null": the null hypothesis typically implies no change, no difference, and that nothing noteworthy has happened.

Alternative Hypothesis
The burden of proof is placed on the alternative hypothesis.
A claim is made: \(H_0: \mu = 8000\). I do not believe the claim.
The alternative hypothesis states my challenge: \(H_1: \mu \neq 8000\). I must demonstrate that the claim is false.
This alternative is called a two-tail or two-sided alternative: we want to know whether the true mean is higher or lower than the claim.
Unless the sample contains evidence that allows me to reject the null hypothesis, the original claim stands as valid.
Form the alternative hypothesis first, since it embodies the challenge.

If we reject the null hypothesis, we implicitly accept the alternative. Ideally we would like to have a specific alternative hypothesis, because we can then calculate the probability of a Type II error. For example: \(H_0: \mu = 8000\), \(H_1: \mu = 8350\). We can calculate that if the null hypothesis is false and the true population mean equals 8350, the probability of a Type II error is .0559. Most of the time researchers do not have any specified value for the alternative hypothesis. In this case we use a general form for the alternative: \(H_1: \mu \neq 8000\). With this two-sided alternative, we look for evidence of an extreme sample mean in both tails of the sampling distribution.

Specify the Level of Significance: The Risk of Rejecting a True Null Hypothesis
How much error are you willing to accept?
The significance level of the test is the maximum probability that the null hypothesis will be rejected incorrectly, a Type I error.
Use \(\alpha\) to designate the risk; by convention we set \(\alpha\) = 0.05, 0.02, or 0.01.
When possible, set the risk of \(\beta\) that we want. Do this in studies for which we determine the appropriate sample size.
We take the sample size as a given in this course.

Form the Rejection Region: Critical Values
Critical values divide the sampling distribution under the null hypothesis into a rejection area and a non-rejection area.
Set \(\alpha = 0.05\) for a two-sided test, so \(0.05/2 = 0.025\) lies in each tail.
[Figure: sampling distribution of sample means, normally distributed, with sample means and Z values on the horizontal axis. The blue lines are the boundaries: the central area of .95 is "do not reject" and each tail of .025 is "reject".]

Determine the critical values that divide the rejection area from the non-rejection area. Let's set \(\alpha = .05\). We need to know the Z value that cuts the distribution such that 5% of sample means lie in the tails, or 2.5% in each tail. How do we find this unknown Z value? First of all, let's designate this unknown Z value as \(Z_{.05/2}\) or \(Z_{.025}\). If .025 lies in the tail, then .5000 - .025 = .4750 lies between the mean of 0 and this unknown Z value. This area can be looked up in the standard normal tables. Reading from the probabilities out to the Z values, we find a value of 1.96. So 47.5% of Z values lie between 0 and +1.96, and 95% of Z values lie within 1.96 standard deviations of the mean. We will use \(\pm 1.96\) as the critical values that separate the rejection region from the non-rejection region.

Determine Specific Critical Values
Standard normal distribution: we need the Z value such that 5% of standardized values lie in the two tails, with 2.5% in each tail.
Designate this unknown Z value as \(Z_{.05/2}\) or \(Z_{.025}\).
If .025 lies in the tail of the distribution, then 0.5000 - 0.025 = 0.4750 lies between the mean of 0 and this unknown Z value.
Using the standard normal tables and reading from the probabilities out to the Z values, we find a value of 1.96.
[Figure: standard normal distribution with a central "do not reject \(H_0\)" area of \(1 - \alpha\) between \(-z_{.025} = -1.96\) and \(+z_{.025} = 1.96\); the area between 0 and 1.96 is .4750, and each tail is .025.]
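The table lookup described here can be cross-checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (3.8 or later); the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def critical_z(alpha: float) -> float:
    """Positive critical value that leaves alpha / 2 in the upper tail."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)

z_crit = critical_z(0.05)

# Area between the mean (Z = 0) and the critical value, as read from a Z table
area_mean_to_crit = STD_NORMAL.cdf(z_crit) - 0.5

print(f"critical values: -{z_crit:.2f} and +{z_crit:.2f}")
print(f"area between 0 and the critical value: {area_mean_to_crit:.4f}")
```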


Rationale of Test
95% of Z values lie within 1.96 standard deviations of the mean.
When the null hypothesis is true, 95% of sample means will lie within 1.96 standard errors of the population mean, and 5% of sample means will lie beyond 1.96 standard errors of the population mean.
Argument by contradiction: view 5% as a small probability. If a sample mean falls in the 5% region, we are skeptical about the null hypothesis.

The rationale of the test is as follows. First, tests are developed assuming the null hypothesis is correct. Assume, for example, that the population mean does equal 8000. We draw a sample and calculate the sample mean. If the sample mean is "close" to the hypothesized mean, we will not reject the null hypothesis. If the sample mean is "far" from the center of the distribution, we will reject the null hypothesis. Statistically speaking, the sample mean is far if it lies beyond 1.96 standard errors from the hypothesized mean, 8000. We know that under the null hypothesis, 95% of the time (in 95 out of 100 samples), we will obtain a sample mean that lies within 1.96 standard errors of 8000. A decision rule then is DR: if \(-Z_{\alpha/2} <\) test statistic Z \(< Z_{\alpha/2}\), do not reject the null hypothesis.

Determine the Correct Test Statistic
The test statistic is a formula that summarizes the sample information:
Test statistic = (sample statistic - claim about population parameter) / standard error
Is the population standard deviation, \(\sigma\), known? If yes, use Z values. If no, use the Student's t distribution.

State the Decision Rule (DR)
DR: if \(-Z_{\alpha/2} <\) Z test statistic \(< Z_{\alpha/2}\), do not reject the null hypothesis (two-sided test).
If \(\alpha = 0.05\), \(Z_{.025} = \pm 1.96\): if (-1.96 ≤ Z test statistic ≤ 1.96), do not reject.
DR: if \(-t_{n-1,\alpha/2} <\) t test statistic \(< t_{n-1,\alpha/2}\), do not reject the null hypothesis.

We compare the value of the test statistic to the critical value and determine whether the test statistic falls into the non-rejection region or the rejection region. We reject or do not reject the null hypothesis. This is the statistical decision. Theoretically, we do not accept a hypothesis, both because it is not the nature of scientific inquiry and because we do not want to deal with the problem of accepting a false hypothesis.
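The test statistic formula and the two-sided Z decision rule can be sketched as two small functions. The function names are hypothetical, and the cereal-box numbers in the usage example (a sample mean of 16.1 ounces and a standard error of 0.04) are invented purely for illustration; only the claimed weight of 16.0 ounces comes from the slides. The sketch covers the case where \(\sigma\) is known; the t version would need t critical values from a table or a statistics library.

```python
from statistics import NormalDist

def z_test_statistic(sample_stat: float, claimed_value: float, std_error: float) -> float:
    """(sample statistic - claim about population parameter) / standard error."""
    return (sample_stat - claimed_value) / std_error

def two_sided_decision(z_stat: float, alpha: float = 0.05) -> str:
    """Apply the two-sided decision rule for a Z test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    if -z_crit <= z_stat <= z_crit:
        return "do not reject H0"
    return "reject H0"

# Hypothetical usage: claimed weight 16.0 oz, invented sample mean and standard error
z_cereal = z_test_statistic(sample_stat=16.1, claimed_value=16.0, std_error=0.04)
print(f"Z = {z_cereal:.2f}: {two_sided_decision(z_cereal)}")
```

Returning the wording "do not reject H0" instead of "accept H0" mirrors the slide's point that a hypothesis is never accepted.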

Statistical Decision and Conclusions
Reject or do not reject the null hypothesis. This is the statistical decision. The online homework refers to this as the conclusion.
State the conclusions in terms of the problem, as a simple statement in English. What is the business decision? What is the economic policy conclusion?

Problem - Income in the 1960s
Consider our example:
\(H_0: \mu = \$8000\)
\(H_1: \mu \neq \$8000\)
Let \(\alpha = .05\), so \(Z_{.025} = 1.96\).
DR: if (-1.96 ≤ Z test statistic ≤ 1.96), do not reject.
The sample mean is \(\bar{x} = \$7814.10\).
Sample size: \(n = 1{,}976\).
Known population standard deviation: \(\sigma = 4191.20\).
Standard error of the sampling distribution: \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 94.28\).

Calculate the Test Statistic
Convert the sample mean into the test statistic, a Z test. How far does 7814.10 lie from 8000 in terms of the number of standard errors?
\(Z = (7814.10 - 8000)/94.28 = -1.97\)
Because -1.97 lies below the critical value of -1.96, reject the null hypothesis at a 5% level of significance, two-tailed test.
[Figure: sampling distribution of sample means, normally distributed. The sample mean of 7814.1 (Z = -1.97) falls just outside the .95 "do not reject" region bounded by Z = -1.96 and Z = 1.96.]
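The calculation on this slide can be reproduced end to end from the figures given in the problem statement. This is a sketch, not the course's own procedure: the variable names are illustrative, and the two-sided p-value is an addition that the slide itself does not report.

```python
import math
from statistics import NormalDist

mu_0 = 8000.0     # hypothesized mean under H0 (from the slides)
x_bar = 7814.10   # observed sample mean (from the slides)
sigma = 4191.20   # known population standard deviation (from the slides)
n = 1976          # sample size (from the slides)
alpha = 0.05      # significance level, two-sided test

std_error = sigma / math.sqrt(n)
z_stat = (x_bar - mu_0) / std_error

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * std_normal.cdf(-abs(z_stat))  # area in both tails beyond |Z|

reject = abs(z_stat) > z_crit
print(f"Z = {z_stat:.2f}, critical values = +/-{z_crit:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
```

The statistic lies only slightly beyond the critical value, which is why the decision is sensitive to the choice of significance level.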

Change the Level of Significance
Lower the risk of a Type I error: let \(\alpha = 0.01\), two-sided test.
\(Z_{.01/2} = Z_{.005} = \pm 2.58\)
DR: if (-2.58 ≤ Z test statistic ≤ 2.58), do not reject.
Test statistic = -1.97: no evidence to reject.
Set \(\alpha\) before any statistical tests are performed.

With \(Z_{.01/2} = Z_{.005} = 2.58\), the non-rejection region becomes wider. To find the critical value, compute .5000 - .005 = .4950, look up .4950 among the probabilities, and read out to the Z values of \(\pm 2.58\). The test statistic of -1.97 lies inside this region, so we would not reject the null hypothesis at a 1% level of significance, two-tailed test. We are reducing the probability of rejecting a true null hypothesis. However, if the null hypothesis is false, we are increasing our chances of failing to reject it. The outcomes of changing the level of significance suggest that we should always set \(\alpha\) before any statistical tests are performed. That way we won't choose a level of significance that suits our purpose.

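Lower the Level of Significance to 0.01
Let \(\alpha = 0.01\), two-sided test: \(Z_{.01/2} = Z_{.005}\).
If .005 lies in the tail of the distribution, then 0.5000 - 0.005 = 0.4950 lies between the mean of 0 and this unknown Z value.
Using the standard normal tables and reading from the closest probabilities out to the Z values, we find a value of 2.57 or 2.58.
[Figure: standard normal distribution with a central "do not reject \(H_0\)" area of \(1 - \alpha\) between \(-z_{.005} = -2.58\) and \(+z_{.005} = 2.58\); the area between 0 and 2.58 is .4950, and each tail is .005.]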
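The effect of moving from a 5% to a 1% significance level can be seen by applying both decision rules to the same test statistic. The sketch below takes the rounded statistic of -1.97 from the income example; the dictionary name and the loop are illustrative choices.

```python
from statistics import NormalDist

z_stat = -1.97  # test statistic from the income example (rounded, from the slides)

decisions = {}
for alpha in (0.05, 0.01):
    # Two-sided critical value: alpha / 2 in each tail
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    inside = -z_crit <= z_stat <= z_crit
    decisions[alpha] = "do not reject H0" if inside else "reject H0"
    print(f"alpha = {alpha}: critical values +/-{z_crit:.4f}, {decisions[alpha]}")
```

The same data lead to different decisions at the two levels, which is the reason the slides insist on fixing the significance level before running the test.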

Standard Normal Distribution
[Table: standard normal probabilities. The critical value for \(\alpha = 0.01\) lies between 2.57 and 2.58.]

Student t Distribution
\(Z_{.005} = 2.5758 \approx 2.58\)

df\p   0.4     0.25    0.1     0.05    0.025   0.01    0.005   0.0005
1      0.3249  1.0000  3.0777  6.3138  12.706  31.820  63.656  636.61
2      0.2887  0.8165  1.8856  2.9200  4.3027  6.9646  9.9248  31.599
3      0.2767  0.7649  1.6377  2.3534  3.1825  4.5407  5.8409  12.924
29     0.2557  0.6830  1.3114  1.6991  2.0452  2.4620  2.7564  3.6594
30     0.2556  0.6828  1.3104  1.6973  2.0423  2.4573  2.7500  3.6460
...
inf    0.2533  0.6745  1.2816  1.6449  1.9600  2.3264  2.5758  3.2905
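The "inf" row of the t table is the standard normal distribution, which is why \(Z_{.005}\) can be read from it. As a rough cross-check, the sketch below recomputes that row with `statistics.NormalDist` (Python 3.8 or later) and compares it with the values printed in the table. The list names are illustrative, and the comparison uses a small tolerance because printed tables can differ in the last digit.

```python
from statistics import NormalDist

# Upper-tail areas from the table header
upper_tail_areas = [0.4, 0.25, 0.1, 0.05, 0.025, 0.01, 0.005, 0.0005]

# The "inf" row as printed in the table above
table_inf_row = [0.2533, 0.6745, 1.2816, 1.6449, 1.9600, 2.3264, 2.5758, 3.2905]

# Z value that leaves each upper-tail area to its right
computed_row = [NormalDist().inv_cdf(1 - p) for p in upper_tail_areas]

for p, table_value, computed in zip(upper_tail_areas, table_inf_row, computed_row):
    print(f"p = {p:<6} table = {table_value:.4f}  computed = {computed:.4f}")
```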

Online Homework - Chapter 10 Intro to Hypothesis Testing and Testing a Mean
CengageNOW second assignment: Chapter 10 Intro to Hypothesis Testing
CengageNOW third assignment: Chapter 10 Testing a Mean