Hypothesis Testing: One Sample Mean or Proportion


1 Hypothesis Testing: One Sample Mean or Proportion
Introduction, Section Testing a Mean, Section 10.5

2 Hypothesis: A Statement about a Population Parameter
The Labor Department makes a statement: mean annual earnings of white men in 1968 were $8000 (quantitative variable).
A food company claims that its boxes contain 16 ounces of cereal (quantitative variable).
An advertising firm states that 70% of customers like the new package design it is proposing (qualitative variable).

Overview: In this section, we decide whether or not to reject a claim concerning a particular population parameter. We use the population mean as our specific example, but what you will learn is applicable to all hypothesis testing. Imagine that we are doing a labor market study of older men, aged 45-64, in the late 1960s. We postulate that the population mean annual earnings were $8,000. This statement is called the null hypothesis, H0: μ = 8000. The alternative hypothesis is a second statement that contradicts the null hypothesis. In this case, H1: μ ≠ 8000. Notice that all possible values of μ are covered by the null and alternative hypotheses. We next draw a random sample of size n from the population. We obtain some data from the National Longitudinal Studies that were collected in the late 1960s. These data represent a random sample of men. The sample size is n = 1,976. The population standard deviation, σ, is known. The standard error of the sampling distribution is σx̄ = σ/√n = 94.28. The sample is so large that we can fall back on the CLT and feel confident that the sampling distribution of the mean is approximately normal. PP 2

3 Null and Alternative Hypotheses
The claim being made is the null hypothesis:
H0: μ = $8000 (Labor Dept. statement)
H0: μ = 16.0 ounces (desired weight if the machine is working correctly)
H0: π = 0.70 (ad firm)
This statement stands unless a challenger can refute it.
The alternative hypothesis is a second statement that contradicts the null hypothesis.

4 Null and Alternative Hypotheses
In this case:
H1: μ ≠ $8000. Annual earnings are not equal to $8000.
H1: μ ≠ 16 ounces. The process is not in control.
H1: π ≠ 0.70. The proportion of customers who like the new design is different from 70%.
I will focus on the mean in developing hypothesis testing.

5 The Research Challenge
Use sample data to decide on the validity of the null hypothesis.
Consider H0: μ = $8000 (annual earnings of white men in 1968).
Obtain a random sample of men from 1968, using the National Longitudinal Studies data collected in the late 1960s.
The sample size is n = 1,976.
The population standard deviation, σ, is known.
The standard error of the sampling distribution is σx̄ = σ/√n = 94.28.
The sample is so large that we can fall back on the CLT and feel confident that the sampling distribution of the mean is approximately normal.
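The standard error on this slide follows from σ/√n. The transcript does not preserve the value of σ, so the sketch below only back-calculates the σ implied by the stated standard error of 94.28 and n = 1,976. The function and variable names are illustrative, and the implied σ is a derived figure, not a number taken from the slides.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean when sigma is known."""
    return sigma / math.sqrt(n)


n = 1976
stated_se = 94.28  # value given on the slide

# Assumption: sigma itself is not preserved in the transcript, so we
# recover the value implied by the stated standard error:
# sigma = SE * sqrt(n).
implied_sigma = stated_se * math.sqrt(n)

print(round(implied_sigma, 1))
# Round trip: plugging the implied sigma back in reproduces 94.28.
print(round(standard_error(implied_sigma, n), 2))
```

The point of the round trip is only to show how strongly the large sample shrinks the spread of the sample mean relative to the spread of individual earnings.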

6 Rationale
We compare the mean of the sample with the hypothesized population mean, μ = 8000. Is the difference between the sample mean and the hypothesized mean (sample mean - 8000) "small" or "large"?
A "small" difference suggests that the sample may be drawn from a population with a mean of 8000. The null hypothesis cannot be rejected.
A "large" difference suggests that it is unlikely that the sample came from a population with a mean of 8000. The sample appears to be drawn from a population with a mean other than 8000, and we reject the null hypothesis. Such a test result is said to be statistically significant: a difference this large would be unlikely to occur by chance alone.

7 Rationale
How can we decide whether the difference between the sample mean and the hypothesized mean (sample mean - 8000) is "small" or "large"? Clearly we need to be more specific about when a sample mean is close enough to 8000 to support our hypothesis and when it is so far from 8000 that it does not. To do so, consider how the distribution of sample means would look if the null hypothesis were true, that is, the sampling distribution of the sample mean under the null hypothesis. "Under the null hypothesis" means that we consider the claim to be correct for the moment.

8 Under the Null Hypothesis
We know what to expect if the null hypothesis H0 is true: 95.44% of the sample means (x̄) lie within 2 standard errors, σx̄, of the population mean.
Figure: Sampling Distribution of Sample Means, Normally Distributed.
Our knowledge of the sampling distribution of sample means allows us to say consistently when a difference is "small" or "large".
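The 95.44% figure can be checked directly from the standard normal distribution. This is a minimal sketch using Python's standard library; the helper name `coverage` is illustrative. Printed tables rounded to four decimals give 0.4772 × 2 = 0.9544, while the unrounded probability is slightly higher.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard errors of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(round(coverage(2.0), 4))   # share of sample means within 2 standard errors
print(round(coverage(1.96), 4))  # share within 1.96 standard errors
```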

9 Statistical Decision Approaches
Critical value: establish boundaries beyond which it is unlikely to observe the sample mean if the null hypothesis is true.
P-value: express the probability of observing our sample mean, or one more extreme, if the null hypothesis is true.
The two approaches yield the same results.
Another approach is confidence intervals.

10 Types of Error
A test of hypothesis can be compared to a criminal trial by jury in the US. The individual on trial is either innocent or guilty, but is assumed innocent by law. After evidence is presented, the jury finds the defendant either guilty or not guilty.

11 Two Types of Error
Verdict of Jury    Defendant Innocent    Defendant Guilty
Not Guilty         Correct               Incorrect
Guilty             Incorrect             Correct

12 Types of Error in Hypothesis Testing
Statistical Decision      Population:               Population:
Based on Sample           Hypothesis is true        Hypothesis is false
Do not reject claim       Correct decision          Type II error
                          Probability = 1 - α       Probability = β
Reject claim              Type I error              Power of the test = 1 - β
                          Probability = α

Analogously, the true population mean is either 8000 or it is not. We begin by assuming the null hypothesis is correct and we consider the evidence that is presented in the form of a sample of size n. Let's assume that the population mean is indeed 8000. Because the decision is based on a sample, it could happen purely by chance that we draw a sample and observe a sample mean that is extreme. If we then reject the null hypothesis, we make a Type I error: we reject a true hypothesis. We are interested in the chance of our making this error, that is, the probability of a Type I error. Notice that the area in the tails, which is called α, gives us this probability. In order to understand errors in hypothesis testing, a table is useful. The table emphasizes the two possible truths (the hypothesis is right or the hypothesis is wrong) and the two possible statistical decisions based on the sample.
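The claim that the tail area α is the probability of a Type I error can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: the null hypothesis is taken to be true (μ = 8000), sample means are drawn directly from the normal sampling distribution with the slide's standard error of 94.28, and the seed and number of trials are arbitrary choices.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

mu0 = 8000.0   # hypothesized mean; the null is true in this simulation
se = 94.28     # standard error from the slides
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

trials = 20_000
rejections = 0
for _ in range(trials):
    # Draw one sample mean from the sampling distribution under H0.
    sample_mean = random.gauss(mu0, se)
    z = (sample_mean - mu0) / se
    if abs(z) > z_crit:
        rejections += 1  # Type I error: H0 is true but we reject it

type1_rate = rejections / trials
print(type1_rate)  # should land close to alpha
```

Over many repetitions the share of wrongly rejected samples settles near α, which is what "probability of a Type I error" means in the table.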

13 Hypothesis Testing
All statistical hypotheses consist of a null hypothesis and an alternative hypothesis. These two parts are constructed to contain all possible outcomes of the experiment or study.
The null hypothesis states that the null condition exists: there is nothing new happening, the old theory is still true, the old standard is correct, and the system is in control.
The alternative hypothesis contains the research challenge, or what must be demonstrated: the new theory is true, there are new standards, the system is out of control, and/or something is happening.

14 Null and Alternative Hypotheses
The null and the alternative hypotheses refer to population parameters, never to sample estimates.
This is correct: H0: μ = 8000
This is WRONG: H0: x̄ = 8000
Why? Because we have information about the sample and so we know what the sample mean is.

15 Null Hypothesis
The null hypothesis contains the equality sign. Never put the equality sign in the alternative hypothesis.
Null hypotheses usually do not contain what the researcher believes to be true. Typically researchers wish to reject the null hypothesis.
Remember the term "null": the null hypothesis typically implies no change, no difference, and nothing noteworthy has happened.

16 Alternative Hypothesis
The burden of proof is placed in the alternative hypothesis.
A claim is made: H0: μ = 8000.
I do not believe the claim. The alternative hypothesis states my challenge: H1: μ ≠ 8000.
I must demonstrate that the claim is false.
This alternative is called a two tail or two sided alternative: we want to know whether the true mean is higher or lower than the claim.
Unless the sample contains evidence that allows me to reject the null hypothesis, the original claim stands as valid.
Form the alternative hypothesis first, since it embodies the challenge.

If we reject the null hypothesis, we implicitly accept the alternative. Ideally we would like to have a specific alternative hypothesis, because we can then calculate the probability of a Type II error. For example: H0: μ = 8000, H1: μ = 8350. If the null hypothesis is false and the true population mean equals 8350, we can calculate the probability of a Type II error. Most of the time researchers do not have any specified value for the alternative hypothesis. In this case we use a general form for the alternative, H1: μ ≠ 8000. We will look for evidence in both tails of the sampling distribution for an extreme sample mean.
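With a specific alternative such as μ = 8350, the Type II error probability can be computed from the normal distribution. The transcript does not preserve the value the slides reported, so the sketch below is an independent calculation under stated assumptions: a two-sided Z test at α = 0.05 and the standard error of 94.28 from the income example. The function name `type2_error` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()


def type2_error(mu0: float, mu1: float, se: float, alpha: float) -> float:
    """P(do not reject H0: mu = mu0) when the true mean is mu1, two-sided Z test."""
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance of the true mean from the hypothesized mean, in standard errors.
    shift = (mu1 - mu0) / se
    # When the true mean is mu1, the Z statistic is normal with mean `shift`
    # and standard deviation 1, so we measure the non-rejection band under it.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Assumption: alpha = 0.05 is chosen here for illustration.
beta = type2_error(mu0=8000, mu1=8350, se=94.28, alpha=0.05)
power = 1 - beta

print(round(beta, 3), round(power, 3))
```

Under these assumptions β comes out at roughly 0.04, so the test would detect a true mean of 8350 most of the time. A different α would give a different β.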

17 Specify the Level of Significance: The Risk Of Rejecting A True Null Hypothesis
How much error are you willing to accept?
The significance level of the test is the maximum probability that the null hypothesis will be rejected incorrectly, a Type I error.
We use α to designate this risk; by convention we set α = 0.05, 0.02, or 0.01.
When possible, set α to the level of risk that we want. We do this in studies for which we determine the appropriate sample size.
We take the sample size as given in this course.

18 Form the Rejection Region: Critical Values
Critical values divide the sampling distribution under the null hypothesis into a rejection and a non-rejection area.
Set α = 0.05, 2-sided test (0.05/2 = 0.025).
Figure: Sampling Distribution of Sample Means, Normally Distributed. The central .95 area is the do-not-reject region and each tail of .025 is a rejection region. The blue lines are the boundaries, shown on both the sample-mean axis and the Z-value axis; their values are still unknown (?).

Determine the critical values that divide the rejection from the non-rejection area. Let's set α = 0.05. We need to know the Z value that cuts the distribution such that 5% of sample means lie in the tails, or 2.5% in each tail. How do we find this unknown Z value? First of all, let's designate this unknown Z value as Z.05/2 or Z.025. If .025 lies in the tail, then 0.5 - 0.025 = 0.4750 lies between the mean of 0 and this unknown Z value. This area can be looked up in the standard normal tables. Reading from the probabilities out to the Z values, we find a value of 1.96. 47.5% of Z values lie between 0 and 1.96, so 95% of Z values lie within 1.96 standard errors of the mean. We will use these values as our critical values that separate the rejection from the non-rejection region.

19 Determine Specific Critical Values
Standard Normal Distribution
We need the Z value such that 5% of standardized values lie in the two tails, with 2.5% in each tail.
Designate this unknown Z value as Z.05/2 or Z.025.
If .025 lies in the tail of the distribution, then 0.5 - 0.025 = 0.4750 lies between the mean of 0 and this unknown Z value.
Using the standard normal tables and reading from the probabilities out to the Z values, we find a value of 1.96.
Figure: standard normal curve with critical values -z.025 = -1.96 and +z.025 = 1.96. The central area 1 - α is the Do Not Reject H0 region, the area between 0 and 1.96 is .4750, and each tail is .025.
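The table lookup described on this slide can be reproduced with an inverse normal CDF. This is a minimal sketch using Python's standard library; the function name `two_sided_critical_z` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()


def two_sided_critical_z(alpha: float) -> float:
    """Z value that leaves alpha/2 in each tail of the standard normal."""
    return std_normal.inv_cdf(1 - alpha / 2)


z_crit = two_sided_critical_z(0.05)

# Area between the mean (0) and the critical value: the .4750 looked up
# in the body of a standard normal table.
area_mean_to_crit = std_normal.cdf(z_crit) - 0.5

print(round(z_crit, 2), round(area_mean_to_crit, 4))
```

Reading a table from the probability .4750 out to the Z value and calling `inv_cdf(0.975)` are the same operation; the software simply carries more decimal places.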

21 Rationale of Test
95% of Z values lie within 1.96 standard deviations of the mean.
When the null hypothesis is true, 95% of sample means will lie within 1.96 standard errors of the population mean and 5% of sample means will lie beyond 1.96 standard errors of the population mean.
Argument by contradiction: view 5% as a small probability. If a sample mean falls in the 5% region, we are skeptical about the null hypothesis.

The rationale of the test is as follows. First, tests are developed assuming the null hypothesis is correct. Assume, for example, that the population mean does equal 8000. We draw a sample and calculate the sample mean. If the sample mean is "close" to the hypothesized mean, we do not reject the null hypothesis. If the sample mean is "far" from the center of the distribution, we reject the null hypothesis. Statistically speaking, the sample mean is far if it lies beyond 1.96 standard errors from the hypothesized mean, 8000. We know that under the null hypothesis, 95% of the time (in 95 out of 100 samples), we will obtain a sample mean that lies within 1.96 standard errors of 8000. A decision rule then is DR: if (-Zα/2 ≤ test statistic Z ≤ Zα/2) do not reject the null hypothesis.

22 Determine the Correct Test Statistic
A test statistic is a formula that summarizes sample information:
Test statistic = (sample statistic - claim about population parameter) / standard error
Is the population standard deviation, σ, known? If yes, use Z values. If no, use the Student t distribution.

23 State the Decision Rule (DR)
Z test, two sided: DR: if (-Zα/2 ≤ Z test statistic ≤ Zα/2) do not reject the null hypothesis.
If α = 0.05, Z.025 = 1.96, so the critical values are ±1.96: if (-1.96 ≤ Z test statistic ≤ 1.96) do not reject.
t test, two sided: DR: if (-tn-1,α/2 ≤ t test statistic ≤ tn-1,α/2) do not reject the null hypothesis.

We compare the value of the test statistic to the critical value and determine whether the test statistic falls into the non-rejection or the rejection region. We reject or do not reject the null hypothesis. This is the statistical decision. Theoretically, we do not accept a hypothesis, both because it is not the nature of scientific inquiry and because we do not want to deal with the problem of accepting a false hypothesis.
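The decision rule is the same comparison whether the critical value comes from the Z or the t distribution. The sketch below writes it as two small functions; the names `compute_statistic` and `decide` and the example statistics (0.50 and 2.30) are hypothetical, while the critical values 1.96 and 2.0452 (df = 29, .025 in each tail) are the ones that appear in the presentation's tables.

```python
def compute_statistic(sample_stat: float, claimed_value: float,
                      standard_error: float) -> float:
    """(sample statistic - claim about population parameter) / standard error."""
    return (sample_stat - claimed_value) / standard_error


def decide(statistic: float, critical_value: float) -> str:
    """Two-sided decision rule: reject only if the statistic falls beyond
    the critical value in either tail."""
    if -critical_value <= statistic <= critical_value:
        return "do not reject H0"
    return "reject H0"


# Z test at alpha = 0.05 (sigma known): critical value 1.96.
print(decide(0.50, 1.96))    # statistic inside the band
print(decide(2.30, 1.96))    # statistic in the upper tail

# t test at alpha = 0.05 with df = 29 (sigma unknown): critical value 2.0452.
print(decide(-2.30, 2.0452))
```

Note that the function never returns "accept": in line with the slide, the only outcomes are rejecting or not rejecting the null hypothesis.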

24 Statistical Decision and Conclusions
Reject or do not reject the null hypothesis. This is the statistical decision. The online homework refers to this as the conclusion.
State the conclusions in terms of the problem, as a simple statement in English. What is the business decision? What is the economic policy conclusion?

25 Problem - Income in the 1960s
Consider our example:
H0: μ = $8000
H1: μ ≠ $8000
Let α = .05, so Z.025 = 1.96.
DR: if (-1.96 ≤ Z test statistic ≤ 1.96) do not reject.
The sample mean is x̄ = $7814.1.
The sample size is n = 1,976.
The population standard deviation, σ, is known.
The standard error of the sampling distribution is σx̄ = 94.28.

26 Calculate the Test Statistic
Convert the sample mean into the test statistic, a Z test. How far does x̄ lie from 8000 in terms of the number of standard errors?
Z = (7814.1 - 8000)/94.28 = -1.97
Figure: Sampling Distribution of Sample Means, Normally Distributed. The central .95 do-not-reject region runs from Z = -1.96 to Z = 1.96; the sample mean 7814.1 falls at Z = -1.97, just inside the lower rejection region.
Reject the null hypothesis at a 5% level of significance, two-tailed test.
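The calculation on this slide, together with the p-value and confidence-interval approaches mentioned earlier in the presentation, can be sketched as follows. The numbers (7814.1, 8000, 94.28, α = 0.05) come from the slides; the variable names are illustrative, and the p-value and interval are computed here rather than quoted from the source.

```python
from statistics import NormalDist

std_normal = NormalDist()

x_bar = 7814.1   # sample mean
mu0 = 8000.0     # hypothesized population mean
se = 94.28       # standard error of the sample mean
alpha = 0.05

# Critical value approach: how many standard errors is x_bar from mu0?
z = (x_bar - mu0) / se
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject_by_critical_value = abs(z) > z_crit

# P-value approach: probability of a Z at least this extreme in either tail.
p_value = 2 * std_normal.cdf(-abs(z))
reject_by_p_value = p_value < alpha

# Confidence interval approach: does the interval around x_bar cover mu0?
ci_low, ci_high = x_bar - z_crit * se, x_bar + z_crit * se
reject_by_interval = not (ci_low <= mu0 <= ci_high)

print(round(z, 2), round(p_value, 3))
print(round(ci_low, 1), round(ci_high, 1))
print(reject_by_critical_value, reject_by_p_value, reject_by_interval)
```

All three approaches lead to the same decision here, but only narrowly: the test statistic is barely beyond the critical value, which is why the choice of α matters for this example.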

27 Change the Level of Significance
Lower the risk of a Type I error.
Let α = 0.01, two sided test.
Z.01/2 = Z.005 = ±2.58
DR: if (-2.58 ≤ Z test statistic ≤ 2.58) do not reject.
Test statistic = -1.97: no evidence to reject.
Set α before any statistical tests are performed.

With Z.01/2 = Z.005, the area 0.5 - 0.005 = 0.4950 lies between 0 and the critical value; look up this probability and read out to the Z value of 2.58. The non-rejection region becomes wider. Because -1.97 lies between -2.58 and 2.58, we would not reject the null hypothesis at a 1% level of significance, two tailed test. We are reducing the probability of rejecting a true null hypothesis. However, if the null hypothesis is false, we are increasing our chances of failing to reject a false null hypothesis. The outcomes of changing the level of significance suggest that we should always set α before any statistical tests are performed. That way we won't choose a level of significance that suits our purpose.

28 Lower the Level of Significance to 0.01
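Let α = 0.01, two sided test, so the critical value is Z.01/2 = Z.005.
If .005 lies in the tail of the distribution, then 0.5 - 0.005 = 0.4950 lies between the mean of 0 and this unknown Z value.
Using the standard normal tables and reading from the closest probabilities out to the Z values, we find a value of 2.57 or 2.58.
Figure: standard normal curve with critical values -z.005 = -2.58 and +z.005 = 2.58. The central area 1 - α is the Do Not Reject H0 region, the area between 0 and 2.58 is .4950, and each tail is .005.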
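The effect of the significance level on the decision can be tabulated for the conventional values of α. This is a sketch, assuming the rounded test statistic of -1.97 from the income example; the dictionary name `results` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = -1.97  # test statistic from the income example

results = {}
for alpha in (0.05, 0.02, 0.01):
    # Two-sided critical value: alpha/2 in each tail.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    decision = "reject" if abs(z) > z_crit else "do not reject"
    results[alpha] = (round(z_crit, 4), decision)
    print(alpha, results[alpha])
```

The unrounded critical value for α = 0.01 falls between the table entries 2.57 and 2.58, which is why the slide reads it off as "2.57 or 2.58". The same sample rejects the claim at α = 0.05 but not at 0.02 or 0.01, which illustrates why α should be fixed before the test is run.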

29 Standard Normal Distribution
Figure: standard normal table. The probability .4950 falls between the entries for Z = 2.57 and Z = 2.58.

30 Student t Distribution
Z.005 = 2.5758 ≈ 2.58 (inf row, p = 0.005 column)

df\p   0.4     0.25    0.1     0.05    0.025   0.01    0.005   0.0005
1      0.3249  1.0000  3.0777  6.3138  12.706  31.820  63.656  636.61
2      0.2887  0.8165  1.8856  2.9200  4.3027  6.9646  9.9248  31.599
3      0.2767  0.7649  1.6377  2.3534  3.1825  4.5407  5.8409  12.924
29     0.2557  0.6830  1.3114  1.6991  2.0452  2.4620  2.7564  3.6594
30     0.2556  0.6828  1.3104  1.6973  2.0423  2.4573  2.7500  3.6460
inf    0.2533  0.6745  1.2816  1.6449  1.9600  2.3264  2.5758  3.2905

31 Online Homework - Chapter 10 Intro to Hypothesis Testing and Testing a Mean
CengageNOW second assignment: Chapter 10 Intro to Hypothesis Testing
CengageNOW third assignment: Chapter 10 Testing a Mean

