Hypothesis Testing An understanding of the method of hypothesis testing is essential for understanding how both the natural and social sciences advance. In science one begins with a theory, then collects data (hopefully under carefully controlled conditions) and asks the central question: Does the data fit the theory?
Does the data fit the Theory/Model? This question is not as easily answered as you might think. As we know, samples vary and measurements almost always contain small errors, so it is unreasonable to expect actual observations to agree exactly with a theory/model. When can we say that the sample we have carefully collected does or does not fit the theory/model?
Today we focus on the population mean. We believe we know the true population mean. We collect a sample, and the sample average differs from what we believe the true population mean to be. Does this mean we have the wrong population mean, or is there a difference just because samples vary?
When can we say that the sample we have carefully collected does not fit the model? There is no single answer. Rather, we calculate the probability that a random sample would vary from the value predicted by the theory/model by as much as or more than the value we obtained. (This probability is called the p-value.) For example: If we believe that the mean height of students at CSUMB is 5’6”, we can collect a sample and calculate how likely it is that the average height of a sample of this size would vary from 5’6” by as much as or more than that of our sample. The average height we calculate from the sample is called the test statistic.
Ho determines the model. Small p-values indicate that the sample data does not fit the model.
What Model do we use? We have seen that the average of almost all large samples (of size n) is modeled by a normal distribution with mean equal to the population mean µ and standard deviation σ/√n, where σ is the population standard deviation. So, if we know the population standard deviation, we have the two parameters needed for our model and we can ask if the data fits the model. If we do not know the population standard deviation, we can use the sample standard deviation as long as the sample is large. (Generally this means n > 25.)
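As a small illustration of the model's second parameter, the sketch below computes the standard deviation of the sample average (the standard deviation divided by √n). The function name standard_error and the numbers used are assumptions made for this example; they do not come from the slides.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard deviation of the sample average: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)


# Illustrative numbers only (not taken from the slides):
# a standard deviation of 3 inches and a sample of 36 students.
se = standard_error(3.0, 36)
print(se)  # 0.5
```

Note that a larger sample gives a smaller standard error, so the model for the sample average becomes narrower as n grows.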
Hypothesis Testing about The Mean The model is defined by the parameters mean µ and standard deviation σ. Since we can use the sample standard deviation in place of σ, we really only have one assumption: that we know the mean of the population. This assumption µ = µ0 (a known value) is called the null hypothesis and is designated Ho.
Hypothesis Testing about The Mean The model is defined by the null hypothesis Ho: µ = µ0 (in our example of student heights µ0 = 5’6"). If the null hypothesis is not true, then one of the following alternative hypotheses (HA) must be true: µ < µ0 or µ > µ0. If we have no idea which of these to expect, we can state the alternative hypothesis as HA: µ ≠ µ0, although this is rarely used and I discourage you from ever using it in practice.
Inference: Null Hypothesis The null hypothesis (H0) is the hypothesis/theory that is being tested. H0 can never be proved, only disproved! This is how the sciences advance: by disproving a theory with data and suggesting an alternative theory that seems to agree with the data. The null hypothesis is always a statement of the value of a population parameter. E.g., H0: µ = µ0 signifies the population mean has the value µ0. H0 is presumed TRUE until there is sufficient evidence to reject it.
The Big Idea The null hypothesis provides us with a model for the population from which the sample is selected. A sample is collected, and the sample average (test statistic) is compared to the population parameter. In other words, we place the average of the sample on the model and ask how reasonable it is that we obtain a test statistic that varies from µ0 by this much or more. Generally, values that are within 2 standard deviations of the (assumed) mean are considered reasonable.
The Model is determined by Ho
Quantifying the improbable: p-value p-value: The probability of observing, when the null hypothesis is true, a value of the test statistic that is as extreme or more extreme than the value observed. (memorize this!) In the preceding example, the value is the p-value of the test.
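The definition above can be sketched in code for the one-sided tests on a mean used in these slides. This is a minimal sketch, assuming the sample average follows a normal model centered at µ0; the function name p_value, its parameters, and the demonstration numbers are hypothetical choices for illustration.

```python
from statistics import NormalDist


def p_value(sample_mean: float, mu0: float, se: float, alternative: str) -> float:
    """Probability, when H0 is true, of a sample average at least as extreme
    as the one observed.

    alternative: "greater" for HA: mu > mu0, "less" for HA: mu < mu0.
    se: standard deviation of the sample average under the model.
    """
    model = NormalDist(mu=mu0, sigma=se)
    if alternative == "greater":
        return 1.0 - model.cdf(sample_mean)
    if alternative == "less":
        return model.cdf(sample_mean)
    raise ValueError('alternative must be "greater" or "less"')


# Illustrative numbers only: a sample average exactly at mu0,
# and one that is one standard error above mu0.
print(p_value(10.0, 10.0, 2.0, "greater"))            # 0.5
print(round(p_value(12.0, 10.0, 2.0, "greater"), 4))  # 0.1587
```

The farther the sample average falls from µ0 in the direction of the alternative, the smaller the p-value becomes.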
Inference: Statistical significance Traditionally, the decision to reject H0 was based upon selection of a level of significance (α) used to derive a critical value for the test statistic. The critical value set a gating value beyond/beneath which a test statistic must fall in order that H0 may be rejected. Most technology tools produce p-values directly. The p-values carry more information about the test statistic, since they enable reporting the smallest possible significance level for which the results are statistically significant.
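To make the link between the two approaches concrete, here is a hedged sketch for an upper-tailed test on a mean. It assumes a normal model for the sample average; the function names and the numbers (a standard normal model and α = 0.05) are illustrative assumptions, not part of the slides.

```python
from statistics import NormalDist


def upper_critical_value(mu0: float, se: float, alpha: float) -> float:
    """Sample average beyond which H0 is rejected in favor of mu > mu0."""
    return NormalDist(mu=mu0, sigma=se).inv_cdf(1.0 - alpha)


def reject_by_critical_value(sample_mean, mu0, se, alpha) -> bool:
    """Traditional approach: compare the test statistic to the critical value."""
    return sample_mean > upper_critical_value(mu0, se, alpha)


def reject_by_p_value(sample_mean, mu0, se, alpha) -> bool:
    """p-value approach: compare the upper-tail probability to alpha."""
    p = 1.0 - NormalDist(mu=mu0, sigma=se).cdf(sample_mean)
    return p < alpha


# Illustrative numbers only: standard normal model, alpha = 0.05.
print(round(upper_critical_value(0.0, 1.0, 0.05), 3))  # 1.645
print(reject_by_critical_value(2.0, 0.0, 1.0, 0.05))   # True
print(reject_by_p_value(2.0, 0.0, 1.0, 0.05))          # True
```

Both functions reach the same decision for a given α; the p-value additionally tells you how small α could be made while still rejecting H0.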
Inference: Conducting a Hypothesis Test (5 steps)
1. Identify the parameter to be tested and state the two hypotheses in symbolic terms.
2. Restate the hypotheses in context of the problem.
3. Analyze the sample data and report the p-value of your test.
4. Interpret the results: Does the data provide evidence against Ho? At which of the standard confidence levels should you reject Ho?
5. State the conclusion in the context of the problem.
Example 1 Standards set by government agencies indicate that Americans should not exceed an average daily sodium intake of 3300 milligrams (mg). To find out whether Americans are exceeding this limit, a sample of 100 Americans is selected. The mean and standard deviation of daily sodium intake are found to be 3400 mg and 1100 mg, respectively.
Inference: Conducting a Hypothesis Test (State H0, HA) H0: µ = 3300 mg. Americans’ average daily sodium intake is 3300 mg. HA: µ > 3300 mg. Americans’ average daily sodium intake exceeds 3300 mg.
The Test Statistic The test statistic is the value produced from the sample. We place this value on our model (a normal distribution with mean 3300 and standard deviation 1100/√100 = 110). The p-value is the probability of getting a value of 3400 or larger on this normal curve.
Conclusion The p-value represents the chance of getting a value as high as or higher than 3400 when the true average is 3300. The p-value of 0.1814 means that, if the true average really is 3300 mg, there is an 18.14% chance that a similar experiment would produce a sample average of 3400 or higher. We conclude that there is not enough evidence to show that Americans’ average daily sodium intake exceeds 3300 mg.
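The arithmetic behind Example 1 can be checked with a short sketch. It uses the numbers given in the example and assumes the normal model for the sample average; the variable names are chosen for illustration. The exact normal calculation gives about 0.1817, while a table lookup with z rounded to 0.91 gives the 0.1814 quoted above.

```python
import math
from statistics import NormalDist

mu0 = 3300.0          # mean under H0 (mg)
sample_mean = 3400.0  # observed sample average (mg)
sample_sd = 1100.0    # sample standard deviation (mg), used in place of sigma
n = 100               # sample size

se = sample_sd / math.sqrt(n)   # standard deviation of the sample average
z = (sample_mean - mu0) / se    # distance from mu0 in standard errors
p = 1.0 - NormalDist(mu=mu0, sigma=se).cdf(sample_mean)  # upper-tail probability

print(se)            # 110.0
print(round(z, 2))   # 0.91
print(round(p, 3))   # 0.182
```

Since the p-value is well above 0.10, the sample does not provide enough evidence to reject the null hypothesis at any of the standard levels.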
(1 - confidence level) = significance level
Inference: Guidelines & Language of Statistical Significance
Range of p-value: Level of significance
.01 > p-value: Results are highly significant.
.05 > p-value ≥ .01: Results are statistically significant.
.10 > p-value ≥ .05: Results tend toward statistical significance. (H0 usually not rejected)
p-value ≥ .10: Results are not statistically significant. (H0 not rejected)
Confidence Levels If p-value is less than 0.1, reject Ho at 90% confidence level, otherwise keep Ho. If p-value is less than 0.05, reject Ho at both 90% and 95% level, otherwise keep Ho. If p-value is less than 0.01, reject Ho at 90%, 95%, and 99% levels, otherwise keep Ho.
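These rules can be written as a small helper. This is only a sketch: the function name rejection_levels is hypothetical, and the second p-value in the demonstration (0.03) is an invented illustration, while 0.1814 is the p-value from Example 1.

```python
def rejection_levels(p_value: float) -> list:
    """Return the standard confidence levels at which Ho is rejected."""
    # Each confidence level pairs with significance level = 1 - confidence.
    levels = [(0.90, 0.10), (0.95, 0.05), (0.99, 0.01)]
    return [confidence for confidence, alpha in levels if p_value < alpha]


print(rejection_levels(0.1814))  # []  (keep Ho at every standard level)
print(rejection_levels(0.03))    # [0.9, 0.95]
```

An empty list means Ho is kept at all three standard levels.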
Example 2: Water Quality An environmentalist group knows that historically a certain stream has had a dissolved oxygen content of 5 mg per liter, with σ = 0.92 mg. The group collects a liter of water from each of 45 random locations along the stream and measures the amount of dissolved oxygen in each specimen. The sample mean is milligrams (mg) per liter. Is this strong evidence that the stream has a mean dissolved oxygen content of less than 5 mg per liter?
1. State the hypotheses µ0 = 5 mg/liter: the historical mean of the population. H0: µ = µ0. The null hypothesis claims that the true dissolved oxygen level is exactly 5 mg/liter. HA: µ < µ0. The alternative hypothesis states that the true dissolved oxygen level is less than 5 mg/liter.
2. Calculate the test statistic Use the values… Sample mean is mg/liter. Historical mean is 5 mg/liter. Population standard deviation is 0.92 mg/liter. Sample size is 45. The standard error for the sample mean is σ/√n = 0.92/√45 ≈ 0.137 mg/liter. Since the sample average is below the mean, it is an upper limit.
3. Find the p-value The p-value is the probability of getting a value as small as the sample mean or smaller on the normal curve with mean = 5 and standard deviation = 0.92/√45 ≈ 0.137. Using the normal worksheet you can find the p-value is about 0.073.
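For readers without the normal worksheet, the lower-tail calculation can be sketched as follows. The historical mean, standard deviation, and sample size come from the example; the function name lower_tail_p_value is an assumption, and the sample mean of 4.9 used in the demonstration is a HYPOTHETICAL value for illustration only, not the sample mean from the example.

```python
import math
from statistics import NormalDist

MU0 = 5.0     # historical mean (mg/liter), from the example
SIGMA = 0.92  # population standard deviation (mg/liter), from the example
N = 45        # sample size, from the example


def lower_tail_p_value(sample_mean: float) -> float:
    """P(sample average <= sample_mean) when the true mean is MU0."""
    se = SIGMA / math.sqrt(N)
    return NormalDist(mu=MU0, sigma=se).cdf(sample_mean)


se = SIGMA / math.sqrt(N)
print(round(se, 3))  # 0.137

# HYPOTHETICAL sample mean, for illustration only.
print(round(lower_tail_p_value(4.9), 3))
```

Substituting the sample mean actually observed by the group gives the p-value for the test.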
4. Interpret the Results p-value = 0.073. A sample mean as low as the one observed would occur 7.3% of the time if the true population mean were still 5 mg/liter. This is modest evidence that the true mean dissolved oxygen level is less than 5 mg/liter.
5. State your conclusion in the context of the problem p-value = 0.073. At significance level 0.10: 0.073 < 0.10, so reject the null hypothesis and accept the alternative hypothesis. At significance level 0.05: 0.073 > 0.05, so fail to reject the null hypothesis.