Chapter 10.2 TESTS OF SIGNIFICANCE
TEST OF SIGNIFICANCE There are two basic types of statistical inference. The one we have just studied: CONFIDENCE INTERVALS. Goal: to estimate a population parameter. The second type: TESTS OF SIGNIFICANCE. Goal: to assess the evidence provided by data about a claim regarding a population parameter.
Example 10.8: I’M A GREAT FREE-THROW SHOOTER Lynch claims: “I am an 80% free-throw shooter.” To test this, you ask me to shoot 20 free throws. I make ONLY 8 out of 20, so you REJECT (disbelieve) my initial claim. Your logic is based on how rare it would be for me to make only 8 of 20 (40%) IF my true success rate were p = 0.8. In statistical terms: the probability of a result this poor, computed assuming the claim is true, is very small, and this small probability causes you to reject my claim.
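The "how rare would this be?" reasoning can be checked numerically. The sketch below assumes the 20 shots are independent, each made with probability 0.8. The slide implies this binomial model but does not state it. The function name `binomial_cdf` is illustrative only.

```python
from math import comb

def binomial_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# ASSUMPTION: 20 independent shots, each made with probability 0.8 (the claim).
prob = binomial_cdf(8, 20, 0.8)
print(f"P(8 or fewer makes out of 20 | p = 0.8) = {prob:.6f}")
```

The probability comes out on the order of 1 in 10,000, which is why making only 8 of 20 is strong evidence against the claim.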
Example 10.9: SWEETENING COLAS A sample of n = 10 sweetness tasters taste a batch of cola before, and then again after, high-temperature storage for a month (simulating 4 months of storage). This is a matched pairs design: each taster gives a sweetness score on a 1-10 scale before, then after. The differences (Before minus After) are shown below. 2.0 0.4 0.7 -0.4 2.2 -1.3 1.2 1.1 2.3
Example 10.9: SWEETENING COLAS Are these data strong enough evidence to conclude that the cola lost sweetness during storage? Find the sample mean: x̄ = 1.02. One of two things must be true: A) the average of 1.02 reflects a real loss in sweetness, or B) there is no real loss, and an average of 1.02 arose just by chance. 2.0 0.4 0.7 -0.4 2.2 -1.3 1.2 1.1 2.3
NULL HYPOTHESIS - H0 The statement being tested in a test of significance is called the null hypothesis. The test of significance is designed to assess the strength of the evidence against H0 (“H naught”). Usually the null hypothesis is a statement of “no effect” or “no difference”.
Example 10.9: SWEETENING COLAS FIRST STEP: Identify the population and the parameter we want to draw conclusions about. In this case the parameter is μ, the mean sweetness loss that all consumers would experience. SECOND STEP: State the HYPOTHESES: H0: μ = 0 (null hypothesis) and Ha: μ > 0 (alternative hypothesis). If H0 is true, the observed “difference” is just due to chance and there is NO REAL CHANGE in the population. If Ha is true, the suspected drop in sweetness is real: the “difference” is NOT due to chance and there IS A REAL CHANGE in the population!
Example 10.9: SWEETENING COLAS Assume that the standard deviation of sweetness scores is σ = 1. So the standard deviation of the sampling distribution of x̄ would be σ/√n = 1/√10 ≈ 0.316. How does an x̄ of 1.02 look now? Z-score: z = (1.02 - 0) / 0.316 ≈ 3.228. P(Z > 3.228) ≈ 0.00062 (a very low P-value, so the result is statistically significant). So we reject H0 in favor of Ha.
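The calculation on this slide can be reproduced step by step. This is a sketch: the value σ = 1 is an assumption, inferred from the reported z = 3.228 with x̄ = 1.02 and n = 10, and the variable names are illustrative.

```python
from math import sqrt, erfc

xbar = 1.02   # sample mean of the Before-minus-After differences (from the slides)
mu0 = 0.0     # value of mu under H0: no sweetness loss
sigma = 1.0   # ASSUMPTION: population SD, inferred from the reported z = 3.228
n = 10        # number of tasters

se = sigma / sqrt(n)               # SD of the sampling distribution of x-bar
z = (xbar - mu0) / se              # one-sample z statistic
p_value = 0.5 * erfc(z / sqrt(2))  # upper-tail area P(Z > z) for a standard Normal

print(f"SE = {se:.3f}, z = {z:.3f}, P-value = {p_value:.5f}")
```

Without intermediate rounding, z comes out slightly below 3.228. The slide's figure corresponds to rounding the standard error to 0.316 first. The conclusion is the same either way.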
Exercise 10.28: SPENDING ON HOUSING
ONE-SIDED AND TWO-SIDED ALTERNATIVES One-sided: Is there a loss? A gain? More than? Less than? Two-sided: Is there a difference? A change? Was there an effect?
Example 10.10: STUDYING JOB SATISFACTION Does job satisfaction DIFFER for assembly workers when their work is machine-paced vs. self-paced? 28 subjects: 14 assigned to group I, 14 to group II. Each took the Job Diagnosis Survey (JDS) after two weeks of work. The groups then switched conditions for two more weeks of work and took the JDS again. Matched pairs: “Difference X” = self-paced minus machine-paced satisfaction score. The authors of the study want to know: do the two working conditions produce different levels of satisfaction? Because the question asks about a difference in either direction, the alternative is two-sided.
Exercise 10.30: HOUSEHOLD INCOME
Exercise 10.32: SERVICE TECHNICIANS
P-VALUE The probability, computed assuming that H0 is true, that the observed outcome would take a value as extreme as or more extreme than the one actually observed is called the P-value of the test. The smaller the P-value, the stronger the evidence against H0 provided by the data.
Example 10.11: CALCULATING ANOTHER ONE-SIDED TEST This time the taste-testers examined a “new” cola and obtained a new sample mean. State the hypotheses, draw the sampling distribution, and compute z. Recall how to find the P-value: normalcdf(0.95, 10) = 0.1711.
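The calculator result quoted on this slide can be cross-checked in code. The sketch assumes that Normalcdf(lower, upper) means the area under the standard Normal curve between the two bounds, with the upper bound of 10 standing in for infinity. The Python function below is an illustrative stand-in for the calculator command.

```python
from math import erf, sqrt

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under a Normal(mean, sd) curve between lower and upper."""
    def phi(x):
        # Cumulative distribution function of Normal(mean, sd) at x
        return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))
    return phi(upper) - phi(lower)

# Upper-tail P-value for z = 0.95; 10 is far enough out to act as "infinity".
p_value = normalcdf(0.95, 10)
print(f"P(Z > 0.95) = {p_value:.4f}")
```

A P-value of about 0.17 means a sample mean this large would occur by chance roughly one time in six when H0 is true, so this is weak evidence against H0.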
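TEST FOR A POPULATION MEAN To test H0: μ = μ0 based on a simple random sample of size n from a population with known standard deviation σ, compute the one-sample z-statistic: z = (x̄ - μ0) / (σ/√n).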
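A one-sample z test can be wrapped in a small helper that handles both one-sided alternatives and the two-sided alternative. This is a sketch, not a library routine: the function name, the `alternative` labels, and the example numbers are all hypothetical, chosen only to exercise the code.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, P-value) for H0: mu = mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf  # standard Normal cumulative distribution function
    if alternative == "greater":      # Ha: mu > mu0
        p = 1 - cdf(z)
    elif alternative == "less":       # Ha: mu < mu0
        p = cdf(z)
    elif alternative == "two-sided":  # Ha: mu != mu0
        p = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p

# HYPOTHETICAL numbers, not taken from any example in these slides.
z, p_two = one_sample_z_test(xbar=52.0, mu0=50.0, sigma=8.0, n=64)
_, p_greater = one_sample_z_test(52.0, 50.0, 8.0, 64, alternative="greater")
print(f"z = {z:.2f}, two-sided P = {p_two:.4f}, one-sided P = {p_greater:.4f}")
```

For the same data, the two-sided P-value is twice the one-sided P-value, because extreme values in both tails count as evidence against H0.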
STATISTICAL SIGNIFICANCE If the P-value is as small as or smaller than alpha (α), we say that the data are statistically significant at level α.
Example 10.13: EXECUTIVES’ BLOOD PRESSURE The NCHS reports that the mean systolic blood pressure for all males aged 35-44 is 128, with standard deviation 15. The sample: 72 subjects, executives in this age group. Is this evidence that the company’s executives have a different mean than the general population? Population/parameter of interest? Set up hypotheses. Choose inference procedure. z? P? Interpret.
Example 10.14: CAN YOU BALANCE YOUR CHECKBOOK? The NAEP (National Assessment of Educational Progress) survey reports that a score of 275 on its quantitative test is sufficient to indicate the skill needed to balance a checkbook. The sample: 840 subjects, young Americans. Is this evidence that the mean for ALL young men is below 275? Population/parameter of interest? Set up hypotheses. Choose inference procedure. z? P? Interpret.
Example 10.15: DETERMINING SIGNIFICANCE Back to the previous example, where we examined whether the mean for ALL young men is below 275. We can look at this problem from a slightly different perspective. Assume alpha = 0.05 and a one-tailed test. With 0.05 in ONE TAIL, z* = 1.645 (think: why?). All we need to do is check whether the z-score is “further out” in the tail than z*. Since the alternative is “below 275”, the relevant tail is the lower one, and z = -1.45 is not beyond -1.645, so NO: we fail to reject H0.
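The critical-value comparison can be sketched as follows. The sign of z is an assumption here: the slides give the magnitude 1.45, and because the alternative is "mean below 275" the statistic is taken to be negative and compared with the lower-tail critical value.

```python
from statistics import NormalDist

alpha = 0.05
# z* leaves area alpha in ONE tail of the standard Normal curve.
z_star = NormalDist().inv_cdf(1 - alpha)

z = -1.45  # ASSUMPTION: negative sign, since Ha says the mean is below 275

# Lower-tail test: reject H0 only if z falls beyond -z* in the lower tail.
reject = z < -z_star
print(f"z* = {z_star:.3f}, z = {z}, reject H0: {reject}")
```

This gives the same decision as comparing the P-value with alpha, but it only needs the single table value z*.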
Example 10.16: IS THE SCREEN TENSION OK? Recall the problem from a while ago with 20 TVs. Is there evidence at the α = 0.01 level to conclude that the mean tension differs from the prescribed tension of 275 mV? Population/parameter of interest? Set up hypotheses. One-tail test or two-tail? Choose inference procedure. z? P? What is the area in each tail? What is z*? Interpret.
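The questions "What is the area in each tail?" and "What is z*?" for a two-tailed test can be answered with a short sketch. The helper `is_significant` and the z values passed to it are hypothetical. The slides do not give the z statistic for this example.

```python
from statistics import NormalDist

alpha = 0.01
tail_area = alpha / 2  # a two-sided test splits alpha evenly between the tails
z_star = NormalDist().inv_cdf(1 - tail_area)  # critical value for each tail

def is_significant(z, critical=z_star):
    """Two-sided decision: reject H0 when |z| is at least the critical value."""
    return abs(z) >= critical

print(f"area in each tail = {tail_area}, z* = {z_star:.3f}")
# HYPOTHETICAL z values, used only to show the rule working in both tails.
print(is_significant(3.0), is_significant(-2.0))
```

Because the alternative is "differs from 275" instead of "below" or "above", a z statistic far out in either tail counts as evidence against H0.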
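CONFIDENCE INTERVALS AND TWO-SIDED TESTS A level α two-sided significance test rejects the hypothesis H0: μ = μ0 exactly when the value μ0 falls outside the level 1 - α confidence interval for μ. Compare the very last example to the 99% CI we created last week: (281.5, 331.1). Since 275 lies outside this interval, the two-sided test at α = 0.01 rejects H0: μ = 275. Consider a different value of μ0 instead.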
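The link between a confidence interval and a two-sided test reduces to one containment check. In this sketch the interval (281.5, 331.1) and the null value 275 come from the slides. The second null value, 290, is hypothetical and included only for contrast.

```python
def rejects_h0(mu0, ci):
    """Two-sided test at level alpha, using the matching (1 - alpha) CI.

    Rejects H0: mu = mu0 exactly when mu0 lies outside the interval.
    """
    lower, upper = ci
    return not (lower <= mu0 <= upper)

ci_99 = (281.5, 331.1)  # 99% confidence interval from the slides

print(rejects_h0(275, ci_99))  # 275 is outside the interval, so H0 is rejected
print(rejects_h0(290, ci_99))  # HYPOTHETICAL mu0 inside the interval: not rejected
```

The confidence level and the significance level must match: a 99% interval corresponds to a two-sided test at α = 0.01.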