Chapter: Making Sense of Statistical Significance & Inference as Decision
Choosing a Level of Significance
"Making a decision" … the choice of alpha (α) depends on:
Plausibility of H0: How entrenched or long-standing is the current belief? If it is strongly believed, then strong evidence (small α) will be needed. Subjectivity is involved.
Consequences of rejecting H0: Is there an expensive changeover as a result of rejecting H0? Subjectivity again!
There is no sharp border – only increasingly strong evidence. A P-value just below versus just above the 0.05 alpha-level? No real practical difference.
Statistical vs. Practical Significance
Even when we reject the null hypothesis and claim "there is an effect present," how big or small is the effect? Is a slight improvement a big enough deal?
Statistical significance is not the same thing as practical significance.
Pay attention to the P-value, look out for outliers, and avoid blind application of significance tests.
A confidence interval can also show the size of the effect.
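To see why statistical and practical significance can diverge, here is a minimal Python sketch (not from the slides; the numbers are hypothetical): with a very large sample, a difference of only 0.1 unit from the hypothesized mean gives a tiny P-value, while the confidence interval shows the effect itself is negligible.

```python
# Hypothetical numbers: true sample mean 100.1 vs. hypothesized 100, sigma = 15.
from math import sqrt
from statistics import NormalDist

mu0, sigma = 100.0, 15.0        # hypothesized mean and (known) population SD
xbar, n = 100.1, 1_000_000      # observed sample mean and a very large n

se = sigma / sqrt(n)                          # standard error of the mean
z = (xbar - mu0) / se                         # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided P-value

# A 95% confidence interval for mu shows the size of the effect.
z_star = NormalDist().inv_cdf(0.975)
ci = (xbar - z_star * se, xbar + z_star * se)

print(f"z = {z:.2f}, P-value = {p_value:.4g}")       # highly significant...
print(f"95% CI for mu: ({ci[0]:.3f}, {ci[1]:.3f})")  # ...but the effect is ~0.1 unit
```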
When Is It Not Valid for All Data?
Badly designed experiments and surveys often produce invalid results – randomization is paramount!
Also ask: is the data from a normal distribution?
HAWTHORNE EFFECT
Does background music cause an increase in productivity? After the study was discussed with the workers, a significant increase in productivity occurred.
Problems: there was no control group, and the workers knew they were being studied. Any change would likely have produced similar effects.
Beware of Multiple Analyses
If you test long enough, you will eventually find significance by random chance alone.
Do not go on a "witch-hunt," looking for variables that already stand out and then performing the test of significance on those.
Exploratory searching is OK – but then design a study to confirm what you found.
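A minimal simulation sketch (hypothetical, not from the slides) of this point: run many z-tests on data where H0 is actually true and count how many come out "significant" at α = 0.05 purely by chance.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)
alpha, n, n_tests = 0.05, 30, 200
false_positives = 0

for _ in range(n_tests):
    # Sample from N(0, 1); the null hypothesis mu = 0 is actually TRUE.
    sample = [random.gauss(0, 1) for _ in range(n)]
    xbar = sum(sample) / n
    z = xbar / (1 / sqrt(n))                   # known sigma = 1
    p = 2 * (1 - NormalDist().cdf(abs(z)))     # two-sided P-value
    false_positives += p < alpha

print(f"{false_positives} of {n_tests} tests 'significant' at alpha = {alpha}")
# Expected count is about alpha * n_tests = 10 -- significance found by chance alone.
```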
ACCEPTANCE SAMPLING
A decision MUST be made at the end of an inference study: accept the lot or reject the lot.
H0: the batch of potato chips meets standards
Ha: the potato chips do not meet standards
We hope our decision is correct, but we could accept a bad batch, or we could reject a good one.
TYPE I AND TYPE II ERRORS
If we reject H0 (accept Ha) when in fact H0 is true, this is a Type I error.
If we fail to reject H0 (accept H0) when in fact Ha is true, this is a Type II error.
EXAMPLE: ARE THE POTATO CHIPS TOO SALTY?
The mean salt content is supposed to be 2.0 mg, and the content varies normally with σ = 0.1 mg.
An inspector takes n = 50 chips and tests each one. The entire batch is rejected if the mean salt content of the 50 chips is significantly different from 2.0 mg at the 5% level.
Hypotheses? z* values? Draw a picture with the acceptance and rejection regions shaded.
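As a check on the setup, here is a minimal Python sketch (not part of the slides) that computes z* and the acceptance region for H0: μ = 2.0 vs. Ha: μ ≠ 2.0, using the slide's numbers (σ = 0.1 mg, n = 50, α = 0.05) and the standard-library NormalDist.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 2.0, 0.1, 50, 0.05
se = sigma / sqrt(n)                         # standard error of x-bar
z_star = NormalDist().inv_cdf(1 - alpha/2)   # two-sided critical value, ~1.96

lower = mu0 - z_star * se
upper = mu0 + z_star * se
print(f"z* = {z_star:.3f}")
print(f"Accept H0 if {lower:.4f} <= x-bar <= {upper:.4f}; reject otherwise")
# Prints an acceptance interval of roughly (1.9723, 2.0277) mg.
```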
EXAMPLE: ARE THE POTATO CHIPS TOO SALTY?
What if we actually have a batch where the true mean is μ = 2.05 mg? There is a good chance that we will reject this batch – but what if we don't? What if we accept H0 and fail to reject the out-of-spec, bad batch?
This would be an example of a Type II error: accepting μ = 2 when in reality μ = 2.05.
Finding the Probability of a Type II Error (Example: Are the Potato Chips Too Salty?)
Step 1: Find the interval of acceptance for sample means, assuming μ = μ0 = 2.0: approximately (1.9723, 2.0277).
Step 2: Find the probability that this region would contain a sample mean when the true mean is μa = 2.05. Standardize each endpoint of the interval relative to μa = 2.05 and find the area of the alternative distribution that overlaps the H0 acceptance interval.
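A minimal Python sketch of these two steps, using the same numbers as the slide (μ0 = 2.0, σ = 0.1 mg, n = 50, α = 0.05, μa = 2.05 mg):

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_a, sigma, n, alpha = 2.0, 2.05, 0.1, 50, 0.05
se = sigma / sqrt(n)
z_star = NormalDist().inv_cdf(1 - alpha/2)
lower, upper = mu0 - z_star * se, mu0 + z_star * se   # acceptance interval under H0

# P(Type II) = P(lower <= x-bar <= upper) when x-bar ~ N(mu_a, se)
z_lo = (lower - mu_a) / se
z_hi = (upper - mu_a) / se
beta = NormalDist().cdf(z_hi) - NormalDist().cdf(z_lo)
print(f"beta = P(Type II error | mu = {mu_a}) = {beta:.4f}")   # about 0.058
```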
So β ≈ 0.058 … a Type II error … we are likely to (in error) accept almost 6% of batches that are too salty at the 2.05 mg level.
And α = 0.05 … a Type I error … we are likely to (in error) reject 5% of batches that are at the ideal 2 mg level.
SIGNIFICANCE AND TYPE I ERROR
The significance level α of any fixed-level test is the probability of a Type I error. That is, α is the probability that the test will reject H0 when H0 is in fact true.
POWER
The probability that a fixed-level significance test will reject H0 when a particular alternative Ha is in fact true is called the power of the test against that alternative.
The power of a test is 1 minus the probability of a Type II error for that alternative: Power = 1 − β.
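Continuing the potato-chip sketch (same assumed numbers as above), the power of the test against μa = 2.05 mg is simply 1 − β:

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_a, sigma, n, alpha = 2.0, 2.05, 0.1, 50, 0.05
se = sigma / sqrt(n)
z_star = NormalDist().inv_cdf(1 - alpha/2)
lo, hi = mu0 - z_star * se, mu0 + z_star * se          # acceptance region under H0
beta = NormalDist().cdf((hi - mu_a)/se) - NormalDist().cdf((lo - mu_a)/se)
print(f"power = 1 - beta = {1 - beta:.3f}")            # about 0.94
```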
INCREASING POWER
Increase alpha (α) – but α and β "work at odds" with each other.
Consider an alternative (Ha) farther away from μ0.
Increase the sample size (n).
Decrease sigma (σ).
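A small sketch showing each lever (the power function below is a hypothetical helper, not from the slides): starting from the potato-chip baseline, raising α, moving the alternative farther from μ0, raising n, or lowering σ each increases the power.

```python
from math import sqrt
from statistics import NormalDist

def power(mu_a, n=50, sigma=0.1, alpha=0.05, mu0=2.0):
    """Power = 1 - beta for the two-sided fixed-level z-test of H0: mu = mu0."""
    se = sigma / sqrt(n)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    lo, hi = mu0 - z_star * se, mu0 + z_star * se       # acceptance region under H0
    beta = NormalDist().cdf((hi - mu_a) / se) - NormalDist().cdf((lo - mu_a) / se)
    return 1 - beta

print(f"baseline (mu_a = 2.05):     {power(2.05):.3f}")              # ~0.94
print(f"larger alpha (0.10):        {power(2.05, alpha=0.10):.3f}")  # up, but more Type I risk
print(f"farther alternative (2.08): {power(2.08):.3f}")              # up
print(f"larger sample (n = 100):    {power(2.05, n=100):.3f}")       # up
print(f"smaller sigma (0.05):       {power(2.05, sigma=0.05):.3f}")  # up
```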