Common Statistical Mistakes
Mistake #1 Failing to investigate data for data entry or recording errors. Failing to graph data and calculate basic descriptive statistics before analyzing data.
Example: Wrong Decision Due to Error
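One-sample t-test output: Test of mu = vs mu not = (hypothesized value not recovered). Columns: Variable, N, Mean, StDev, SE Mean, T, P, and 95.0% CI. Rows: With, Without. Most numeric values were not recovered; only the lower confidence limits survive: With (23.513, ), Without (23.741, ).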
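A minimal sketch of the screening step that Mistake #1 asks for, using only Python's standard library. The readings are hypothetical (they are not the data behind the output above), and the 1.5 × IQR fence is one common rule of thumb chosen here as an assumption, not a method named on the slides.

```python
import statistics

# Hypothetical measurements; 2.39 stands in for a misplaced decimal point.
readings = [23.1, 24.0, 23.8, 24.5, 2.39, 23.9, 24.2]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Basic descriptive statistics, computed before any hypothesis test.
suspects = iqr_outliers(readings)
print("mean:", round(statistics.mean(readings), 2))
print("median:", statistics.median(readings))
print("stdev:", round(statistics.stdev(readings), 2))
print("values to double-check:", suspects)
```

A flagged value is a prompt to go back to the original record, not a license to delete it.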
Mistake #2 Using the wrong statistical procedure in analyzing your data. Includes failing to check that necessary assumptions are met.
Example: Wrong Decision Due to Wrong Analysis. Pulse Rates Before and After Marching. Table columns: Student, BEFORE, AFTER, DIFF A-B (values not recovered). Paired data design, so analyze with a paired t-test.
Example: Wrong Decision Due to Wrong Analysis. Paired T for AFTER - BEFORE. Columns: N, Mean, StDev, SE Mean; rows: AFTER, BEFORE, Difference (values not recovered). CI for mean difference (confidence level not recovered): (2.99, 19.01). T-Test of mean difference = 0 (vs not = 0): T-Value = 4.37, P-Value = 0.02. Conclude that the mean pulse rate after marching is greater than the mean pulse rate before.
Example: Wrong Decision Due to Wrong Analysis. Two sample T for AFTER vs BEFORE. Columns: N, Mean, StDev, SE Mean; rows: AFTER, BEFORE (values not recovered). CI for mu AFTER - mu BEFORE (confidence level not recovered): (-15.3, 37.3). T-Test mu AFTER = mu BEFORE (vs not =): T = 1.07, P = 0.33, DF = 5. Ignoring the pairing, we would wrongly conclude that there is no difference in mean pulse rates before and after marching.
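To see why the two analyses disagree, here is a rough sketch that computes both t statistics by hand. The pulse rates are hypothetical (the slide's table values were not recovered), and `paired_t` and `two_sample_t` are illustrative helper names, not functions from any package. The two-sample version uses the unpooled standard error.

```python
import math
import statistics

# Hypothetical pulse rates for four students (not the slide's data).
before = [60, 72, 85, 100]
after = [70, 80, 100, 113]

def paired_t(x, y):
    """t statistic for the mean of the within-subject differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

def two_sample_t(x, y):
    """Unpooled t statistic, treating x and y as independent samples."""
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se

t_paired = paired_t(after, before)
t_unpaired = two_sample_t(after, before)
print(f"paired t = {t_paired:.2f}, two-sample t = {t_unpaired:.2f}")
```

Students differ a lot from one another, but each student's change is fairly consistent. The paired analysis removes the student-to-student variation; the two-sample analysis leaves it in the standard error, which shrinks the t statistic.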
Mistake #3 Failing to design your study so that it has high enough power to call meaningful differences “significantly different.” Includes concluding that the null hypothesis is true when a test is not significant. The correct conclusion is “not enough evidence to say the null is false.”
Example: Low Power. Success = Yes, I recycle. Test and CI for two proportions by Gender (rows: Male, Female). Columns: X, N, Sample p (values not recovered). The estimate and CI for p(1) - p(2), and the Z and P-Value for the test of p(1) - p(2) = 0 (vs not = 0), were also not recovered. A number of students said that they were surprised that the hypothesis test said “no difference in percentages.”
Example: Low Power. Power and Sample Size, Test for Two Proportions. Testing proportion 1 = proportion 2 (versus not =). Calculating power for: proportion 1 = 0.55 and proportion 2 = 0.70. Alpha = 0.05. Table columns: Sample Size, Power (values, and the Difference, not recovered). *Sample size = # in EACH group.
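The power table did not survive, but the calculation can be sketched with a normal approximation. The proportions (0.55, 0.70) and alpha (0.05) come from the slide; the sample sizes below are hypothetical, and `power_two_proportions` is an illustrative helper whose results may differ slightly from what a statistics package reports.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    z = NormalDist()
    # Unpooled standard error of the difference in sample proportions.
    se = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    shift = abs(p1 - p2) / se
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Proportions and alpha are from the slide; sample sizes are hypothetical.
for n in (50, 100, 200):
    print(n, round(power_two_proportions(0.55, 0.70, n), 3))
```

With small groups, even a 15-point difference in true proportions is more likely than not to go undetected, which is why a non-significant result should not be read as “no difference.”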
Mistake #4 Failing to report a confidence interval as well as the P-value. The P-value tells you whether the result is statistically significant. The confidence interval tells you what the population value might be.
Example: A Significant, but Potentially Meaningless Difference. Two sample T for Phone by Gender (rows: Male, Female). Columns: N, Mean, StDev, SE Mean (values not recovered). CI for mu (1) - mu (2) (confidence level not recovered): (-142, -5). T-Test mu (1) = mu (2) (vs not =): T and P not recovered, DF = 135. The P-value tells us there is a significant difference, but the confidence interval tells us that the difference in the averages could be as small as 5 minutes.
Incidentally… Outliers
Removing Outliers… Two sample T for Phone by Gender (rows: Male, Female). Columns: N, Mean, StDev, SE Mean (values not recovered). CI for mu (1) - mu (2): ( , -35), lower limit not recovered. T-Test mu (1) = mu (2) (vs not =): T and P not recovered, DF = 121. The difference in male and female phone usage becomes even more significant. We are 95% confident that the difference in the averages is now more than 35 minutes.
Mistake #5 “Fishing” for significant results. That is, performing several hypothesis tests on a data set, and reporting only those results that are significant. If α = P(Type I error) = 0.05, and we perform 20 tests on the same data set in which every null hypothesis is true, we can expect to make 1 Type I error (0.05 × 20 = 1).
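The arithmetic can be checked in a few lines. The expected count comes straight from the slide; the chance of at least one false positive is an added illustration that ASSUMES the 20 tests are independent, which tests run on the same data set often are not, so treat it as a rough guide.

```python
alpha = 0.05
n_tests = 20

# Expected number of false positives if every null hypothesis is true.
expected_type_i = alpha * n_tests

# Chance of at least one false positive, ASSUMING the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(expected_type_i, round(p_at_least_one, 3))
```

Under that assumption, a set of 20 tests is more likely than not to produce at least one “significant” result by chance alone.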
Example: Results Obtained from Fishing Primary driver of $10,000 vehicle and going away for Spring Break are related (P=0.01). Virginity and supporting self through school are related (P = 0.045). Virginity and graduating in four years are related (P = 0.041). Virginity and attending non-football PSU sports events are related (P = 0.016).
Mistake #6 Overstating the results of an observational study. That is, suggesting that one variable “caused” the differences in the other variable, as opposed to correctly saying that the two variables are “associated” or “correlated.” Don’t forget that a significant result may be “spurious.”
Example: Misleading Headlines Virgins don’t support themselves through school. Non-virgins too busy to go to non-football PSU sporting events. Non-virgins also too busy to graduate in four years.
Mistake #7 Using a non-random or unrepresentative sample. Includes extending the results of an unrepresentative sample to the population.
Example: Unrepresentative sample. Shere Hite wrote a book in 1987 called “Women and Love.” 100,000 questionnaires about love, sex, and relationships were sent to women’s groups. Only 4,500 questionnaires were returned (a 4.5% response rate). The entire book is devoted to the results of the survey. Examples: 91% of divorcees initiated the divorce; 70% of women married 5 years committed adultery.
Mistake #8 Failing to use all of the basic principles of experiments, including randomization, blinding, and controlling.