Published by Teguh Budiman. Modified over 6 years ago.
1
Statistical Fallacies Catastrophes & Contributions,
Statistical Literacy for Managers (StatLit for Managers) / Analyzing Numbers in the News
2013 1 March / 15 May 2008
Milo Schield, Augsburg College
Member: International Statistical Institute
US Rep: International Statistical Literacy Project
Director, W. M. Keck Statistical Literacy Project
VP, National Numeracy Network
August 1, 2016
First big idea: Statistical educators at JSM are an extremely biased sample compared to their students in abilities and majors.
2008SchieldNNN6up.pdf 2013Schield-MBAA
2
Core Concepts in Intro Stats
McKenzie (2004): Survey of Educators
(2007): Big Ideas in Statistics
Garfield & Ben Zvi (2008): Big Ideas of Statistics
Gould-Miller-Peck (2012): Five Big Ideas
(2013): 10 Big Ideas, Stat110
Stigler (2016): Seven Pillars of Statistical Wisdom
3
Garfield & Ben Zvi (2008) Big Ideas of Statistics
Reasoning about Data
Reasoning about Models & Modeling
Reasoning about Distribution
Reasoning about Center
Reasoning about Variability
Reasoning about Comparing Groups
Reasoning about Samples & Sampling
Reasoning about Statistical Inference
Reasoning about Covariation
4
Ambiguity of Importance
"Important" can mean important as:
a topic (randomness) or a claim (ME ~ 1/sqrt(n));
a source for the ideas/relations in a discipline;
a source of extensive social benefit or cost; or
a source of cognitive misunderstanding (fallacy).
In this talk, an important idea is a claim involving a fallacy, or one with extensive social benefit or cost (a contribution or a catastrophe).
5
Ambiguity of Importance
Important as a topic (randomness) or as a claim (ME ~ 1/sqrt(n)).
This paper focuses on claims or relationships having substantial social or cognitive consequences.
6
The Most Dangerous Equation
In Picturing the Uncertain World, Howard Wainer argued that de Moivre's equation was the most dangerous equation in the world, among those that are unknown or ignored. Wainer gave six great examples.
7
Utts (2003) 7 Things Citizens Should Know
1. Association vs. causation: clinical trial (random assignment) vs. observational study (confounding)
2. Statistical significance vs. practical importance
3. ‘No effect’ vs. ‘no significant effect’
4. Types/sources of bias
5. Coincidences can be common
6. Confusion of the inverse
7. Normal vs. average
8
#1: Statistics are Numbers in Context
The “Statistics are just numbers” fallacy: numbers are facts, and so are statistics.
Isaacson (2012): Where do Statistics come from.
9
#1A: Statistical Fallacies Probability Fallacies
Each fallacy is the mistaken belief shown after its name.
Confusion of the inverse: P(A|B) = P(B|A)
Cf. medical tests: the chance that a diseased person will test positive vs. the chance that a person testing positive has the disease.
Conjunction fallacy: P(A&B) > P(A)
The chance that Linda is a bank teller and an active feminist is judged greater than the chance that she is a bank teller.
Three-factor fallacy: P(A&B |C) > P(A |B&C)
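The medical-test contrast can be made concrete with Bayes' rule. The following is a minimal sketch; the prevalence, sensitivity and specificity values are illustrative assumptions, not figures from the talk, and the function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(disease | positive test) using Bayes' rule."""
    true_pos = prevalence * sensitivity                    # diseased and positive
    false_pos = (1.0 - prevalence) * (1.0 - specificity)   # healthy but positive
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs, chosen only for illustration.
prevalence = 0.01     # P(disease)
sensitivity = 0.90    # P(positive | disease)
specificity = 0.95    # P(negative | no disease)

ppv = positive_predictive_value(prevalence, sensitivity, specificity)
print(f"P(positive | disease) = {sensitivity:.2f}")
print(f"P(disease | positive) = {ppv:.3f}")
```

Under these assumed inputs the two conditional probabilities differ by a wide margin, which is the point of the fallacy: P(A|B) and P(B|A) are equal only in special cases.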
10
#1B: Statistical Fallacies Individuals vs. Groups
1. Individual fallacy: from individuals to group
The rich are more likely to vote Republican than the poor. Yet richer states tend to vote Democrat.
2. Ecological fallacy: from group to individuals
3. Simpson’s Paradox: from groups to subgroups or vice versa
11
#1B2: Statistical Fallacies Ecological Fallacy
12
#1B3: Statistical Fallacies Simpson’s Paradox (Before)
13
#1B3: Statistical Fallacies Simpson’s Paradox (After)
14
#1D: Statistical Fallacies Coincidence-Causation
The fallacy: any statistically significant event/connection is evidence of causation; a coincidence is too unlikely to be just chance.
Law of Very Large Numbers (Qual/Quant)
* The unlikely is almost certain given enough tries
* If P = 1/N, the event is more likely than not in N tries
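The quantitative form can be checked directly. This is a small sketch that assumes the N tries are independent (an assumption not stated on the slide); the helper name is hypothetical.

```python
import math

def prob_at_least_once(p, tries):
    """Chance that an event with per-try probability p happens at least
    once in `tries` independent tries."""
    return 1.0 - (1.0 - p) ** tries

# With P = 1/N, the chance of at least one occurrence in N tries
# stays above one half for every N shown here.
for n in (2, 10, 100, 1_000_000):
    print(n, round(prob_at_least_once(1.0 / n, n), 4))

# As N grows, the chance approaches 1 - 1/e.
limit = 1.0 - math.exp(-1.0)
print(round(limit, 4))
```

Because (1 - 1/N)^N increases toward 1/e, which is below one half, the event is more likely than not in N tries for any N under the independence assumption.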
15
#1D: Statistical Fallacies Coincidence-Causation
The Law of Truly Large Numbers is “sometimes called the Jeane Dixon effect (see also Postdiction). It holds that the more predictions a psychic makes, the better the odds that one of them will "hit". Thus, if one comes true, the psychic expects us to forget the vast majority that did not happen.”
16
#1D: Statistical Fallacies Coincidence-Causation
A Swedish study in 1992 looked at the incidence of poor health (800 ailments) among those living close to high-voltage power lines over a 25-year period. The study found that the incidence of childhood leukemia was four times higher among those that lived closest to the power lines. They failed to compensate for the look-elsewhere effect; in any collection of 800 random samples, it is likely that at least one will be at least 3 standard deviations above the expected value, by chance alone.
https://en.wikipedia.org/wiki/Law_of_truly_large_numbers
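The "likely" in the look-elsewhere claim can be estimated with a short calculation. This sketch assumes the 800 comparisons are independent and approximately normal, which is a simplifying assumption and not something the study or the slide states.

```python
import math

def upper_tail_normal(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Chance that a single comparison lands 3 or more SDs above expectation.
p_single = upper_tail_normal(3.0)

# Chance that at least one of 800 independent comparisons does so.
p_any = 1.0 - (1.0 - p_single) ** 800

print(f"single comparison: {p_single:.5f}")
print(f"at least one of 800: {p_any:.2f}")
```

Under these assumptions the chance of at least one such result is roughly two in three, so a single extreme finding among 800 ailments is weak evidence on its own.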
17
#1A: Statistical Fallacies Non-Traditional: Confounding
“Statistics are just numbers”
Statistical significance is permanent!
* Permanent in repeated trials
* Permanent regardless of context
Any/every observational association can be nullified/reversed by an unknown confounder.
18
#2A: Statistical Principles
De Moivre’s equation: SE ~ 1/sqrt(n)
SE is independent of the size of the population.
Applications: Hot spots, coincidences, Birthday problem.
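A short numerical sketch of both points, using the standard error of a sample proportion as the example statistic. The choice of a proportion, the sample sizes and the population sizes are illustrative assumptions; the finite population correction is the usual one for simple random sampling without replacement.

```python
import math

def se_proportion(p, n, population=None):
    """Standard error of a sample proportion.

    If `population` is given, apply the finite population correction
    for simple random sampling without replacement.
    """
    se = math.sqrt(p * (1.0 - p) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return se

# SE ~ 1/sqrt(n): quadrupling the sample size halves the standard error.
print(se_proportion(0.5, 400), se_proportion(0.5, 1600))

# Population size barely matters once it is much larger than the sample.
print(se_proportion(0.5, 1000, population=100_000))
print(se_proportion(0.5, 1000, population=300_000_000))
```

The last two values differ by well under one percent even though the populations differ by a factor of 3,000.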
19
#2B: Statistical Principles
Law of Very Large Numbers.
Qualitative: The unlikely is almost certain given enough tries.
Quantitative: An outcome is more likely than not given N tries when P = 1/N.
Applications: Hot spots, coincidences, Birthday problem.
20
#2B: Algebra in 8th Grade is Better
Overall, college attendance was 35% more prevalent among Algebra 8 students (62%) than Math 8 students (46%). P < .001
For students with similar math scores, college attendance was 32% more prevalent among Algebra 8 students (45%) than Math 8 students (34%), but the difference in rates was not statistically significant (samples of 128 vs. 136).
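The figures on this slide can be reproduced with a pooled two-proportion z-test. This is a sketch, not the original study's analysis: it assumes the 128 students are the Algebra 8 group and the 136 are the Math 8 group (the slide does not say which is which), and that a two-sided test at the 5% level was used.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic and its two-sided p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value

# "X% more prevalent" is the ratio of the two rates minus one.
ratio_overall = 0.62 / 0.46   # about 1.35, i.e. 35% more prevalent
ratio_matched = 0.45 / 0.34   # about 1.32, i.e. 32% more prevalent

# Assumed group sizes: 128 Algebra 8 students, 136 Math 8 students.
z, p_value = two_proportion_z(0.45, 128, 0.34, 136)
print(round(ratio_overall, 2), round(ratio_matched, 2))
print(round(z, 2), round(p_value, 3))
```

Under these assumptions the relative difference stays almost the same after matching on math scores, while the smaller samples leave the p-value above 0.05.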
21
#2B: Keys “Seven Countries” Study
1958: Countries with the highest fat consumption had the most heart disease.
This study supported the “Fat is bad” health recommendations and the introduction of “low-fat” foods (which tended to be “high-carb” foods).
22
#2B: Plausibility versus provability
Most journalistically significant findings are based on observational studies and involve associations that are plausible.
But 80-90% of the claims coming from supposedly scientific studies in major journals fail to replicate. They can’t be scientifically proven.
research-today-a-lot-thats-published-is-junk/#482fd39520b8
23
#2B: Benefit of Observational Studies
“Observational studies are only good for generating hypotheses.”
flawed-studies/
No! Good observational studies are needed where randomization and treatment are impossible, unethical or infeasible.
24
Cornfield Conditions
In replying to Fisher, Cornfield proved a necessary condition for a confounder to nullify (or reverse) an observed association.
“Cornfield's minimum effect size is as important to observational studies as is the use of randomized assignment to experimental studies.” Schield (1999), Simpson’s Paradox & the Cornfield Conditions
25
Contribution: The Cornfield Conditions
Data showed that smokers were 10 times as likely to develop lung cancer as were non-smokers. Some statisticians wanted to support the claim that smoking “caused” lung cancer. Sir Ronald Fisher (1958) noted that “association was not causation” and that there was a difference (factor of two) in smoking preference between fraternal and identical twins. Cornfield et al. (1959) argued that to nullify or reverse the observed association, the relative risk of a confounder must exceed the relative risk of that association. Fisher never replied.
26
Stratification Two-Way Half Tables
Patient Died      “Good”   “Poor”   TOTAL
City Hospital     1%       6%       5.5%
Rural Hospital    2%       7%       3.5%
TOTAL             1.5%     6.5%
Patient at City is 2 pts more likely to die than at Rural.
Patient in Poor condition is 5 pts more likely to die than is a Patient in Good condition.
Association with Outcome: Confounder > Predictor
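The table can be explored with a small standardization sketch. The stratum death rates come from the table; the shares of patients in poor condition (90% at City, 30% at Rural) are not on the slide and are inferred here from the TOTAL column, and the common 60% mix used for standardizing is a purely hypothetical choice.

```python
# Death rates by hospital and patient condition (from the table).
rates = {
    "City": {"Good": 0.01, "Poor": 0.06},
    "Rural": {"Good": 0.02, "Poor": 0.07},
}

# Assumption: share of patients in poor condition, inferred so that
# the weighted averages reproduce the TOTAL column (5.5% and 3.5%).
poor_share = {"City": 0.9, "Rural": 0.3}

def overall_rate(hospital, share_poor):
    """Death rate for a hospital given its share of poor-condition patients."""
    r = rates[hospital]
    return (1.0 - share_poor) * r["Good"] + share_poor * r["Poor"]

# Crude rates: each hospital weighted by its own patient mix.
crude = {h: overall_rate(h, poor_share[h]) for h in rates}

# Standardized rates: both hospitals weighted by one common mix.
common_mix = 0.6  # hypothetical common share of poor-condition patients
standardized = {h: overall_rate(h, common_mix) for h in rates}

print(crude)
print(standardized)
```

With its own patient mix, City shows the higher death rate; with a common mix, City shows the lower rate. Any common mix gives the same direction, because City's rate is lower within each condition.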
27
Stratification Two-Way Half Tables
Patient Died “Good” “Poor” TOTAL
City Hospital 1% 6% 3%
Rural Hospital 2% 7% 1.2% 3.8% 2.7%
Patient at City is 2 pts more likely to die than at Rural.
Patient in Poor condition is 5 pts more likely to die than is a Patient in Good condition.
Association with Outcome: Confounder > Predictor
28
Cornfield Condition for Nullification or Reversal
Schield (1999) based on realistic data
29
Cornfield Condition for Nullification or Reversal
Schield (2004) IASE
30
Cornfield Condition for Nullification or Reversal
An association is nullified or reversed only if:
1. the confounder (patient condition) has a stronger association with the outcome (death) than does the predictor (hospital), and
2. the predictor (hospital) has a stronger association with the confounder (patient condition) than with the outcome (death).
31
Effect Sizes: Relative Risk
Obese vs. non-Obese
Chambers and Wakley (2002). Obesity and Overweight Matters in Primary Care.
32
Confounder Distribution: Simple One-Parameter Model
Assume: RR of confounders is distributed exponentially with a minimum RR of one and a mean RR of two.
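One way to read this model is as a shifted exponential: the confounder's relative risk equals the minimum plus an exponential excess with mean (mean RR minus minimum RR). That reading is an assumption, as are the function names below. Combined with the Cornfield condition (a confounder must have a relative risk above the observed one to nullify it), the sketch gives the observed relative risk that a random confounder would fail to exceed 95% of the time.

```python
import math

def prob_confounder_exceeds(rr, mean_rr=2.0, min_rr=1.0):
    """P(confounder RR > rr) under a shifted exponential model."""
    if rr <= min_rr:
        return 1.0
    return math.exp(-(rr - min_rr) / (mean_rr - min_rr))

def resistant_rr(confidence, mean_rr=2.0, min_rr=1.0):
    """Observed RR that a random confounder fails to exceed with
    probability `confidence` under the same model."""
    return min_rr - (mean_rr - min_rr) * math.log(1.0 - confidence)

print(round(resistant_rr(0.95), 2))             # RR threshold at 95%
print(prob_confounder_exceeds(10.0))            # chance of exceeding RR = 10
```

Under these assumptions the 95% threshold is a relative risk of about 4. Since the Cornfield condition is necessary rather than sufficient, this figure is at best a conservative screen, not a guarantee.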
33
Effect Sizes: Relative Risk 95% Confounder Resistant
Obese vs. non-Obese
34
Contributions of Statistics to Human Knowledge
35
#2B: More Math is Better
“Math is a gatekeeper. The highest math class a high school senior takes has a major influence on both college acceptance and college choice.” President, Calif School Boards Association
Students who completed algebra in the eighth grade stayed in the mathematics pipeline longer and attended college at greater rates than those who did not.
36
#1C: Statistical Fallacies Lieberson (1985)
The selectivity problem due to pseudo-controls
Contamination of the control group by treatment
Asymmetric causation (irreversible processes)
Using high R² as the goal of a good explanation
Presuming that adding more control variables takes one closer to the truth
Lieberson, S. (1985). Making It Count: The Improvement of Social Research and Theory. University of California Press.
37
#2B: Smaller Class Sizes are Better
38
Augsburg Student Survey: Seven Most Important
1 All sources of influence (Take CARE)
2 Confounding
2 Hypothetical thinking: confounders, definitions
4 Statistics are more than numbers
5 Association-causation & Randomness (Luck vs. skill)
5 Bias: Placebo, Single blind; double blind
5 Named Ratios grammar; Percent, Percentages, Rates
39
Conclusion
Introductory statistics must be re-engineered:
Allow for differences in student aptitudes
Allow for differences in student interests
Increase focus on multivariate thinking & confounding
Increase focus on context: Where do stats come from?
40
References
McKenzie, John, Jr. (2004). Teaching the Core Concepts. ASA
Schield, M. (2015). Statistical Inference for Managers. ASA
Schield, M. (2014). Two Big Ideas for Teaching Big Data: ECOTS.
Berendsen, Hadlich and van Amersfoort (2011). Is Conjunction Fallacy Really a Fallacy? content/uploads/2010/12/Looking-at-Linda.pdf
41
McKenzie (2004) Core Concepts in Intro Stats
McKenzie (2004) asked statistical educators to pick the top-three core concepts in intro statistics:
75% Variation
31% Association vs. causation
25% Hypothesis tests
24% Sampling distribution
22% Confidence intervals
14% Randomness and statistical significance
%: Percentage of votes by statistical educators
Sample size: % ME = 12 percentage points
42
Verkuilen@UIUC (2013) Big Ideas in Probability/Stats
Law of Large Numbers
Central Limit Theorem
Additional Big Ideas:
de Moivre's Equation
The Gauss–Markov theorem
Cochran's theorem
43
Chen@Harvard (2014) Big Ideas in Stat 111 Theory
Bayes rule and Data generation
Likelihood functions
Point estimators: MLE, MOME, MAP
Interval estimates: Exact, Asymptotic, etc.
Calculus: Transformation, Lagrange, MLE
Sufficient statistics; pivotal quantities
Bias, Variance, Information
Asymptotic behavior: MLE, Bayes posterior
Power & Hyp. Testing. Sample size