Thoughts on the Future of Statistics Teaching in the light of Big Data

1 Thoughts on the Future of Statistics Teaching in the light of Big Data
Helmut Schneider, PhD, and Xuan Wang. Louisiana State University, Stephenson Dept. of Entrepreneurship and Decision Sciences

2 Overview
Hypothesis Testing. Causal Inference. Big Data.
References: Miguel A. Hernan and James M. Robins, Causal Inference; Judea Pearl.

3 Hypothesis Testing
Formulate a theory. State the hypotheses: H0 versus H1. Take a sample. Compute statistics. Make a decision. What is the reason for these steps?
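The steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the presenters' material: the function name `z_test`, the simulated sample, and the hypothesized mean of 10.0 are all assumptions, and a large-sample z-test stands in for whatever test a course would actually use.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def z_test(sample, mu0, alpha=0.05):
    """Two-sided large-sample z-test of H0: mean == mu0 versus H1: mean != mu0."""
    n = len(sample)
    # Compute the test statistic from the sample.
    z = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    # Two-sided p-value under the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Decision: reject H0 when the p-value falls below alpha.
    return z, p_value, p_value < alpha

# Hypothetical data: 200 simulated measurements (assumption, not from the slides).
rng = random.Random(0)
sample = [rng.gauss(10.5, 2.0) for _ in range(200)]

z, p_value, reject = z_test(sample, mu0=10.0)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

Each line of the function maps to one step on the slide: the hypotheses are fixed by `mu0`, the sample is passed in, the statistic is computed, and the decision is the final comparison with `alpha`.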

4 Problem Identification: Traditional Data Sources versus Big Data
Traditional data sources: small volume (low statistical power); limited variety (biased estimates); low velocity (estimates may not be valid in the future).
Big data (untapped sources): high volume (high statistical significance, small p-value); high variety (small bias); high velocity (dynamic update of estimates).

5 Statistical Significance versus Practical Significance
Accounting faculty research… Auditors take samples…

6 Statistical Significance versus Practical Significance
"Cancer Doctors Cite Risks of Drinking Alcohol": 12 million women and over a quarter of a million breast cancer cases. Statistical significance versus practical significance: a relative risk increase of 9% (risk ratio) versus a risk difference of 0.18 percentage points.
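The gap between the two figures on this slide comes down to simple arithmetic. The sketch below assumes a baseline risk of 2%, which is not stated on the slide; it is back-solved from the slide's own numbers (0.18 percentage points divided by 9%) and is roughly in line with a quarter of a million cases among 12 million women. The function name `risk_measures` is hypothetical.

```python
def risk_measures(risk_unexposed, risk_exposed):
    """Return (risk ratio, risk difference) for two absolute risks."""
    risk_ratio = risk_exposed / risk_unexposed
    risk_difference = risk_exposed - risk_unexposed
    return risk_ratio, risk_difference

# Assumption: 2% baseline risk, implied by the slide's figures but not stated on it.
baseline = 0.02
exposed = baseline * 1.09  # a 9% relative increase

rr, rd = risk_measures(baseline, exposed)
print(f"relative increase: {(rr - 1) * 100:.0f}%")
print(f"risk difference: {rd * 100:.2f} percentage points")
```

The same effect reads as "9% higher risk" on the relative scale and as fewer than 2 additional cases per 1,000 women on the absolute scale.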

7 Big Data Implications
Big data makes everything statistically significant: with a large enough sample, even a tiny effect yields a small p-value. This is how the real world works. Implication for teaching statistics: students need to understand practical significance versus statistical significance.
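A small calculation shows how a fixed, tiny effect moves from "not significant" to "highly significant" purely because the sample grows. This is a sketch under stated assumptions: the two risks (2.00% versus 2.18%) reuse the illustrative figures from the alcohol example, the sample sizes are arbitrary, the helper `two_proportion_p_value` is hypothetical, and observed proportions are taken to equal the true risks so the result is deterministic.

```python
import math
from statistics import NormalDist

def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value of a pooled two-proportion z-test with equal group sizes."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = (p2 - p1) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same effect size every time; only the sample size changes.
sizes = [1_000, 10_000, 100_000, 6_000_000]
p_values = [two_proportion_p_value(0.0200, 0.0218, n) for n in sizes]

for n, p in zip(sizes, p_values):
    print(f"n per group = {n:>9,}: p = {p:.3g}")
```

The effect never changes, so the p-value here says more about the amount of data than about whether the difference matters in practice.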

8 Causal Inference
Correlation is not causation. Statisticians deal only with correlations, yet they also teach students that spurious correlations exist. Myth: in Big Data, correlation is causation. Students need to learn to judge causation.

9 Even in Big Data Correlation is not causation!
Need for students to learn about causality.

10 When Can We Make Causal Claims?
Randomized designs. Observational data, given a well-defined treatment, positivity, and exchangeability.

11 Confounding: Directed Acyclic Graphs (DAG)
[Figure: DAG with nodes Treatment, Outcome, and Confounder Factor.] Need for students to learn about confounding and DAGs.
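A short simulation can make confounding concrete. The sketch below is an assumed setup, not taken from the slides: a binary confounder Z raises both the chance of treatment and the chance of the outcome, while the treatment itself has no effect on the outcome. All probabilities and function names are hypothetical.

```python
import random

def simulate(n, seed=1):
    """Generate (z, t, y) rows where Z drives both T and Y and T has no effect on Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                   # confounder
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0   # treatment depends on Z
        y = 1 if rng.random() < (0.5 if z else 0.1) else 0   # outcome depends on Z only
        rows.append((z, t, y))
    return rows

def risk_difference(rows):
    """Outcome risk among the treated minus outcome risk among the untreated."""
    treated = [y for _, t, y in rows if t == 1]
    untreated = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

rows = simulate(100_000)

# Crude comparison ignores Z and picks up the spurious association.
crude = risk_difference(rows)

# Comparing within levels of Z removes the confounding in this setup.
by_stratum = {z: risk_difference([r for r in rows if r[0] == z]) for z in (0, 1)}

print(f"crude risk difference: {crude:.3f}")
for z, rd in by_stratum.items():
    print(f"risk difference within Z={z}: {rd:.3f}")
```

The crude comparison suggests a sizeable treatment effect, while the within-stratum comparisons stay close to zero, which is the true effect in this simulation. Collecting more rows would make the crude estimate more precise, not less biased.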

12 Statistical Significance versus Unbiased Estimates
[Figure relating Volume to Statistical Significance, Variety to Unbiased Estimates, Velocity to Timely Estimates, and Causality.]

13 Causal Inference

14 Conclusions
Students need to learn about the reasons for using hypothesis testing in today's Big Data environment. Students need to learn to judge practical significance versus statistical significance. Students need to learn about DAGs. Students need to learn about methods to establish causation.

