Thoughts on the Future of Statistics Teaching in the light of Big Data

Helmut Schneider, PhD, and Xuan Wang. Louisiana State University, Stephenson Dept. of Entrepreneurship and Decision Sciences.

Overview: Hypothesis Testing. Causal Inference. Big Data.
References: Judea Pearl, Causal Inference: http://bayes.cs.ucla.edu/home.htm. Miguel A. Hernan and James M. Robins, Causal Inference.

Hypothesis Testing: Formulate a theory. State the hypotheses: H0 versus H1. Take a sample. Compute statistics. Make a decision. What is the reason for these steps?
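The steps on this slide can be sketched as a one-sample, two-sided z-test. This is only an illustration: the sample mean, null mean, standard deviation, sample size, and the 0.05 threshold below are hypothetical values, not figures from the presentation, and the function name is an assumption.

```python
import math

def two_sided_z_test(sample_mean, null_mean, sd, n):
    """Return (z, p) for a two-sided one-sample z-test with known sd."""
    # Compute the statistic: distance of the sample mean from H0 in standard errors.
    z = (sample_mean - null_mean) / (sd / math.sqrt(n))
    # Two-sided tail probability under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# State the hypotheses: H0: mean = 100 versus H1: mean != 100 (hypothetical).
# Take a sample: n = 100 observations with mean 103 and known sd 15 (hypothetical).
z, p = two_sided_z_test(sample_mean=103.0, null_mean=100.0, sd=15.0, n=100)

# Make a decision at a conventional (assumed) significance level.
alpha = 0.05
reject = p < alpha
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

The decision rule only says whether the sample is hard to reconcile with H0; it says nothing about whether a 3-point difference matters in practice.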

Problem Identification: Traditional Data Sources versus Big Data
Traditional data sources: Small volume: low statistical power. Limited variety: biased estimates. Low velocity: estimates may not be valid in the future.
Untapped sources (Big Data): High volume: high statistical significance, small p-value. High variety: small bias. High velocity: dynamic update of estimates.

Statistical Significance versus Practical Significance Accounting faculty research… Auditors take samples…

Statistical Significance versus Practical Significance. Example: "Cancer Doctors Cite Risks of Drinking Alcohol". 12 million women and over a quarter of a million breast cancer cases. Relative increase in risk (risk ratio): 9%, versus risk difference: 0.18 percentage points.
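A short sketch of how the two measures on this slide relate. It assumes that "9%" means a relative increase (risk ratio 1.09) and that both figures describe the same comparison. The 2% baseline risk is not stated on the slide; it is simply the value implied by dividing 0.18 percentage points by 9%. The function name is hypothetical.

```python
def risk_measures(risk_unexposed, risk_exposed):
    """Return (risk ratio, risk difference) for two absolute risks."""
    risk_ratio = risk_exposed / risk_unexposed
    risk_difference = risk_exposed - risk_unexposed
    return risk_ratio, risk_difference

# Implied baseline (an inference, not a figure from the slide): 0.0018 / 0.09 = 0.02.
baseline = 0.0018 / 0.09
exposed = baseline * 1.09  # a 9% relative increase over the baseline

rr, rd = risk_measures(baseline, exposed)
# Number of exposed people per one additional case, under these assumptions.
people_per_extra_case = 1 / rd

print(f"risk ratio = {rr:.2f}")
print(f"risk difference = {rd * 100:.2f} percentage points")
print(f"about {people_per_extra_case:.0f} exposed people per additional case")
```

The same result reads as "9% higher risk" or as "0.18 percentage points higher risk"; the second framing is usually the one that speaks to practical significance.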

Big Data Implications: With very large samples, almost any nonzero difference becomes statistically significant. This is how the real world works. Implication for teaching statistics: students need to understand practical significance versus statistical significance.
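To illustrate the point, the sketch below holds a tiny effect fixed and lets the sample size grow. The effect size (1% of a standard deviation) and the sample sizes are hypothetical, and a simple two-sided z-test with known standard deviation stands in for whatever test would be used in practice.

```python
import math

def two_sided_p(effect, sd, n):
    """Two-sided p-value of a z-test for a mean difference `effect` at sample size n."""
    z = effect / (sd / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical effect: a difference of 1% of a standard deviation.
effect, sd = 0.01, 1.0

for n in (100, 10_000, 1_000_000, 100_000_000):
    print(f"n = {n:>11,}  p = {two_sided_p(effect, sd, n):.3g}")
```

The effect never changes, yet the p-value falls from far above any usual threshold to essentially zero, which is why a small p-value alone cannot establish practical importance in large data sets.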

Causal Inference: Correlation is not causation. Statisticians deal only with correlations, yet they also teach students that spurious correlations exist. Myth: in Big Data, correlation is causation. Students need to learn to judge causation.

Even in Big Data, correlation is not causation! Students need to learn about causality.

When Can We Make Causal Claims? Randomized designs. Observational data: well-defined treatment, positivity, exchangeability.

Confounding: Directed Acyclic Graphs (DAGs). Diagram nodes: Treatment, Outcome, Confounder Factor. Students need to learn about confounding and DAGs.
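A minimal numerical sketch of confounding, assuming the simplest DAG in which a binary confounder Z affects both treatment T and outcome Y while T has no effect on Y at all. All probabilities are made up for illustration, and the calculation is exact (no simulation). Standardizing over Z is one common adjustment method, not necessarily the one the presenters had in mind.

```python
# Hypothetical distribution: Z -> T and Z -> Y, with no arrow from T to Y.
p_z = {0: 0.5, 1: 0.5}            # P(Z = z)
p_t_given_z = {0: 0.2, 1: 0.8}    # P(T = 1 | Z = z); strictly between 0 and 1
p_y_given_tz = {                  # P(Y = 1 | T = t, Z = z), keyed by (t, z)
    (0, 0): 0.1, (1, 0): 0.1,     # within Z = 0 the risk does not depend on T
    (0, 1): 0.3, (1, 1): 0.3,     # within Z = 1 the risk does not depend on T
}

def crude_risk(t):
    """P(Y = 1 | T = t), ignoring the confounder."""
    weight = {
        z: p_z[z] * (p_t_given_z[z] if t == 1 else 1 - p_t_given_z[z])
        for z in p_z
    }
    total = sum(weight.values())  # P(T = t)
    return sum(p_y_given_tz[(t, z)] * weight[z] for z in p_z) / total

def standardized_risk(t):
    """Risk under T = t averaged over the population distribution of Z."""
    return sum(p_y_given_tz[(t, z)] * p_z[z] for z in p_z)

crude_rd = crude_risk(1) - crude_risk(0)
adjusted_rd = standardized_risk(1) - standardized_risk(0)

print(f"crude risk difference    = {crude_rd:.2f}")
print(f"adjusted risk difference = {adjusted_rd:.2f}")
```

The crude comparison shows an association even though the treatment does nothing, because treated units are more likely to have Z = 1. Adjusting for Z removes it, and more data would not: the crude figure is biased, not noisy.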

Statistical Significance versus Unbiased Estimates. Diagram labels: Volume, Statistical Significance; Variety, Unbiased Estimates; Velocity, Timely Estimates; Causality.

Causal Inference

Conclusions: Students need to learn about the reasons for using hypothesis testing in today's Big Data environment. Students need to learn to judge practical significance versus statistical significance. Students need to learn about DAGs. Students need to learn about methods to establish causation.