How to Effectively Communicate Commonly Misinterpreted Statistical Terms in Experimentation. Amazon IPC Lab.

Presentation transcript:

Culture of experimentation at Amazon
“To invent you have to experiment, and if you know in advance that it's going to work, it's not an experiment. Most large organizations embrace the idea of invention, but are not willing to suffer the string of failed experiments necessary to get there.” – Jeff Bezos

Amazon’s Supply Chain
Treatment effect:

Experimentation in Supply Chain
Randomized controlled trials in production
Customers:
  Teams in Supply Chain Optimization Technologies (SCOT) and Retail organizations within Amazon
  Software developers, product managers, research scientists, senior leaders
A/B testing?

Measuring treatment effects
Treatment effect: impact of a new idea, or changes in supply chain.
Method: OLS regression
What do we provide to the customers?
  Treatment effect estimates
  95% confidence intervals
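
As an illustration of the method named on this slide, here is a minimal sketch of an OLS regression of an outcome on a 0/1 treatment indicator, returning a point estimate and a 95% confidence interval. The function name `ols_treatment_effect`, the simulated data, and the normal-approximation interval (z = 1.96, classical homoskedastic standard errors) are assumptions made for this example; the slides do not describe the actual model specification.

```python
import math
import random

def ols_treatment_effect(y, t, z=1.96):
    """OLS of y on an intercept and a 0/1 treatment indicator t.

    Returns (estimate, ci_low, ci_high). The interval uses a normal
    approximation (z = 1.96), which assumes a reasonably large sample.
    """
    n = len(y)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx                    # treatment effect estimate
    intercept = y_bar - slope * t_bar    # control-group mean
    resid = [yi - intercept - slope * ti for ti, yi in zip(t, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    se = math.sqrt(sigma2 / sxx)         # classical (homoskedastic) SE
    return slope, slope - z * se, slope + z * se

# Simulated data (hypothetical): true effect 0.5, noise SD 2.
rng = random.Random(0)
t = [rng.randint(0, 1) for _ in range(2000)]
y = [10.0 + 0.5 * ti + rng.gauss(0, 2) for ti in t]

est, lo, hi = ols_treatment_effect(y, t)
print(f"estimate={est:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

With a single binary regressor the OLS slope equals the difference between the treatment and control means, which is why reporting the interval alongside the point estimate matters: the interval is the only part of the output that conveys how noisy that difference is.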

Challenges
Lack of intuitive explanation of statistical theories for customers
Customers’ tendency to interpret results based on point estimates, without considering uncertainty measures
“The tests themselves give no final verdict, but as tools help the worker who is using them to form his final decision.” - Neyman and Pearson

Commonly misinterpreted terms
Correlation ≠ causation
  Spurious correlations
  Concept of unobserved factors or confounders
Absence of evidence ≠ evidence of absence
Goal of the experiment ≠ achieve significance
Interim results
  Peeking
  Influence of early adopters on results
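
The "peeking" item can be made concrete with a small simulation. The sketch below runs A/A experiments (no true effect) and compares the false positive rate of a single final test against the rate when the same test is checked at several interim points and the experiment is called "significant" the first time it crosses the threshold. All names, sample sizes, and the number of looks are assumptions chosen for illustration, not values from the presentation.

```python
import math
import random

def z_stat(a, b):
    """Two-sample z statistic for a difference in means (unit variance known)."""
    n_a, n_b = len(a), len(b)
    diff = sum(b) / n_b - sum(a) / n_a
    return diff / math.sqrt(1 / n_a + 1 / n_b)

def false_positive_rates(n_sims=1000, n_per_arm=200, looks=5,
                         z_crit=1.96, seed=1):
    """Return (rate with one final look, rate with repeated peeking)."""
    rng = random.Random(seed)
    checkpoints = [n_per_arm * k // looks for k in range(1, looks + 1)]
    final_only = 0
    peeking = 0
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        b = [rng.gauss(0, 1) for _ in range(n_per_arm)]  # no true effect
        hits = [abs(z_stat(a[:n], b[:n])) > z_crit for n in checkpoints]
        final_only += hits[-1]   # test only once, at the planned end
        peeking += any(hits)     # declare a win at the first significant look
    return final_only / n_sims, peeking / n_sims

final_rate, peeking_rate = false_positive_rates()
print(f"final look only: {final_rate:.3f}, with peeking: {peeking_rate:.3f}")
```

The final-look rate should sit near the nominal 5%, while the peeking rate is noticeably higher, because every extra look is another chance for noise to cross the threshold.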

Commonly misinterpreted terms
Interpretation of uncertainty
  Power and minimum detectable effect
  p-value
  95% confidence intervals
Difference “on-average”
  Estimate = expected value of unknown population parameter
  Only one of the potential outcomes is observed for each individual
Simpson’s paradox
  The observed effect for all subjects as a single group points in the opposite direction from the effect observed separately within each group
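
A small numeric sketch of Simpson's paradox may help when explaining it to non-statisticians. The counts below are made up for illustration (they are not Amazon data), and the segment and arm names are hypothetical. Treatment has the higher success rate inside each segment, yet the lower rate once the segments are pooled, because the two arms are spread very unevenly across segments.

```python
# Hypothetical counts: (successes, trials) for each arm within each segment.
data = {
    "segment_1": {"control": (234, 270), "treatment": (81, 87)},
    "segment_2": {"control": (55, 80), "treatment": (192, 263)},
}

def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials

def pooled_rate(arm):
    """Success rate for one arm after merging all segments."""
    successes = sum(data[seg][arm][0] for seg in data)
    trials = sum(data[seg][arm][1] for seg in data)
    return rate(successes, trials)

for seg, arms in data.items():
    c = rate(*arms["control"])
    t = rate(*arms["treatment"])
    print(f"{seg}: control={c:.3f} treatment={t:.3f}")

print(f"pooled: control={pooled_rate('control'):.3f} "
      f"treatment={pooled_rate('treatment'):.3f}")
```

In this example most treatment units fall in the segment with the lower baseline rate, which drags the pooled treatment rate down even though treatment wins within each segment.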

Ongoing work for effective communication
Technical FAQ page
Visualization and dashboards
Bayesian probabilistic statements
Introduce Type S (sign) and Type M (magnitude) errors in results
Training and awareness
Certification and bar-raisers
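
Type S and Type M errors can be estimated by simulation. The sketch below assumes a true effect and a standard error, draws many hypothetical estimates, keeps only the statistically significant ones, and then asks how often they have the wrong sign (Type S) and by what factor they overstate the true magnitude on average (Type M, the exaggeration ratio). The function name `type_s_m` and the input values are assumptions for illustration; the slides do not say how these quantities would be computed or reported.

```python
import random

def type_s_m(true_effect, se, z_crit=1.96, n_sims=50_000, seed=0):
    """Simulate (power, Type S rate, Type M exaggeration ratio).

    Estimates are drawn from Normal(true_effect, se); an estimate is
    'significant' when |estimate| > z_crit * se.
    """
    rng = random.Random(seed)
    draws = (rng.gauss(true_effect, se) for _ in range(n_sims))
    significant = [e for e in draws if abs(e) > z_crit * se]
    power = len(significant) / n_sims
    # Type S: share of significant estimates with the wrong sign.
    type_s = sum(1 for e in significant if e * true_effect < 0) / len(significant)
    # Type M: average |significant estimate| relative to the true magnitude.
    type_m = (sum(abs(e) for e in significant) / len(significant)) / abs(true_effect)
    return power, type_s, type_m

# Hypothetical low-power setting: the true effect equals one standard error.
power, type_s, type_m = type_s_m(true_effect=0.1, se=0.1)
print(f"power={power:.3f}, type S={type_s:.4f}, type M={type_m:.2f}")
```

In a low-power setting like this one, any estimate that reaches significance must be at least 1.96 standard errors from zero, so it necessarily overstates a true effect of one standard error. That is the practical message behind reporting Type M alongside a p-value.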

Questions? Thank you!