Simulation: Sensitivity, Bootstrap, and Power

Simulation: Sensitivity, Bootstrap, and Power Nathaniel MacNell EPID 799C Fall 2017

Overview
- Simulations: Why?
- Simulations: How?
- Simulations: In Practice
- Bootstrap (confidence intervals)
- Sensitivity Analysis
- Power Calculations

[Statistical] Simulations: Why?
- We study populations and characterize their attributes using (probability) distributions.
- We use the concept of randomness to stand in for lack of information about a population.
- We also use randomness as a tool for causal inference (in the model of the randomized experiment).
- We can use random sampling to simulate a variety of (parametric) statistical processes.

Simulations: How?
A 4-step process:
1. Characterize.
2. (Re)sample.
3. Calculate statistic.
4. Summarize.

1. Characterize
- Determine the distribution(s) of interest. In R, each distribution is represented as a vector.
- An empirical distribution (for example, the values of height for each person in a research study) is just the data itself.
- A parametric distribution (for example, a normal distribution of height with mean 68 inches and standard deviation 4 inches) can be constructed from statistics or created a priori.
- You can build associations into the data.
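As a rough illustration of the characterize step, the sketch below builds one empirical and one parametric distribution as R vectors. The observed heights, the weight variable, and the seed are made-up values for illustration only; only the mean of 68 and standard deviation of 4 come from the slide.

```r
set.seed(799)  # arbitrary seed, used only so the sketch is reproducible

# Empirical distribution: the observed data themselves (hypothetical heights, inches)
height_obs <- c(61, 64, 66, 67, 68, 68, 70, 71, 73, 75)

# Parametric distribution: normal with mean 68 and SD 4
height_par <- rnorm(1000, mean = 68, sd = 4)

# Building in an association: a hypothetical weight that rises with height, plus noise
weight_par <- 150 + 5 * (height_par - 68) + rnorm(1000, mean = 0, sd = 15)

# The correlation should be clearly positive because of the built-in slope
cor(height_par, weight_par)
```

The slope and noise level control how strong the built-in association is, which makes this a convenient place to encode assumptions about the population.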

2. (Re)sample
- Use the sample() function to draw a subset from the empirical distribution at random.
- Alternatively, use built-in functions like rnorm() to sample from a parametric distribution.
- A sample of size 1 simulates a random variable. A sample of size >1 simulates a random sample.
- Most applications require sampling with replacement unless you are interested in a permutation-type problem. [Typically, for large samples there isn't much of a difference.]
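A minimal sketch of the (re)sampling options, assuming a small made-up vector of heights as the empirical distribution:

```r
set.seed(799)  # arbitrary seed for reproducibility

height_obs <- c(61, 64, 66, 67, 68, 68, 70, 71, 73, 75)  # hypothetical data

# Sample of size 1: simulates a single random variable
one_draw <- sample(height_obs, size = 1)

# Resample with replacement, same size as the data (bootstrap-style)
resample <- sample(height_obs, size = length(height_obs), replace = TRUE)

# Permutation: sample() without replacement just shuffles the data
permuted <- sample(height_obs)

# Parametric alternative: draw 10 values from Normal(68, 4)
par_sample <- rnorm(10, mean = 68, sd = 4)
```

Note that sample() defaults to replace = FALSE, so resampling with replacement has to be requested explicitly.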

3. Calculate Statistic
Write code to calculate the statistic of interest. Recall that "statistic" is just a general name for any summary of the data (including multivariate statistics):
- Mean, median, min, or max of a sample.
- Measures of occurrence (risk, odds, ...).
- Measures of association (ratios or differences between other statistics).
- Measures comparing to a baseline or null hypothesis (p-values, confidence intervals, etc.).
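One way to package a statistic is as a small function that takes a (re)sampled dataset and returns a single number. The helper names and the tiny exposure/outcome vectors below are hypothetical, chosen only to show a measure of occurrence and two measures of association.

```r
# Hypothetical helper: risk of the outcome among rows where exposure == level
risk <- function(exposure, outcome, level) {
  mean(outcome[exposure == level])
}

# Hypothetical helper: odds ratio from binary (0/1) exposure and outcome vectors
odds_ratio <- function(exposure, outcome) {
  n11 <- sum(exposure == 1 & outcome == 1)  # exposed cases
  n10 <- sum(exposure == 1 & outcome == 0)  # exposed non-cases
  n01 <- sum(exposure == 0 & outcome == 1)  # unexposed cases
  n00 <- sum(exposure == 0 & outcome == 0)  # unexposed non-cases
  (n11 * n00) / (n10 * n01)
}

# Made-up data: 4 exposed (3 cases), 6 unexposed (2 cases)
exposure <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
outcome  <- c(1, 1, 1, 0, 1, 0, 0, 0, 1, 0)

risk_exposed   <- risk(exposure, outcome, 1)   # 3/4
risk_unexposed <- risk(exposure, outcome, 0)   # 2/6
risk_diff      <- risk_exposed - risk_unexposed
or_hat         <- odds_ratio(exposure, outcome)  # (3 * 4) / (1 * 2) = 6
```

Writing the statistic as a function makes it easy to apply the same calculation to every resample in the next step.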

4. Summarize
We now need to calculate statistics for the statistic of interest. In other words, we want to characterize the distribution of the (resampled) statistic:
- Mean of the sample means.
- Standard deviation of the sample mean.
- Mean of the odds ratio.
- Confidence interval for the odds ratio.
- Proportion of the distribution above a threshold (e.g. power, significance).
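The summarize step can be sketched by repeating steps 2 and 3 many times with replicate() and then describing the resulting vector. The sample size of 25, the 2000 repetitions, and the threshold of 69 are arbitrary choices for illustration.

```r
set.seed(799)  # arbitrary seed for reproducibility

# Repeat: draw 25 heights from Normal(68, 4), then take the sample mean
sample_means <- replicate(2000, mean(rnorm(25, mean = 68, sd = 4)))

mean_of_means <- mean(sample_means)  # should sit near 68
se_of_mean    <- sd(sample_means)    # should sit near 4 / sqrt(25) = 0.8

# Middle 95% of the simulated sample means
interval <- quantile(sample_means, c(0.025, 0.975))

# Proportion of simulated sample means above an arbitrary threshold
prop_above <- mean(sample_means > 69)
```

The standard deviation of the simulated statistic is an empirical standard error, which can be checked here against the textbook value because the statistic is a simple mean.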

Example 1: Bootstrap
We can use resampling to estimate univariate statistics; this is particularly useful when the calculation is difficult or not straightforward.
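A minimal sketch of a bootstrap for the median, a statistic whose standard error has no simple formula. The data are made up, and the percentile interval shown is only one of several bootstrap interval methods.

```r
set.seed(799)  # arbitrary seed for reproducibility

height_obs <- c(61, 64, 66, 67, 68, 68, 70, 71, 73, 75)  # hypothetical data

# Resample the data with replacement and recompute the median each time
boot_medians <- replicate(2000, median(sample(height_obs, replace = TRUE)))

boot_se <- sd(boot_medians)                         # bootstrap standard error
boot_ci <- quantile(boot_medians, c(0.025, 0.975))  # percentile 95% interval
```

With only 10 observations the bootstrap distribution is coarse, so this is best read as a demonstration of the mechanics rather than a recommended analysis.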

Example 2: Sensitivity Analysis
We can use edited copies of our dataset, consistent with different assumptions (or typically, violations of standard assumptions), to assess the degree to which our results are affected by those assumptions:
- Measurement error
- Misclassification
- Covariance
- Interference
- Adherence
- Residual confounding
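As one possible illustration, the sketch below takes the misclassification case: it simulates a dataset with a known odds ratio, makes edited copies in which the exposure is misclassified, and re-estimates the odds ratio in each copy. The sample size, prevalence, true odds ratio of 2, and the sensitivity/specificity scenarios are all assumptions made up for this example, and the helper functions are hypothetical.

```r
set.seed(799)  # arbitrary seed for reproducibility

# Simulated "true" data: binary exposure, outcome with a true odds ratio of 2
n <- 5000
exposure <- rbinom(n, 1, 0.4)
outcome  <- rbinom(n, 1, plogis(-1 + log(2) * exposure))

# Hypothetical helper: odds ratio estimated by logistic regression
or_hat <- function(x, y) {
  fit <- glm(y ~ x, family = binomial)
  unname(exp(coef(fit)[2]))
}

# Hypothetical helper: non-differential misclassification of a binary variable
misclassify <- function(x, sens, spec) {
  ifelse(x == 1,
         rbinom(length(x), 1, sens),       # truly exposed: recorded exposed w.p. sens
         rbinom(length(x), 1, 1 - spec))   # truly unexposed: recorded exposed w.p. 1 - spec
}

# Scenarios: perfect measurement, then increasingly poor measurement
scenarios <- data.frame(sens = c(1, 0.9, 0.8), spec = c(1, 0.9, 0.8))
scenarios$or <- mapply(
  function(se, sp) or_hat(misclassify(exposure, se, sp), outcome),
  scenarios$sens, scenarios$spec
)
scenarios
```

Under these assumptions the estimated odds ratio should drift toward 1 as measurement gets worse, which is the kind of pattern a sensitivity analysis is meant to expose.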

Example 3: Power
- We can use parametric distributions to estimate the probability of rejecting the null hypothesis, or to characterize the expected confidence intervals, under a specific set of assumptions.
- Useful for complex designs, i.e. essentially all study designs you will work on (few dissertations have the luxury of being randomized controlled trials).
- (As in any power analysis) the outputs from your simulation are only as good as the assumptions you have made and how realistic they are.
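A minimal sketch of a simulated power calculation, assuming a deliberately simple design: two groups of normally distributed heights compared with a t-test. The function name sim_power, the group size of 50, the difference of 2 inches, and the 2000 repetitions are illustrative assumptions; a real study would substitute its own data-generating model and analysis.

```r
set.seed(799)  # arbitrary seed for reproducibility

# Hypothetical helper: proportion of simulated studies that reject the null
sim_power <- function(n_per_group, diff, sigma = 4, alpha = 0.05, n_sims = 2000) {
  p_values <- replicate(n_sims, {
    control <- rnorm(n_per_group, mean = 68, sd = sigma)
    treated <- rnorm(n_per_group, mean = 68 + diff, sd = sigma)
    t.test(treated, control)$p.value
  })
  mean(p_values < alpha)
}

# Estimated power to detect a 2-inch difference with 50 people per group
power_hat <- sim_power(n_per_group = 50, diff = 2)

# Sanity check: with no true difference, the rejection rate should be near alpha
type1_hat <- sim_power(n_per_group = 50, diff = 0)
```

For a design this simple the result can be compared against power.t.test(); the simulation approach earns its keep when the design is too complex for a closed-form calculation.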

Lab: Practice Simulations