Lynn Lethbridge, SHRUG, November 2010.

Presentation transcript:

Lynn Lethbridge SHRUG November, 2010

What is Bootstrapping? Bootstrapping is a method to estimate a statistic’s sampling distribution. Bootstrap samples are drawn repeatedly, with replacement, from the original data. From each new sample, the statistic is recalculated and saved in a dataset (i.e., 200 bootstraps give 200 statistics). The standard error of the statistic is calculated as the standard deviation of the bootstrap statistics. Bootstrapping is not used for the point estimate.
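The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's program: the function name `bootstrap_se`, the `incomes` values, and the choice of the median as the statistic are all made up for the example.

```python
import random
import statistics

def bootstrap_se(data, statistic, n_boot=200, seed=0):
    """Return (standard error, list of bootstrap statistics) for `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original data.
        sample = rng.choices(data, k=n)
        # Recalculate the statistic on the new sample and save it.
        replicates.append(statistic(sample))
    # The standard error is the standard deviation of the saved statistics.
    return statistics.stdev(replicates), replicates

# Hypothetical data, for illustration only.
incomes = [12, 15, 18, 21, 22, 25, 30, 34, 41, 58]

# The point estimate comes from the original data, not from the bootstrap.
point_estimate = statistics.median(incomes)
se, reps = bootstrap_se(incomes, statistics.median)
```

With `n_boot=200`, `reps` holds 200 medians, one per bootstrap sample, and `se` is their standard deviation.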

When to Use Bootstrapping. (1) The statistic’s distribution has no clear analytical solution, e.g., the Gini coefficient or poverty intensity. (2) To test for sensitivity. (3) The data come from a complex survey design (not a simple random sample); e.g., Statistics Canada surveys use a stratified, multistage design in which households are selected within clusters within strata. Observations will not be independent, so variance calculated the usual way will typically be underestimated.
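As an illustration of the first case, here is a rough sketch of a bootstrap standard error for the Gini coefficient. The `gini` helper uses the standard mean-absolute-difference formula; the simulated `incomes` are an assumption for the example. Note that this naive resampling treats observations as independent, so it does not address the complex survey design case.

```python
import random
import statistics

def gini(values):
    """Gini coefficient: mean absolute difference divided by twice the mean."""
    n = len(values)
    mean = sum(values) / n
    total_abs_diff = sum(abs(a - b) for a in values for b in values)
    return total_abs_diff / (2 * n * n * mean)

rng = random.Random(42)
# Hypothetical incomes, simulated for illustration only.
incomes = [rng.lognormvariate(10, 0.8) for _ in range(150)]

# Point estimate from the original sample.
gini_hat = gini(incomes)

# 200 bootstrap samples, each drawn with replacement, give 200 Gini values.
boot = [gini(rng.choices(incomes, k=len(incomes))) for _ in range(200)]

# Standard error: standard deviation of the bootstrap Gini values.
gini_se = statistics.stdev(boot)
```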

Two Programs. The first is ‘traditional’ bootstrapping: re-sampling from the original sample. The second is bootstrapping using Statistics Canada survey data. Statistics Canada does the re-sampling heavy lifting in most of its surveys; the analyst uses the bootstrap weights provided to calculate the standard error.

Program 1. A project in which we examined the effect of trade on ‘poverty intensity’ in Canada and the US. We used state/province-level measures in a regression analysis, and used bootstrapping to measure the robustness of the results given a different mix of policies. Our dataset consists of 61 unique observations of states and provinces. We re-sample to see whether the results would be affected if we had a different make-up of regions.
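The idea behind Program 1 can be sketched as resampling whole regions and re-running the regression each time. The sketch below is not the original program: it uses a one-variable least-squares slope, and the 61 `regions` are simulated with an assumed negative relationship purely for illustration.

```python
import random
import statistics

def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return sxy / sxx

rng = random.Random(2010)
# Hypothetical data: 61 regions with a trade measure (x) and poverty intensity (y).
regions = []
for _ in range(61):
    x = rng.uniform(0, 4)
    y = 2.0 - 0.5 * x + rng.gauss(0, 0.3)
    regions.append((x, y))

# Estimate from the original 61 regions.
slope_hat = ols_slope(regions)

# Resample regions with replacement and re-estimate the slope each time.
boot_slopes = [ols_slope(rng.choices(regions, k=len(regions))) for _ in range(200)]
slope_se = statistics.stdev(boot_slopes)

# Share of bootstrap samples in which the sign of the effect is unchanged.
share_negative = sum(b < 0 for b in boot_slopes) / len(boot_slopes)
```

Resampling whole `(x, y)` pairs keeps each region's values together, so every bootstrap sample represents a different make-up of regions.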

Program 2. A project using the National Longitudinal Survey of Children and Youth (NLSCY). We examined the effect of having a child with disabilities on the health of mothers and fathers, using an ordered probit with Statistics Canada NLSCY bootstrap weights to estimate standard errors.

Weighting. Many survey datasets include sampling weights so that results will represent the population. The mechanics of using bootstrap weights are the same as for sampling weights. Each individual in the survey has a sample weight and all the bootstrap weights. Re-estimate your model or statistic over and over, using a different bootstrap weight each time.
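The mechanics described on this slide might look like the following sketch, which uses a weighted mean as the statistic. Everything here is an assumption for illustration: the variable names are hypothetical, and the `bootstrap_weights` are random stand-ins, whereas real bootstrap weights come with the survey file.

```python
import random
import statistics

def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

rng = random.Random(7)
n_people, n_boot = 50, 200

# Hypothetical survey data: an outcome and a sampling weight per individual.
health = [rng.uniform(1, 5) for _ in range(n_people)]
survey_weight = [rng.uniform(50, 150) for _ in range(n_people)]

# Made-up stand-ins for the bootstrap weights: one full set of weights per replicate.
bootstrap_weights = [
    [w * rng.uniform(0, 2) for w in survey_weight] for _ in range(n_boot)
]

# Point estimate: use the sampling weight.
point_estimate = weighted_mean(health, survey_weight)

# Re-estimate the statistic once per set of bootstrap weights.
replicates = [weighted_mean(health, bw) for bw in bootstrap_weights]

# Standard error: standard deviation of the replicate estimates.
se = statistics.stdev(replicates)
```

Some software centres the squared deviations on the full-sample estimate rather than on the mean of the replicates; the version above follows the slide's description.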

Bootstrap Weight Derivation (diagram): Re-sampling, then “A Miracle Occurs”, then Bootstrap Weights.

Thank you for your attention!