Bootstrap – The Statistician’s Magic Wand


Bootstrap – The Statistician’s Magic Wand Saharon Rosset

An abstract view of statistics There is a “world” (= an unknown distribution) F. We observe some data from the world, say 100 heights (z) and weights (y) of random people. We want to learn about some property of the world F, e.g.: the mean height; the correlation between height and weight; the variance of the empirical correlation between height and weight.

Standard statistical methodology Find a way to estimate the property of F of interest directly from the data: mean height is estimated by the sample average; the correlation between height and weight is estimated by the empirical correlation. But how do we estimate the variance of the correlation? There are formulae under some assumptions, but it gets complicated. Instead, we want a general approach that allows estimating any property of F relatively easily (and, hopefully, well).

The general Bootstrap recipe We are interested in some property of the world, θ = t(F). We create an “alternative world” F̂ (usually using our data), and in it we estimate θ̂ = t(F̂). This is usually done empirically, by drawing data from this world and applying t. The main wisdom lies in how to build the Bootstrap world F̂ so that it is “similar” to F in the ways that matter to us. A secondary problem is how to perform the estimation in the bootstrap world; this is usually straightforward.
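The recipe above can be written as a short generic sketch. This is my own illustration, not code from the talk: the function names (`bootstrap_estimate`, `empirical_world`) are made up, and the “world” here is simply the nonparametric choice of resampling the data with replacement.

```python
import numpy as np

def bootstrap_estimate(data, statistic, draw_from_world, B=1000, seed=None):
    """Generic bootstrap recipe: draw B datasets X* from the bootstrap
    world F-hat, apply the statistic t to each, and return the replicates."""
    rng = np.random.default_rng(seed)
    return np.array([statistic(draw_from_world(data, rng)) for _ in range(B)])

def empirical_world(data, rng):
    """Nonparametric F-hat: resample the observed rows with replacement."""
    n = len(data)
    return data[rng.integers(0, n, size=n)]

# Toy usage: bootstrap standard error of an average of 100 heights.
rng = np.random.default_rng(0)
heights = rng.normal(175, 10, size=100)
reps = bootstrap_estimate(heights, np.mean, empirical_world, B=2000, seed=1)
se_of_mean = reps.std(ddof=1)   # should be near 10 / sqrt(100) = 1
```

Swapping in a different `draw_from_world` function is exactly the design choice the slides call “building the Bootstrap world”.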

Graphical representation Real world: distribution F → data X → statistic s(X), with target θ = t(F). Bootstrap world: distribution F̂ → data X* → statistic s(X*), with estimate θ̂ = t(F̂). The step connecting the two worlds is “Determine F̂” from the real-world data.

Is Bootstrap important in practice?

Example: variance of the empirical correlation F is the bivariate distribution of z = height and y = weight; we are given data X with 100 pairs (z, y). The statistic of interest is s(X) = cor(z, y). The property of F we are interested in is θ = var_F(s). The Bootstrap approach: build the bootstrap world F̂; repeatedly draw “bootstrap samples” X* from F̂; compute s(X*) from each sample; use these estimates to empirically compute θ̂ = var_F̂(s(X*)) in the bootstrap world. This is your estimate of θ in the real world.

How to build F̂? The “double arrow” (determining F̂ from the data) is the key to designing a bootstrap algorithm. The most standard approach is to use the empirical distribution of the data: drawing X* means drawing 100 pairs (z*, y*) with replacement from the original dataset. This is commonly referred to as “bootstrap sampling” or the “nonparametric bootstrap”. But it is not the only approach, and often not the best one!

Parametric Bootstrap example Assume we “know” that F (the joint distribution of height and weight) is bivariate normal. Then it makes sense to make F̂ bivariate normal, with parameters estimated from the data X. We can then repeat exactly the same stages: drawing X*, and estimating the variance empirically.

Concrete example Let’s choose F of height and weight to be bivariate normal: (Z, Y) ~ N((175, 75), Σ) with covariance matrix Σ = [[100, 50], [50, 50]]. We start by drawing 10⁵ random samples of size 100 and observing the distribution of s(X) = cor(z, y); in particular we get θ = var_F(s) = 0.00261. Now we want to try different Bootstrap approaches for estimating θ.
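This Monte Carlo “ground truth” is easy to reproduce. A sketch (using 2×10⁴ replications rather than the slides’ 10⁵, for speed):

```python
import numpy as np

rng = np.random.default_rng(0)
mean = [175, 75]
cov = [[100, 50], [50, 50]]

# Draw many independent samples of size 100 straight from the true world F
# and record the empirical correlation of each sample.
n_reps, n = 20_000, 100
X = rng.multivariate_normal(mean, cov, size=(n_reps, n))   # shape (reps, n, 2)
z, y = X[..., 0], X[..., 1]
zc = z - z.mean(axis=1, keepdims=True)
yc = y - y.mean(axis=1, keepdims=True)
corrs = (zc * yc).sum(axis=1) / np.sqrt(
    (zc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))

true_var = corrs.var(ddof=1)   # approximates theta = var_F(s), about 0.0026
```

The population correlation here is 50/√(100·50) ≈ 0.707, and the classical approximation (1 − ρ²)²/n ≈ 0.0025 agrees with the 0.00261 reported on the slide.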

Approach 1: standard nonparametric Bootstrap Define F̂ to be the empirical distribution of X. Then: sample many X* (bootstrap samples); calculate s*(X*) = cor(z*, y*) for each X*; estimate the variance of s by the empirical variance of the s* values. In a simulation we can repeat this whole exercise many times to get a distribution of Bootstrap estimates.
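A minimal sketch of Approach 1, assuming one observed dataset of 100 pairs drawn from the bivariate normal above (the seed and B = 2000 are my choices):

```python
import numpy as np

rng = np.random.default_rng(0)
# One observed dataset X: 100 (height, weight) pairs from the true world F.
X = rng.multivariate_normal([175, 75], [[100, 50], [50, 50]], size=100)

# F-hat = empirical distribution of X: a bootstrap sample X* redraws
# 100 rows of X with replacement.
B = 2000
boot_corrs = np.empty(B)
for b in range(B):
    Xb = X[rng.integers(0, 100, size=100)]
    boot_corrs[b] = np.corrcoef(Xb[:, 0], Xb[:, 1])[0, 1]

# Bootstrap estimate of theta = var_F(cor), to be compared with ~0.0026.
theta_hat = boot_corrs.var(ddof=1)
```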

Approach 2: parametric Bootstrap using the normal distribution Use X to estimate the mean and covariance of F, assuming it is normal, and define F̂ to be this fitted normal distribution. The rest proceeds as before: sample many X* from this bivariate normal distribution (parametric bootstrap samples); calculate s*(X*) = cor(z*, y*) for each X*; estimate the variance of s by the empirical variance of the s* values.
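The parametric variant differs from Approach 1 only in how X* is drawn. A sketch under the same made-up observed dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([175, 75], [[100, 50], [50, 50]], size=100)

# F-hat = bivariate normal with parameters estimated from X.
mu_hat = X.mean(axis=0)
cov_hat = np.cov(X, rowvar=False)

B = 2000
boot_corrs = np.empty(B)
for b in range(B):
    # Parametric bootstrap sample: draw fresh data from the fitted normal.
    Xb = rng.multivariate_normal(mu_hat, cov_hat, size=100)
    boot_corrs[b] = np.corrcoef(Xb[:, 0], Xb[:, 1])[0, 1]

theta_hat_param = boot_corrs.var(ddof=1)
```

Only the sampling line changes; the rest of the recipe is identical, which is exactly the modularity the general recipe promises.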

Which one will be better here?

Does Bootstrap always work? Of course not! From what we already know, it is clear that if we fail to build F̂ so that θ̂ = t(F̂) is “similar” to θ = t(F), then our approach is useless. This can be a result of wrong assumptions about F used in building F̂. One can easily devise examples where no Bootstrap approach will give reasonable results. Still, the usefulness of a properly implemented Bootstrap is very general and applies to almost any “reasonable” problem we encounter.

Hypothesis testing with Bootstrap Recall the components of a hypothesis testing problem: a null hypothesis H₀: θ = θ₀; a test statistic z = s(X); performing a test entails calculating quantities like p-value = P_{H₀}(s(X) > z) and rejecting if it is small. The p-value for a given z is also a property of F, but how can we use the Bootstrap to estimate it? If H₀ uniquely defines the distribution, then it is trivial: a standard simulation exercise. But if H₀ contains many possible F’s, we can implement the bootstrap paradigm: choose F̂ as a member of H₀ that is “consistent with our data”, and calculate the p-value under this distribution.
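One standard way to build an F̂ that both satisfies H₀ and stays close to the data (not from the talk; the data, μ₀, and the choice of statistic here are made up for illustration) is to test H₀: mean = μ₀ by translating the sample so its mean is exactly μ₀, then resampling from the shifted sample:

```python
import numpy as np

rng = np.random.default_rng(0)
heights = rng.normal(177, 10, size=100)   # toy data; suppose H0: mean = 175

mu0 = 175.0
z_obs = heights.mean() - mu0              # observed test statistic

# F-hat: a member of H0 consistent with the data -- shift the sample so
# its mean is exactly mu0, then bootstrap-resample from the shifted data.
shifted = heights - heights.mean() + mu0

B = 5000
stats = np.empty(B)
for b in range(B):
    Xb = shifted[rng.integers(0, 100, size=100)]
    stats[b] = Xb.mean() - mu0

# Two-sided bootstrap p-value: how often the H0-world statistic is at
# least as extreme as the one we observed.
p_value = np.mean(np.abs(stats) >= abs(z_obs))
```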

Inference on phylogenetic trees Dataset of malaria genetic sequences from different organisms (11 species, sequences of length 221), together with the result of applying a standard phylogenetic tree learning approach. Our inference goal: assess confidence in the 9-10 clade (subtree) – is it strongly supported by the data?

Felsenstein’s Bootstrap of phylogenetic trees Given this phylogenetic tree built on this dataset, Felsenstein wanted to answer questions like: how certain am I that subtree T₀ (say, the 9-10 clade) is “real” (i.e., exists in the world and not just in my data)? He suggested using the Bootstrap as follows: draw bootstrap samples of markers; build a tree on each sample (all species, sampled markers); use the percentage of time we get the subtree T₀ as the “confidence” in this subtree.
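The key move in Felsenstein's recipe is resampling *columns* (sites) of the alignment rather than rows. The toy sketch below is entirely mine: the random binary "alignment" is fabricated so that species 9 and 10 are similar, and `closest_pair` is a crude stand-in for a real tree-building program (it just reports the pair with the smallest Hamming distance, i.e. the first clade a distance-based method would form).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the malaria data: 11 species x 221 binary sites,
# rigged so species 9 and 10 agree on ~90% of sites.
n_species, n_sites = 11, 221
alignment = rng.integers(0, 2, size=(n_species, n_sites))
alignment[10] = alignment[9]
flip = rng.random(n_sites) < 0.1
alignment[10, flip] ^= 1

def closest_pair(aln):
    """Stand-in for tree building: return the pair of species with the
    smallest Hamming distance."""
    d = (aln[:, None, :] != aln[None, :, :]).sum(axis=2)
    np.fill_diagonal(d, aln.shape[1] + 1)
    i, j = divmod(int(d.argmin()), len(aln))
    return frozenset((i, j))

# Felsenstein's recipe: resample columns with replacement, rebuild on each
# replicate, count how often the clade of interest appears.
B = 500
hits = 0
for _ in range(B):
    cols = rng.integers(0, n_sites, size=n_sites)
    hits += closest_pair(alignment[:, cols]) == frozenset((9, 10))

support = hits / B   # Felsenstein-style "confidence" in the 9-10 clade
```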

Is this Bootstrap legit? We want to know whether the subtree exists in F, so we estimate this by the percentage of time it exists in data drawn from F̂. This is not exactly a Bootstrap recipe (the details are not critical). But assuming it is a Bootstrap approach, is it a good one? Not at all, because F̂ was built based on the sample whose “best tree” contains the subtree. This basically means that F̂ contains the subtree, so we know we are getting over-optimistic results. A more correct formulation of this question is as a hypothesis test of H₀: the tree does not contain T₀. If we reject H₀, we can conclude that T₀ is reliable.

Efron’s solution(s) In a beautiful paper, Efron et al. (1996, PNAS) reanalyze this problem and show: that under some (quite convoluted) assumptions, Felsenstein’s approach can be considered a legitimate Bootstrap; and that without any convoluted arguments (but with some complicated math and geometry), an appropriate Bootstrap can be devised for the hypothesis-testing view of the problem.

Efron’s hypothesis testing view First task: build a Bootstrap world F̂ where H₀ holds and F̂ is as similar as possible to the empirical distribution of our data. Then we can test H₀ by examining what percentage of the time T₀ gets selected in this world. If it is smaller than 5%, we reject H₀ at level 0.05 and conclude that T₀ is well supported. The challenge is the first task, and this is what Efron concentrates on.

A peek into Efron’s approach

Comparing the Bootstrap results of Felsenstein and Efron Recall that Felsenstein’s method gave 96.5% “confidence” for the 9-10 clade. Efron is rewarded for his hard work with the result that 93.8% of the trees in his Bootstrap world do not contain the 9-10 clade, so his Bootstrap p-value for H₀ is 0.062. The results are only slightly different, but if we treat 95% confidence / 5% p-value as the holy grail, then we conclude: according to Felsenstein we are “confident” in this clade; according to Efron we cannot reject the possibility that this clade is a coincidence.

Summary Bootstrap is an extremely general and flexible paradigm for statistical inference. It allows us to handle complex situations with minimal assumptions and without complicated math. Doing theory (and also devising solutions for some problems) can get very complicated, though. It has been widely influential in science and industry. However, despite its conceptual simplicity, it is often misunderstood and misapplied (well beyond Felsenstein).

Thanks! saharon@post.tau.ac.il