Estimates of Bias & The Jackknife
"An Introduction to the Bootstrap", Bradley Efron & Robert J. Tibshirani, Chapters 10-11. M.Sc. Seminar in Statistics, TAU, March '17. By Aviv Navon.
Estimates of Bias (Chapter 10)
Intro

Up until now, we concentrated on the standard error as a measure of the accuracy of an estimator $\hat\theta$ (denoted $\widehat{se}_B$). We now focus on the bias of the estimator, which is defined as

$$\mathrm{bias}_F(\hat\theta, \theta) = \mathbb{E}_F[\hat\theta] - t(F).$$
The Bootstrap Estimate of Bias
Consider the nonparametric one-sample situation, where $\hat\theta = s(X)$. We have

$$\mathrm{bias}_F(\hat\theta, \theta) = \mathbb{E}_F[s(X)] - t(F).$$

We define the bootstrap estimate of bias as

$$\widehat{\mathrm{bias}}_{\hat F} = \mathbb{E}_{\hat F}[s(X^*)] - t(\hat F).$$

Note that $\widehat{\mathrm{bias}}_{\hat F}$ is the plug-in estimate of $\mathrm{bias}_F$. In general we will need to approximate $\widehat{\mathrm{bias}}_{\hat F}$ using simulation:

1. Generate $B$ bootstrap samples $X^{*1},\dots,X^{*B}$.
2. Approximate $\mathbb{E}_{\hat F}[s(X^*)]$ by $\hat\theta^*(\cdot) = \sum_{b=1}^B \hat\theta^*(b)/B = \sum_{b=1}^B s(X^{*b})/B$.
3. Estimate the bias by $\widehat{\mathrm{bias}}_B = \hat\theta^*(\cdot) - t(\hat F)$.
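The three simulation steps above can be sketched in a few lines. This is a minimal illustration, not code from the book; the function name `bootstrap_bias` and the variance example are choices made here.

```python
import numpy as np

def bootstrap_bias(x, stat, B=400, seed=None):
    """Monte Carlo approximation of the bootstrap bias estimate:
    bias_B = (1/B) * sum_b s(X*b) - t(F_hat), with t(F_hat) = stat(x)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # Steps 1-2: draw B bootstrap samples and average the replications
    reps = np.array([stat(x[rng.integers(0, n, n)]) for _ in range(B)])
    # Step 3: subtract the plug-in estimate
    return reps.mean() - stat(x)

# Example: the plug-in variance (divisor n) has ideal bootstrap bias
# exactly -var(x)/n, which the simulation should approximate.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
print(bootstrap_bias(x, np.var, B=2000, seed=1))
```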
Example – The Patch Data
Eight subjects wore medical patches designed to infuse a certain hormone into the bloodstream. Each subject had his blood level of the hormone measured after wearing three different patches: old, new, and placebo. We are interested in the parameter

$$\theta = \frac{\mathbb{E}(\text{new}) - \mathbb{E}(\text{old})}{\mathbb{E}(\text{old}) - \mathbb{E}(\text{placebo})}.$$

Denote $z = \text{old} - \text{placebo}$, $y = \text{new} - \text{old}$, and $x_i = (z_i, y_i)$, where the $x_i$ are obtained by random sampling from an unknown distribution $F$. We have

$$\theta = t(F) = \frac{\mathbb{E}_F(y)}{\mathbb{E}_F(z)}.$$
The Patch Data – Cont.

The plug-in estimate of $\theta$ is

$$\hat\theta = t(\hat F) = \frac{\bar y}{\bar z}.$$

We take $s(X) = \hat\theta = \bar y/\bar z$, thus $\hat\theta = -.0713$. Next we generate $B = 400$ bootstrap samples and get $\hat\theta^*(\cdot) = -.0670$, which yields

$$\widehat{\mathrm{bias}}_{400} = \hat\theta^*(\cdot) - t(\hat F) = -.0670 - (-.0713) = .0043.$$
The Patch Data – Cont.

We know that $B = 400$ is usually more than enough to obtain a good estimate of the standard error. Is it enough to obtain a good estimate of the bias? As it turns out, in this particular case, the answer is no.

* In general, biases are harder to estimate than standard errors.
An Improved Estimate of Bias
It turns out that there is a better method to approximate $\widehat{\mathrm{bias}}_\infty = \widehat{\mathrm{bias}}_{\hat F}$. The better method applies when $\hat\theta = t(\hat F)$ is the plug-in estimate of $t(F)$.

Notation

For a given bootstrap sample $X^* = (x_1^*,\dots,x_n^*)$, we define the resampling vector as $P^* = (p_1^*,\dots,p_n^*)$, where $p_j^* = \#\{x_i^* = x_j\}/n$. Denote $P^0 = (1/n,\dots,1/n)$. For $\hat\theta = t(\hat F)$, denote $\hat\theta^* = T(P^*)$ (note that $T(P^0) = \hat\theta = t(\hat F)$). The bootstrap samples $X^{*1},\dots,X^{*B}$ correspond to resampling vectors $P^{*1},\dots,P^{*B}$. Define $\bar P^*$ to be the average of these vectors.
An Improved Estimate of Bias – Cont.
We can write the bootstrap bias estimate as

$$\widehat{\mathrm{bias}}_B = \hat\theta^*(\cdot) - T(P^0).$$

The better bootstrap bias estimate, denoted $\overline{\mathrm{bias}}_B$, is

$$\overline{\mathrm{bias}}_B = \hat\theta^*(\cdot) - T(\bar P^*).$$

Both $\widehat{\mathrm{bias}}_B$ and $\overline{\mathrm{bias}}_B$ converge to $\widehat{\mathrm{bias}}_\infty = \widehat{\mathrm{bias}}_{\hat F}$, the ideal bootstrap estimate of bias, as $B \to \infty$, but the convergence is much faster for $\overline{\mathrm{bias}}_B$.
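Both estimates can be sketched in terms of resampling vectors (the helper name `both_bias_estimates` and the data below are illustrative, not from the book). A useful sanity check: for a statistic that is linear in $P$, such as the mean, $\hat\theta^*(\cdot) = T(\bar P^*)$ exactly, so $\overline{\mathrm{bias}}_B = 0$ with no Monte Carlo noise at all.

```python
import numpy as np

def both_bias_estimates(T, n, B=400, seed=None):
    """Return (bias_B, bias_bar_B) for a statistic expressed as T(P),
    where P is a resampling vector.  Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    reps = np.empty(B)
    p_bar = np.zeros(n)
    for b in range(B):
        # Resampling vector P*b: proportion of each original point drawn
        p = np.bincount(rng.integers(0, n, n), minlength=n) / n
        reps[b] = T(p)
        p_bar += p / B
    p0 = np.full(n, 1 / n)
    return reps.mean() - T(p0), reps.mean() - T(p_bar)

# Ratio statistic as in the patch example: T(P) = sum p_j y_j / sum p_j z_j
# (hypothetical data, for illustration only)
y = np.array([-1.2, 2.6, -2.7, 2.0, -1.3, 0.4, -0.6, -2.7])
z = np.array([8.4, 2.3, 8.2, 8.5, 4.8, 3.5, 4.8, 10.2])
T = lambda p: (p @ y) / (p @ z)
print(both_bias_estimates(T, len(y), B=400, seed=0))
```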
Back to the Patch Data Example
Note that for the patch example,

$$\hat\theta^* = T(P^*) = \frac{\sum_j p_j^* y_j}{\sum_j p_j^* z_j}.$$

Using the 400 bootstrap samples from before, we have $T(\bar P^*) = \sum_j \bar p_j^* y_j / \sum_j \bar p_j^* z_j = -.0750$, and

$$\overline{\mathrm{bias}}_{400} = -.0670 - (-.0750) = .0080,$$

compared to $\widehat{\mathrm{bias}}_{400} = .0043$.

[Figure: $\widehat{\mathrm{bias}}_\infty$, approximated by $.0079$, is the dashed horizontal line; $\widehat{\mathrm{bias}}_B$ is the solid line and $\overline{\mathrm{bias}}_B$ the dashed line.]
Bias Correction

If $\widehat{\mathrm{bias}}$ is an estimate of $\mathrm{bias}_F(\hat\theta, \theta)$, then the obvious bias-corrected estimator is

$$\bar\theta = \hat\theta - \widehat{\mathrm{bias}}.$$

Taking $\widehat{\mathrm{bias}}$ equal to $\widehat{\mathrm{bias}}_B = \hat\theta^*(\cdot) - \hat\theta$ gives

$$\bar\theta = 2\hat\theta - \hat\theta^*(\cdot).$$

Bias correction can be dangerous to use in practice, due to the high variability in $\widehat{\mathrm{bias}}$: correcting the bias may cause a larger increase in the standard error (and hence a larger MSE).
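As a quick sketch (hypothetical helper name, not the book's code), the corrected estimator $\bar\theta = 2\hat\theta - \hat\theta^*(\cdot)$ is a one-liner on top of the bootstrap replications:

```python
import numpy as np

def bias_corrected(x, stat, B=400, seed=None):
    """theta_bar = theta_hat - bias_B = 2 * theta_hat - theta*(.).
    Caveat from the text: the correction itself is noisy, so this can
    increase the standard error (and the MSE) of the estimator."""
    rng = np.random.default_rng(seed)
    n = len(x)
    reps = np.array([stat(x[rng.integers(0, n, n)]) for _ in range(B)])
    return 2 * stat(x) - reps.mean()

# Example: correcting the plug-in variance should pull it upward,
# since its bootstrap bias is negative.
x = np.random.default_rng(0).normal(size=50)
print(np.var(x), bias_corrected(x, np.var, B=2000, seed=1))
```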
Framework

Real World: $P \to X = (x_1,\dots,x_n)$; parameter $\theta = \theta(P)$; estimate $\hat\theta = s(X)$; bias $\mathrm{bias}_P(\hat\theta, \theta)$.
Bootstrap World: $\hat P \to X^* = (x_1^*,\dots,x_n^*)$; parameter $\hat\theta = \theta(\hat P)$; estimate $\hat\theta^* = s(X^*)$; bias $\mathrm{bias}_{\hat P}(\hat\theta^*, \hat\theta)$.
The Jackknife (Chapter 11)
Intro

The jackknife predates the bootstrap and bears close similarities to it. We will present the jackknife method for estimating bias and standard error.

Notation

Let $x_{(i)} = (x_1, x_2, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$, for $i = 1,\dots,n$, be the $i$th jackknife sample. Let $\hat\theta_{(i)} = s(x_{(i)})$ be the $i$th jackknife replication of $\hat\theta$. Denote $\hat\theta_{(\cdot)} = \sum_{i=1}^n \hat\theta_{(i)}/n$.
Jackknife Estimates for Bias and SE
The jackknife estimate of bias is defined by

$$\widehat{\mathrm{bias}}_{\mathrm{jack}} = (n-1)\,(\hat\theta_{(\cdot)} - \hat\theta).$$

The jackknife estimate of standard error is defined by

$$\widehat{se}_{\mathrm{jack}} = \sqrt{\frac{n-1}{n} \sum_i (\hat\theta_{(i)} - \hat\theta_{(\cdot)})^2}.$$

Where do these formulae come from?
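The two formulae can be implemented directly (a sketch; the function name is chosen here). A useful check: for $\hat\theta = \bar X$ the jackknife bias is exactly zero, and $\widehat{se}_{\mathrm{jack}}$ reduces to the familiar $\hat\sigma/\sqrt n$ with the unbiased $\hat\sigma$.

```python
import numpy as np

def jackknife(x, stat):
    """Jackknife estimates (bias, se) for theta_hat = stat(x)."""
    n = len(x)
    # i-th jackknife replication: the statistic with x_i left out
    reps = np.array([stat(np.delete(x, i)) for i in range(n)])
    theta_dot = reps.mean()
    bias = (n - 1) * (theta_dot - stat(x))
    se = np.sqrt((n - 1) / n * np.sum((reps - theta_dot) ** 2))
    return bias, se

x = np.array([1.0, 2.0, 4.0, 7.0])
print(jackknife(x, np.mean))
```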
Back to the Patch Data

Let us estimate the bias and SE for the patch data and the parameter $\theta = \mathbb{E}(y)/\mathbb{E}(z)$. Recall that $\widehat{\mathrm{bias}}_{400} = .0043$ and $\overline{\mathrm{bias}}_{400} = .0080$; in addition, $\widehat{se}_{200} = .105$. Using the jackknife method, we get

$$\widehat{\mathrm{bias}}_{\mathrm{jack}} = .0080, \qquad \widehat{se}_{\mathrm{jack}} = .106.$$
Jackknife vs. Bootstrap
The jackknife can be viewed as an approximation to the bootstrap. Consider a linear statistic of the form

$$\hat\theta = s(X) = \mu + \frac{1}{n}\sum_i \alpha(x_i),$$

where $\mu$ is a constant and $\alpha(\cdot)$ is a function (example, $\bar X$: $\mu = 0$, $\alpha(x_i) = x_i$). For such a statistic, the jackknife and bootstrap estimates of SE agree, up to a factor of $\sqrt{(n-1)/n}$. For non-linear statistics, the jackknife makes a linear approximation to the bootstrap: that is, it agrees with the bootstrap for a certain linear statistic that approximates $\hat\theta$. Similarly, the jackknife estimate of bias approximates the bootstrap in terms of quadratic statistics of the form

$$\hat\theta = \mu + \frac{1}{n}\sum_{i=1}^n \alpha(x_i) + \frac{1}{n^2}\sum_{1 \le i < j \le n} \beta(x_i, x_j).$$

* More details in Chapter 20.
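For the mean, both estimates are available in closed form, so the factor can be checked exactly: the ideal ($B = \infty$) bootstrap SE of $\bar X$ is the plug-in standard deviation over $\sqrt n$ (a small numerical check, not from the book; the data are arbitrary).

```python
import numpy as np

x = np.array([10.0, 27.0, 31.0, 40.0, 46.0, 50.0, 52.0, 104.0, 146.0])
n = len(x)

# Ideal bootstrap SE of the mean: plug-in (divisor-n) sd over sqrt(n)
se_boot = x.std(ddof=0) / np.sqrt(n)

# Jackknife SE of the mean
reps = np.array([np.delete(x, i).mean() for i in range(n)])
se_jack = np.sqrt((n - 1) / n * np.sum((reps - reps.mean()) ** 2))

# For this linear statistic the two agree up to the factor sqrt((n-1)/n)
print(se_boot / se_jack, np.sqrt((n - 1) / n))
```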
Failure of the Jackknife
The jackknife can fail miserably if the statistic $\hat\theta$ is not "smooth". A simple example is the median. For example, consider the observations 10, 27, 31, 40, 46, 50, 52, 104, 146. We get $\widehat{se}_{\mathrm{jack}} = 6.68$. Using 100 bootstrap samples, we get $\widehat{se}_{100} = 9.58$, considerably larger than the jackknife value. Moreover, it can be shown that as $n \to \infty$, $\widehat{se}_{\mathrm{jack}}$ fails to converge to the true standard error (for $\hat\theta = \mathrm{median}(X)$).
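The failure is easy to see on the nine observations above: the leave-one-out medians take only three distinct values, so the jackknife badly understates the variability. A small check (the bootstrap value depends on the seed and sample count, so only the jackknife number reproduces the slide exactly):

```python
import numpy as np

x = np.array([10.0, 27.0, 31.0, 40.0, 46.0, 50.0, 52.0, 104.0, 146.0])
n = len(x)

# Leave-one-out medians: only three distinct values (43, 45, 48)
reps = np.array([np.median(np.delete(x, i)) for i in range(n)])
se_jack = np.sqrt((n - 1) / n * np.sum((reps - reps.mean()) ** 2))
print(sorted(set(reps)), round(se_jack, 2))  # [43.0, 45.0, 48.0] 6.68

# Bootstrap SE for comparison: considerably larger
rng = np.random.default_rng(0)
boot = np.array([np.median(x[rng.integers(0, n, n)]) for _ in range(1000)])
print(round(boot.std(ddof=1), 2))
```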
The Delete-d Jackknife
Instead of leaving out one observation at a time, we leave out $d$ observations, where $n = r \cdot d$ for some integer $r$. There are therefore $\binom{n}{d}$ jackknife samples (which can get very large). The delete-$d$ jackknife estimate of SE is given by

$$\sqrt{\frac{r}{\binom{n}{d}} \sum_s (\hat\theta_{(s)} - \hat\theta_{(\cdot)})^2},$$

where $\hat\theta_{(s)}$ denotes $\hat\theta$ applied to the data set with the subset $s$ removed, $\hat\theta_{(\cdot)} = \sum_s \hat\theta_{(s)} / \binom{n}{d}$, and the sum is over all subsets $s$ of size $d$. It can be shown that if $d$ is appropriately chosen, then the delete-$d$ jackknife estimate of the standard error of the median is consistent.
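A brute-force sketch of the formula above, enumerating all $\binom{n}{d}$ removed subsets (feasible only for small $n$; the function name is chosen here, not from the book):

```python
import numpy as np
from itertools import combinations
from math import comb

def delete_d_jackknife_se(x, stat, d):
    """Delete-d jackknife SE:
    sqrt( r / C(n, d) * sum_s (theta_(s) - theta_(.))^2 )."""
    n = len(x)
    assert n % d == 0, "requires n = r * d for an integer r"
    r = n // d
    # One replication per removed subset s of size d
    reps = np.array([stat(np.delete(x, s))
                     for s in combinations(range(n), d)])
    return np.sqrt(r / comb(n, d) * np.sum((reps - reps.mean()) ** 2))

print(delete_d_jackknife_se(np.array([1.0, 2.0, 3.0, 5.0]), np.mean, 2))
```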