Estimates of Bias & The Jackknife
"An Introduction to the Bootstrap", Bradley Efron & Robert J. Tibshirani, Chapters 10-11
M.Sc. Seminar in Statistics, TAU, March '17
By Aviv Navon
Estimates of Bias Chapter 10
Intro
Up until now, we have concentrated on the standard error as a measure of the accuracy of an estimator $\hat\theta$ (with bootstrap estimate $\widehat{se}_B$). We now focus on the bias of the estimator, which is defined as
$\mathrm{bias}_F(\hat\theta, \theta) = \mathbb{E}_F[\hat\theta] - t(F)$
The Bootstrap Estimate of Bias
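Consider the nonparametric one-sample situation, where $\hat\theta = s(X)$. We have
$\mathrm{bias}_F(\hat\theta, \theta) = \mathbb{E}_F[s(X)] - t(F)$
We define the bootstrap estimate of bias as
$\mathrm{bias}_{\hat F} = \mathbb{E}_{\hat F}[s(X^*)] - t(\hat F)$
Note that $\mathrm{bias}_{\hat F}$ is the plug-in estimate of $\mathrm{bias}_F$. In general we need to approximate $\mathrm{bias}_{\hat F}$ by simulation:
1. Generate $B$ bootstrap samples $X^{*1}, \dots, X^{*B}$.
2. Approximate $\mathbb{E}_{\hat F}[s(X^*)]$ by $\hat\theta^*(\cdot) = \sum_{b=1}^{B} \hat\theta^*(b)/B = \sum_{b=1}^{B} s(X^{*b})/B$.
3. Estimate the bias by $\widehat{\mathrm{bias}}_B = \hat\theta^*(\cdot) - t(\hat F)$.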
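The simulation steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the seminar: the function names, the choice of the plug-in variance as the statistic, and the data values are all assumptions made for the example. It also assumes that $s(X)$ is itself the plug-in estimate, so the same function supplies both $s(X^{*b})$ and $t(\hat F)$.

```python
import random
import statistics


def bootstrap_bias(data, stat, B=2000, seed=0):
    """Approximate the bootstrap bias estimate theta*(.) - t(F_hat).

    Assumption: `stat` is the plug-in statistic, so stat(data) = t(F_hat).
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    # Step 1 and 2: draw B bootstrap samples (size n, with replacement)
    # and evaluate the statistic on each one.
    replications = [stat(rng.choices(data, k=n)) for _ in range(B)]
    theta_star_mean = statistics.fmean(replications)
    # Step 3: bias estimate is the mean replication minus the plug-in value.
    return theta_star_mean - theta_hat


def plugin_variance(xs):
    """Plug-in variance (divides by n), a statistic known to be biased downward."""
    return statistics.pvariance(xs)


# Hypothetical data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]
bias = bootstrap_bias(sample, plugin_variance)
print(f"bootstrap bias estimate: {bias:.4f}")
```

For the plug-in variance the ideal bootstrap bias is $-\hat\sigma^2/n$, so the printed value should be negative and close to that quantity, up to Monte Carlo error.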
Example – The Patch Data
Eight subjects wore medical patches designed to infuse a certain hormone into the bloodstream. Each subject had his blood level of the hormone measured after wearing three different patches: old, new, and placebo. We are interested in the parameter
$\theta = \dfrac{\mathbb{E}(\text{new}) - \mathbb{E}(\text{old})}{\mathbb{E}(\text{old}) - \mathbb{E}(\text{placebo})}$
Denote $z = \text{old} - \text{placebo}$, $y = \text{new} - \text{old}$, and $x_i = (z_i, y_i)$, where the $x_i$ are obtained by random sampling from an unknown distribution $F$. We have
$\theta = t(F) = \dfrac{\mathbb{E}_F(y)}{\mathbb{E}_F(z)}$
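The Patch Data – Cont.
The plug-in estimate of $\theta$ is
$\hat\theta = t(\hat F) = \dfrac{\bar y}{\bar z}$
We take $s(X) = \hat\theta = \bar y / \bar z$, thus $\hat\theta = -.0713$. Next we generate $B = 400$ bootstrap samples and get $\hat\theta^*(\cdot) = -.0670$, which yields
$\widehat{\mathrm{bias}}_{400} = \hat\theta^*(\cdot) - t(\hat F) = -.0670 - (-.0713) = .0043$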
The Patch Data – Cont.
We know that $B = 400$ is usually more than enough to obtain a good estimate of standard error. Is it enough to obtain a good estimate of bias? As it turns out, in this particular case, the answer is no.
* In general, biases are harder to estimate than standard errors.
An Improved Estimate of Bias
It turns out that there is a better method to approximate $\widehat{\mathrm{bias}}_\infty = \mathrm{bias}_{\hat F}$. The better method applies when $\hat\theta = t(\hat F)$ is the plug-in estimate of $t(F)$.
Notation
For a given bootstrap sample $X^* = (x_1^*, \dots, x_n^*)$, we define the resampling vector as $P^* = (p_1^*, \dots, p_n^*)$, where $p_j^* = \#\{i : x_i^* = x_j\}/n$.
Denote $P^0 = (1/n, \dots, 1/n)$.
For $\hat\theta = t(\hat F)$, denote $\hat\theta^* = T(P^*)$ (note that $T(P^0) = \hat\theta = t(\hat F)$).
The bootstrap samples $X^{*1}, \dots, X^{*B}$ correspond to resampling vectors $P^{*1}, \dots, P^{*B}$. Define $\bar P^*$ to be the average of these vectors.
An Improved Estimate of Bias – Cont.
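We can write the bootstrap bias estimate as
$\widehat{\mathrm{bias}}_B = \hat\theta^*(\cdot) - T(P^0)$
The better bootstrap bias estimate, denoted by $\overline{\mathrm{bias}}_B$, is
$\overline{\mathrm{bias}}_B = \hat\theta^*(\cdot) - T(\bar P^*)$
Both $\widehat{\mathrm{bias}}_B$ and $\overline{\mathrm{bias}}_B$ converge to $\widehat{\mathrm{bias}}_\infty = \mathrm{bias}_{\hat F}$, the ideal bootstrap estimate of bias, as $B \to \infty$, but the convergence is much faster for $\overline{\mathrm{bias}}_B$.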
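A minimal Python sketch of the two estimates, for a ratio statistic written in resampling form $T(P) = \sum_j p_j y_j / \sum_j p_j z_j$. The function names and the data are hypothetical and chosen only for illustration; the $z$ values are kept positive so the denominator never vanishes.

```python
import random
from collections import Counter


def ratio_T(p, z, y):
    """Resampling form T(P) of the ratio statistic: sum(p*y) / sum(p*z)."""
    num = sum(pj * yj for pj, yj in zip(p, y))
    den = sum(pj * zj for pj, zj in zip(p, z))
    return num / den


def bias_estimates(z, y, B=400, seed=0):
    """Return (ordinary, improved) bootstrap bias estimates of mean(y)/mean(z)."""
    rng = random.Random(seed)
    n = len(z)
    p0 = [1.0 / n] * n
    p_sum = [0.0] * n
    theta_star_sum = 0.0
    for _ in range(B):
        # Resample indices with replacement; p*_j is the proportion of
        # the bootstrap sample equal to the j-th original observation.
        counts = Counter(rng.choices(range(n), k=n))
        p_star = [counts[j] / n for j in range(n)]
        theta_star_sum += ratio_T(p_star, z, y)
        p_sum = [a + b for a, b in zip(p_sum, p_star)]
    theta_star_mean = theta_star_sum / B
    p_bar = [a / B for a in p_sum]  # average resampling vector
    ordinary = theta_star_mean - ratio_T(p0, z, y)
    improved = theta_star_mean - ratio_T(p_bar, z, y)
    return ordinary, improved


# Hypothetical paired data, for illustration only.
z = [5.0, 7.5, 6.2, 9.1, 4.4, 8.3]
y = [0.4, -0.9, 1.2, -0.3, 0.8, -1.1]
ordinary, improved = bias_estimates(z, y)
print(f"ordinary: {ordinary:.5f}, improved: {improved:.5f}")
```

The two estimates differ only in the point at which $T$ is evaluated: $P^0$ for the ordinary estimate and the average resampling vector for the improved one.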
Back to the Patch Data Example
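Note that for the patch example,
$\hat\theta^* = T(P^*) = \dfrac{\sum_j p_j^* y_j}{\sum_j p_j^* z_j}$
Using the 400 bootstrap samples from before, we have
$T(\bar P^*) = \dfrac{\sum_j \bar p_j^* y_j}{\sum_j \bar p_j^* z_j} = -.0750$
and $\overline{\mathrm{bias}}_{400} = -.0670 - (-.0750) = .0080$, compared to $\widehat{\mathrm{bias}}_{400} = .0043$.
[Figure: $\widehat{\mathrm{bias}}_B$ and $\overline{\mathrm{bias}}_B$ plotted against $B$, one as a solid line and the other as a dashed line; a dashed horizontal line marks $\widehat{\mathrm{bias}}_\infty$, approximated by .0079.]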
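Bias Correction
If $\widehat{\mathrm{bias}}$ is an estimate of $\mathrm{bias}_F(\hat\theta, \theta)$, then the obvious bias-corrected estimator is
$\bar\theta = \hat\theta - \widehat{\mathrm{bias}}$
Taking $\widehat{\mathrm{bias}}$ equal to $\widehat{\mathrm{bias}}_B = \hat\theta^*(\cdot) - \hat\theta$ gives
$\bar\theta = 2\hat\theta - \hat\theta^*(\cdot)$
Bias correction can be dangerous to use in practice because $\widehat{\mathrm{bias}}$ is highly variable. The reduction in bias may be outweighed by an increase in standard error, resulting in a larger MSE.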
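As a worked instance of the correction formula, the patch-data values quoted earlier ($\hat\theta = -.0713$ and $\hat\theta^*(\cdot) = -.0670$ from $B = 400$) can be substituted directly. This only illustrates the arithmetic. Given the caution above, it is not a recommendation to use the corrected value.

```latex
\bar\theta = 2\hat\theta - \hat\theta^*(\cdot)
           = 2(-.0713) - (-.0670)
           = -.1426 + .0670
           = -.0756
```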
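Framework
Real World: $P \to X = (x_1, \dots, x_n)$; $\theta = \theta(P)$, $\hat\theta = s(X)$; $\mathrm{bias}_P(\hat\theta, \theta)$
Bootstrap World: $\hat P \to X^* = (x_1^*, \dots, x_n^*)$; $\hat\theta = \theta(\hat P)$, $\hat\theta^* = s(X^*)$; $\mathrm{bias}_{\hat P}(\hat\theta^*, \hat\theta)$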
The Jackknife Chapter 11
Intro
The jackknife predates the bootstrap and bears close similarities to it. We will present the jackknife method for estimating bias and standard error.
Notation
Let $x_{(i)} = (x_1, x_2, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$, for $i = 1, \dots, n$, be the $i$th jackknife sample.
Let $\hat\theta_{(i)} = s(x_{(i)})$ be the $i$th jackknife replication of $\hat\theta$.
Denote $\hat\theta_{(\cdot)} = \sum_{i=1}^{n} \hat\theta_{(i)}/n$.
Jackknife Estimates for Bias and SE
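The jackknife estimate of bias is defined by
$\widehat{\mathrm{bias}}_{jack} = (n-1)\,(\hat\theta_{(\cdot)} - \hat\theta)$
The jackknife estimate of standard error is defined by
$\widehat{se}_{jack} = \left[ \dfrac{n-1}{n} \sum_{i=1}^{n} (\hat\theta_{(i)} - \hat\theta_{(\cdot)})^2 \right]^{1/2}$
Where do these formulae come from?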
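The two definitions translate directly into code. The sketch below is illustrative only: the function name and the data are assumptions, not part of the seminar. Two sanity checks hint at where the formulae come from: for the sample mean, the jackknife standard error reduces to the familiar $s/\sqrt{n}$ and the bias estimate is zero. For the plug-in variance, the bias estimate equals $-s^2/n$, where $s^2$ is the unbiased sample variance.

```python
import math
import statistics


def jackknife(data, stat):
    """Return (bias_jack, se_jack) for the statistic `stat` on a list `data`."""
    n = len(data)
    theta_hat = stat(data)
    # i-th jackknife replication: the statistic with observation i left out.
    reps = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    theta_dot = sum(reps) / n
    bias = (n - 1) * (theta_dot - theta_hat)
    se = math.sqrt((n - 1) / n * sum((r - theta_dot) ** 2 for r in reps))
    return bias, se


# Hypothetical data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]

bias_mean, se_mean = jackknife(sample, statistics.fmean)
bias_var, se_var = jackknife(sample, statistics.pvariance)
print(f"mean:     bias={bias_mean:.4f}, se={se_mean:.4f}")
print(f"variance: bias={bias_var:.4f}, se={se_var:.4f}")
```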
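Back to the Patch Data
Let us estimate the bias and SE for the patch data and the parameter $\theta = \mathbb{E}y / \mathbb{E}z$. Recall that $\widehat{\mathrm{bias}}_{400} = .0043$ and $\overline{\mathrm{bias}}_{400} = .0080$. In addition, $\widehat{se}_{200} = .105$. Using the jackknife method, we get $\widehat{\mathrm{bias}}_{jack} = .0080$ and $\widehat{se}_{jack} = .106$.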
Jackknife vs. Bootstrap
The jackknife can be viewed as an approximation to the bootstrap:
Consider a linear statistic of the form $\hat\theta = s(X) = \mu + \frac{1}{n} \sum_{i} \alpha(x_i)$, where $\mu$ is a constant and $\alpha(\cdot)$ is a function (example: $\bar X$, with $\mu = 0$ and $\alpha(x_i) = x_i$). For such a statistic, the jackknife and bootstrap estimates of SE agree, up to a factor of $\sqrt{(n-1)/n}$.
For non-linear statistics, the jackknife makes a linear approximation to the bootstrap: that is, it agrees with the bootstrap for a certain linear statistic that approximates $\hat\theta$.
Similarly, the jackknife estimate of bias approximates the bootstrap estimate in terms of quadratic statistics of the form $\hat\theta = \mu + \frac{1}{n} \sum_{i=1}^{n} \alpha(x_i) + \frac{1}{n^2} \sum_{1 \le i < j \le n} \beta(x_i, x_j)$.
* More details in Chapter 20
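For the sample mean, the agreement between the two standard errors can be checked by hand. The short derivation below is a sketch added for illustration; $\widehat{se}_\infty$ is used here as shorthand for the ideal ($B \to \infty$) bootstrap standard error.

```latex
\begin{align*}
\hat\theta_{(i)} &= \frac{n\bar x - x_i}{n-1}, \qquad
\hat\theta_{(\cdot)} = \bar x, \qquad
\hat\theta_{(i)} - \hat\theta_{(\cdot)} = \frac{\bar x - x_i}{n-1}, \\
\widehat{se}_{jack}^{\,2} &= \frac{n-1}{n} \sum_{i=1}^{n} \frac{(x_i - \bar x)^2}{(n-1)^2}
  = \frac{\sum_{i} (x_i - \bar x)^2}{n(n-1)}, \\
\widehat{se}_{\infty}^{\,2} &= \mathrm{Var}_{\hat F}(\bar x^*)
  = \frac{1}{n} \cdot \frac{1}{n} \sum_{i} (x_i - \bar x)^2
  = \frac{\sum_{i} (x_i - \bar x)^2}{n^2}, \\
\frac{\widehat{se}_{\infty}}{\widehat{se}_{jack}} &= \sqrt{\frac{n-1}{n}}.
\end{align*}
```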
Failure of the Jackknife
The jackknife can fail miserably if the statistic $\hat\theta$ is not “smooth”. A simple example is the median. For example, consider the observations 10, 27, 31, 40, 46, 50, 52, 104, 146. We get $\widehat{se}_{jack} = 6.68$. Using 100 bootstrap samples, we get $\widehat{se}_{100} = 9.58$, considerably larger than the jackknife value. Moreover, it can be shown that as $n \to \infty$, $\widehat{se}_{jack}$ fails to converge to the true standard error (for $\hat\theta = \mathrm{median}(X)$).
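The lack of smoothness is easy to see by listing the jackknife replications of the median for the nine observations above. The following Python sketch (variable names are illustrative) shows that the replications take only three distinct values, and it reproduces the jackknife standard error of 6.68.

```python
import math
import statistics

obs = [10, 27, 31, 40, 46, 50, 52, 104, 146]  # observations from the example
n = len(obs)

# Leave-one-out medians: each is the average of the two middle values
# of the remaining eight observations.
reps = [statistics.median(obs[:i] + obs[i + 1:]) for i in range(n)]
theta_dot = sum(reps) / n
se_jack = math.sqrt((n - 1) / n * sum((r - theta_dot) ** 2 for r in reps))

print(sorted(set(reps)))   # [43.0, 45.0, 48.0]
print(round(se_jack, 2))   # 6.68
```

Deleting a single point barely moves the median, so the replications carry very little information about its sampling variability.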
The Delete-𝒅 Jackknife
Instead of leaving out one observation at a time, we leave out $d$ observations, where $n = r \cdot d$ for some integer $r$. There are therefore $\binom{n}{d}$ jackknife samples (a number that can get very large). The delete-$d$ jackknife estimate of SE is given by
$\left[ \dfrac{r}{\binom{n}{d}} \sum_{s} (\hat\theta_{(s)} - \hat\theta_{(\cdot)})^2 \right]^{1/2}$
where $\hat\theta_{(s)}$ denotes $\hat\theta$ applied to the data set with subset $s$ removed, $\hat\theta_{(\cdot)} = \sum_{s} \hat\theta_{(s)} / \binom{n}{d}$, and the sum is over all subsets $s$ of size $n - d$.
It can be shown that if $d$ is appropriately chosen, then the delete-$d$ jackknife estimate of the standard error of the median is consistent.