Published by Eustace Shields. Modified over 9 years ago.
1
Bootstrap Event Study Tests
Peter Westfall, ISQS Dept. Joint work with Scott Hein, Finance.
2
An Example of an “Event”
3
Event (Outlier) Detection
Main idea: $y_0$ is an “outlier” if it is unusual with respect to “typical circumstances”.
Definitions:
– Critical value: the threshold $c$ that $y_0$ must exceed to be called an outlier
– $\alpha$ level: the probability that $Y_0$ exceeds $c$ under typical circumstances
– p-value: the probability that $Y_0$ exceeds the particular observed value $y_0$ under typical circumstances
4
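Case 1: Normal distribution, known mean ($\mu$), known variance ($\sigma^2$).
Let $Z = (Y_0 - \mu)/\sigma$. $Y_0$ is associated with an “event” if $|Z|$ is large. Critical and p-values are from the Z (standard normal) distribution.
Example: $y_0 = -7.13$, $\mu = -0.15$, $\sigma = 1.0$ $\Rightarrow$ $Z = -6.98$. The $\alpha = .05$ critical value is $Z_{\alpha/2} = 1.96$; p-value $= 2P(Z < -6.98)$ = 3E-12.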
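The Case 1 arithmetic can be checked in a few lines. This is a minimal Python sketch added for illustration, not part of the original slides; the function name `z_test_two_sided` is hypothetical, and the normal tail area comes from the standard-library `math.erfc`.

```python
import math

def z_test_two_sided(y0, mu, sigma):
    """Return (Z, two-sided p-value) for Y0 when typical values are N(mu, sigma^2)."""
    z = (y0 - mu) / sigma
    # For a standard normal variable, 2 * P(Z > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Values from the slide's example.
z, p = z_test_two_sided(y0=-7.13, mu=-0.15, sigma=1.0)
print(f"Z = {z:.2f}, p-value = {p:.1e}")
```

Using `erfc` rather than `1 - cdf` avoids the loss of precision that would otherwise swamp a tail probability this small.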
5
Case 2: Normal distribution, unknown $\mu$, known $\sigma^2$.
Let $Y_1, \ldots, Y_n$ denote an i.i.d. sample under typical circumstances (excluding $Y_0$). Then
$$Z = \frac{Y_0 - \bar{Y}}{\sigma\sqrt{1 + 1/n}} \sim N(0, 1).$$
6
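Case 3: Normal distribution, unknown $\mu$, unknown $\sigma^2$.
Let $Y_1, \ldots, Y_n$ denote an i.i.d. sample under typical circumstances (excluding $Y_0$). Then
$$T = \frac{Y_0 - \bar{Y}}{s\sqrt{1 + 1/n}} \sim t_{n-1}.$$
Critical and p-values are from the $t_{n-1}$ distribution.
Example: $n = 87$, $y_0 = -7.13$, $\bar{y} = -0.14$, $s = 1.013$ $\Rightarrow$ $T = -6.86$. The $\alpha = .05$ critical value is $t_{87-1,\,\alpha/2} = 1.99$; p-value $= 2P(T_{87-1} < -6.86)$ = 1E-9.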
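The Case 3 example can be reproduced numerically. The sketch below is an added Python illustration, not the authors' code. It assumes the statistic is $T = (y_0 - \bar{y})/(s\sqrt{1 + 1/n})$, the form implied by the $(1 + 1/n)$ variance correction mentioned in the notes; the function name `prediction_t` is hypothetical.

```python
import math

def prediction_t(y0, ybar, s, n):
    """T = (y0 - ybar) / (s * sqrt(1 + 1/n)), referred to t with n - 1 df."""
    return (y0 - ybar) / (s * math.sqrt(1.0 + 1.0 / n))

# Values from the slide's example.
t_stat = prediction_t(y0=-7.13, ybar=-0.14, s=1.013, n=87)
print(f"T = {t_stat:.2f} on {87 - 1} degrees of freedom")

# Compare with the slide's two-sided 5% critical value of 1.99.
exceeds = abs(t_stat) > 1.99
```

The p-value itself needs a t-distribution tail function, which the Python standard library does not provide, so it is left out here.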
9
Notes
The method is essentially asking, “How far into the tail of the typical distribution is $y_0$?”
– Estimation of the mean gives only a minor correction: the factor $(1 + 1/n)$ in the variance formula.
– Estimation of the variance gives another minor correction: $t_{n-1}$ instead of Z critical and p-values.
The central limit theorem does not apply, since we are concerned with the distribution of $Y_0$, not the distribution of $\bar{Y}$.
10
The Distribution of $(Y_0 - \mu)/\sigma$
11
Case 1A: Known Distribution
Exact critical values for Z are
$c_L$ = {$\alpha/2$ quantile of the distribution of Z}
$c_U$ = {$1 - \alpha/2$ quantile of the distribution of Z}
Exact p-value: p-value $= 2\min\{P(Z \le z),\ P(Z \ge z)\}$
12
A Simulation-Based Approach
Simulate many (1,000s of) Z’s at random from the pdf.
Critical values:
– $c_L$ is the $100(\alpha/2)$ percentile of the simulated data
– $c_U$ is the $100(1 - \alpha/2)$ percentile of the simulated data
P-value:
– $p_L$ = proportion of simulated Z’s that are smaller than z
– $p_U$ = proportion of simulated Z’s that are larger than z
– P-value = $2\min(p_L, p_U)$
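The percentile and proportion rules above can be sketched as follows. This is an added Python illustration with some assumptions: a standard normal stands in for the known pdf, the percentile uses linear interpolation (one of several common conventions), and the p-value is capped at 1. The names `percentile` and `simulation_test` are hypothetical.

```python
import random

def percentile(sorted_vals, q):
    """Empirical q-quantile (0 <= q <= 1) by linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def simulation_test(z_obs, sims, alpha=0.05):
    """Return (c_L, c_U, two-sided p-value) computed from simulated Z's."""
    s = sorted(sims)
    c_lower = percentile(s, alpha / 2)
    c_upper = percentile(s, 1 - alpha / 2)
    p_lower = sum(v < z_obs for v in s) / len(s)
    p_upper = sum(v > z_obs for v in s) / len(s)
    # Capping at 1 is an added safeguard, not stated on the slide.
    return c_lower, c_upper, min(1.0, 2 * min(p_lower, p_upper))

rng = random.Random(12345)  # arbitrary seed for reproducibility
sims = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # stand-in pdf: N(0, 1)
c_l, c_u, p = simulation_test(z_obs=2.5, sims=sims)
print(c_l, c_u, p)
```

With a normal stand-in the simulated critical values should land near the familiar -1.96 and 1.96, which gives a quick sanity check before swapping in a non-normal pdf.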
13
Case 1B: Unknown Distribution
Let $Y_1, \ldots, Y_n$ denote an i.i.d. sample under typical circumstances (excluding $Y_0$). Then the empirical distribution function approximates the true distribution function if n is large (Glivenko-Cantelli theorem).
Thus, approximate critical and p-values can be obtained by using the empirical distribution. This is the essential nature of the “bootstrap.”
14
Case 1B.i: Simulation-Based Approach with known $\mu$, $\sigma$
Simulate 1,000s of values of $Z = (Y_0 - \mu)/\sigma$ as follows:
1. Select a value $Y_{01}$ at random from the observed data $Y_1, \ldots, Y_n$; let $Z_1 = (Y_{01} - \mu)/\sigma$
2. Select a value $Y_{02}$ at random from the observed data $Y_1, \ldots, Y_n$; let $Z_2 = (Y_{02} - \mu)/\sigma$
…
B. Select a value $Y_{0B}$ at random from the observed data $Y_1, \ldots, Y_n$; let $Z_B = (Y_{0B} - \mu)/\sigma$
Use the simulated data $Z_1, \ldots, Z_B$ to determine critical and p-values.
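Steps 1 through B amount to resampling single observations and standardizing them. The Python sketch below is an added illustration; the data are made up, and the names `bootstrap_z` and `two_sided_p` are hypothetical.

```python
import random

def bootstrap_z(y, mu, sigma, n_boot, rng):
    """Draw n_boot values Z* = (Y0* - mu) / sigma, with Y0* resampled from y."""
    return [(rng.choice(y) - mu) / sigma for _ in range(n_boot)]

def two_sided_p(z_obs, z_star):
    """P-value = 2 * min(p_L, p_U), capped at 1 (the cap is an added safeguard)."""
    p_lower = sum(z < z_obs for z in z_star) / len(z_star)
    p_upper = sum(z > z_obs for z in z_star) / len(z_star)
    return min(1.0, 2 * min(p_lower, p_upper))

rng = random.Random(2024)  # arbitrary seed
# Made-up "typical circumstances" sample of size 87.
y = [rng.gauss(-0.15, 1.0) for _ in range(87)]
z_star = bootstrap_z(y, mu=-0.15, sigma=1.0, n_boot=5000, rng=rng)
p = two_sided_p((-7.13 - (-0.15)) / 1.0, z_star)
print(p)
```

Because every $Z^*$ is a standardized copy of some observed $Y_i$, the bootstrap distribution here is simply the empirical distribution of the sample. An observed value more extreme than anything in the sample therefore gets a p-value of 0, which reflects the resolution of the data and is not evidence of an exactly zero probability.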
15
Case 1B.ii: Unknown $\mu$, $\sigma$
Use the statistic $T = (Y_0 - \bar{Y})/(s\sqrt{1 + 1/n})$. The distribution of the statistic depends on the randomness inherent in $Y_0$, $\bar{Y}$, and $s$.
16
Case 1B.ii: Simulation-Based Approach
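The body of this slide did not survive in the transcript, so the following Python sketch is only one plausible reading, not the authors' procedure. It assumes each bootstrap replicate draws $n + 1$ values with replacement from the observed sample, treats the first as $Y_0^*$ and the remaining $n$ as the "typical" sample, and recomputes $T^* = (Y_0^* - \bar{Y}^*)/(s^*\sqrt{1 + 1/n})$. All function names are hypothetical and the data are made up.

```python
import math
import random
import statistics

def t_stat(y0, sample):
    """T = (y0 - mean) / (s * sqrt(1 + 1/n)) for a sample of size n."""
    n = len(sample)
    s = statistics.stdev(sample)
    return (y0 - statistics.fmean(sample)) / (s * math.sqrt(1.0 + 1.0 / n))

def bootstrap_t(y, n_boot, rng):
    """Assumed scheme: resample n + 1 values; one plays Y0*, the rest the sample."""
    t_star = []
    while len(t_star) < n_boot:
        draw = rng.choices(y, k=len(y) + 1)
        y0_star, sample_star = draw[0], draw[1:]
        if len(set(sample_star)) == 1:
            continue  # zero variance; skipping such draws is an assumption
        t_star.append(t_stat(y0_star, sample_star))
    return t_star

def two_sided_p(t_obs, t_star):
    """P-value = 2 * min(p_L, p_U), capped at 1."""
    p_lower = sum(t < t_obs for t in t_star) / len(t_star)
    p_upper = sum(t > t_obs for t in t_star) / len(t_star)
    return min(1.0, 2 * min(p_lower, p_upper))

rng = random.Random(99)  # arbitrary seed
y = [rng.gauss(0.0, 1.0) for _ in range(60)]            # made-up typical data
y0 = statistics.fmean(y) - 8.0 * statistics.stdev(y)    # made-up extreme event
t_star = bootstrap_t(y, n_boot=2000, rng=rng)
p = two_sided_p(t_stat(y0, y), t_star)
print(p)
```

Resampling the whole triple ($Y_0^*$, $\bar{Y}^*$, $s^*$) is what lets the bootstrap distribution reflect the estimation of the mean and standard deviation, rather than only the tail shape of the data.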
18
Extension: Market Model
19
Extension: Multivariate Market Model
The MVRM models may be expressed as $R_i = X\beta_i + D\gamma_i + \varepsilon_i$, for $i = 1, \ldots, g$ (firms or portfolios).
Observations within a row of $[\varepsilon_1 | \cdots | \varepsilon_g]$ are correlated; this is called “cross-sectional” correlation.
Observations on $[\varepsilon_1 | \cdots | \varepsilon_g]$ between rows $1, \ldots, n$ are assumed to be independent in the classical MVRM model.
Null hypothesis: $H_0: [\gamma_1 | \cdots | \gamma_g] = [0 | \cdots | 0]$
This multivariate test is computed easily and automatically by standard statistical software packages, using F-tests that are exact under normality. The test is based on Wilks’ Lambda likelihood ratio criterion.
20
Hein, Westfall, Zhang Bootstrap Method
1. Fit the MVRM model. Obtain the F-statistic for testing $H_0$ using the traditional method (assuming normality). Obtain also the $(n+1) \times g$ sample residual matrix $e = [e_1 | \cdots | e_g]$.
2. Exclude the row corresponding to the event from $e$, leaving the $n \times g$ matrix $e^-$.
3. Sample $(n+1)$ row vectors, one at a time and with replacement, from $e^-$. This gives an $(n+1) \times g$ matrix $[R_1^* | \cdots | R_g^*]$.
4. Fit the model $R_i^* = X\beta_i + D\gamma_i + \varepsilon_i$, $i = 1, \ldots, g$, and obtain the test statistic $F^*$ using the same technique used to obtain the F-statistic from the original sample.
5. Repeat steps 3 and 4 NBOOT times. The bootstrap p-value of the test is the proportion of the NBOOT samples yielding an $F^*$-statistic that is greater than or equal to the original F-statistic from step 1.
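The resampling loop in steps 2 to 5 can be sketched in Python as below. This is an added illustration, not the `bootevnt` macro. The model refit in step 4 is delegated to a caller-supplied function `stat_fn` (hypothetical), and the toy statistic used in the example is a simple stand-in, not the Wilks' Lambda F-test. The data are made up.

```python
import random
import statistics

def hwz_bootstrap_pvalue(resid, event_row, stat_fn, f_obs, n_boot, rng):
    """Steps 2-5: resample residual rows and compare each F* with f_obs.

    resid     : list of n + 1 rows, each a list of g residuals
    event_row : index of the event observation to exclude
    stat_fn   : hypothetical callable that refits the model to an
                (n + 1) x g pseudo-data matrix and returns its statistic
    """
    # Step 2: drop the event row, leaving the n x g matrix e-.
    e_minus = [row for i, row in enumerate(resid) if i != event_row]
    exceed = 0
    for _ in range(n_boot):
        # Step 3: draw n + 1 whole rows with replacement; keeping rows
        # intact preserves the cross-sectional correlation.
        pseudo = [list(rng.choice(e_minus)) for _ in range(len(resid))]
        # Step 4: same statistic as for the original sample.
        if stat_fn(pseudo) >= f_obs:
            exceed += 1
    # Step 5: proportion of F* values at least as large as F.
    return exceed / n_boot

def toy_stat(rows):
    """Stand-in statistic (NOT Wilks' Lambda): the last row is the event;
    sum over columns of its squared standardized distance from the rest."""
    *rest, event = rows
    total = 0.0
    for j, v in enumerate(event):
        col = [r[j] for r in rest]
        total += ((v - statistics.fmean(col)) / statistics.stdev(col)) ** 2
    return total

rng = random.Random(7)  # arbitrary seed
data = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)] + [[6.0, -6.0]]
f_obs = toy_stat(data)
means = [statistics.fmean(r[j] for r in data[:-1]) for j in range(2)]
# With an event dummy in the model, the event-row residual is zero.
resid = [[r[j] - means[j] for j in range(2)] for r in data[:-1]] + [[0.0, 0.0]]
p = hwz_bootstrap_pvalue(resid, event_row=40, stat_fn=toy_stat,
                         f_obs=f_obs, n_boot=500, rng=rng)
print(p)
```

Passing the statistic in as a function keeps the sketch independent of any particular regression library; a real analysis would supply a function that refits the full MVRM and returns the F-statistic.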
21
Simulation Study: True Type I error rates
23
Alternative Method (Kramer, 2001)
Test statistic is $Z = \sum_i t_i / (g^{1/2} s_t)$, where $t_i$ is the t-statistic from the univariate dummy-variable-based regression model for firm $i$, and $s_t$ is the sample standard deviation of the $g$ t-statistics.
Algorithm:
(i) create a pseudo-population of t-statistics $t_i^* = t_i - \bar{t}$, reflecting the null hypothesis case where the true mean of the t-statistics is zero,
(ii) sample $g$ values with replacement from the pseudo-population and compute $Z^*$ from these pseudo-values,
(iii) repeat (ii) NBOOT times, obtaining $Z_1^*, \ldots, Z_{NBOOT}^*$.
The p-value for the test is then $2\min(p_U, p_L)$, where $p_L$ is the proportion of the NBOOT bootstrap samples yielding $Z_i^* \le Z$, and $p_U$ is the proportion of the NBOOT samples yielding $Z_i^* \ge Z$.
Assumption: the statistics are cross-sectionally independent.
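Kramer's algorithm can be sketched in Python as follows. This is an added illustration, not the authors' implementation. It assumes $Z = \sum_i t_i/(g^{1/2} s_t)$ and $t_i^* = t_i - \bar{t}$, discards bootstrap samples whose $g$ values are all identical (so that $s_t = 0$), and uses four made-up t-statistics. The function names are hypothetical.

```python
import math
import random
import statistics

def kramer_z(t_stats):
    """Z = sum(t_i) / (sqrt(g) * s_t), with s_t the sample standard deviation."""
    g = len(t_stats)
    return sum(t_stats) / (math.sqrt(g) * statistics.stdev(t_stats))

def kramer_bootstrap_p(t_stats, n_boot, rng):
    """Return (Z, two-sided bootstrap p-value, number of zero-variance draws)."""
    z_obs = kramer_z(t_stats)
    t_bar = statistics.fmean(t_stats)
    pseudo = [t - t_bar for t in t_stats]           # (i) centered pseudo-population
    z_star, n_degenerate = [], 0
    for _ in range(n_boot):                         # (iii) repeat NBOOT times
        draw = rng.choices(pseudo, k=len(pseudo))   # (ii) g values with replacement
        if len(set(draw)) == 1:
            n_degenerate += 1   # zero variance, Z* undefined; skipped (assumption)
            continue
        z_star.append(kramer_z(draw))
    p_lower = sum(z <= z_obs for z in z_star) / len(z_star)
    p_upper = sum(z >= z_obs for z in z_star) / len(z_star)
    return z_obs, min(1.0, 2 * min(p_lower, p_upper)), n_degenerate

rng = random.Random(182161)  # arbitrary seed
t_stats = [2.1, 1.7, 2.6, 1.9]  # made-up t-statistics for g = 4 firms
z_obs, p, n_deg = kramer_bootstrap_p(t_stats, n_boot=20000, rng=rng)
print(z_obs, p, n_deg / 20000)
```

With $g = 4$ distinct t-statistics, the chance that a resample repeats a single value four times is $4/4^4 \approx 1.56\%$, so a small share of zero-variance bootstrap samples is expected whenever $g$ is this small.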
24
Modified Kramer Method
Model-based bootstrap Kramer: Bootstrap Kramer’s $Z = \sum_i t_i / (g^{1/2} s_t)$, but by resampling MVRM residual vectors as in HWZ.
Model-based sum t: Bootstrap $S_t = \sum_i t_i$ by resampling MVRM residual vectors as in HWZ.
25
Table 1. Simulated Type I error rates as a function of cross-sectional correlation.
28
/*--------------------------------------------------------------*/
/* Name:    bootevnt */
/* Title:   Macro to calculate bootstrap p-values for event */
/*          studies */
/* Author:  Peter H. Westfall, westfall@ttu.edu */
/* Release: SAS Version 6.12 or higher, requires SAS/IML */
/*--------------------------------------------------------------*/
/* Inputs: */
/* */
/* DATASET   = Data set to be analyzed (required) */
/* */
/* YVARS     = List of y variables used in the multivariate */
/*             regression model, separated by blanks (required) */
/* */
/* XVARS     = List of x variables used in the multivariate */
/*             regression model, separated by blanks (required) */
/* */
/* EVENT     = Name of dummy variable indicating event */
/*             observation (e.g., day). This is required. */
/* */
/* EXCLUDE   = Name of dummy variable indicating days that */
/*             should be excluded from the resampling. If there */
/*             are multiple event days in the model, then all */
/*             those days should be excluded because the */
/*             residuals are mathematically zero. If there are */
/*             not multiple event days, then the EXCLUDE */
/*             variable should be identical to the EVENT */
/*             variable. */
/* */
/* NBOOT     = Number of bootstrap samples. This input is */
/*             required. Pick a number as large as possible */
/*             subject to time constraints. Start with 100 */
/*             and work your way up, noting the accuracy as */
/*             given by the confidence interval in the output. */
/* */
/* MODELBOOT = 1 for requesting model-based bootstrap tests, */
/*           = 0 to exclude them. */
/* */
/* NPBOOT    = 1 to request Kramer's nonparametric bootstrap */
/*             tests, = 0 to exclude them. */
/* */
/* SEED      = Seed value for random numbers (0 default) */
/* */
/*--------------------------------------------------------------*/
/* Output: This macro computes normality-assuming exact */
/* p-values and bootstrap approximate p-values that do not */
/* require the normality assumption. A 95% confidence interval */
/* for the true bootstrap p-value (which itself is approximate */
/* because it uses the empirical, not the true, residual */
/* distribution) also is given. */
/*--------------------------------------------------------------*/
29
Invocation of Macro
libname fin "c:\research\coba";
data sinkey;
  set fin.sinkey;
run;
%bootevnt(dataset=sinkey,
          yvars=pr1 pr2 pr3 pr4,
          xvars=ds m1 m2 m3 dsm d2 d3 d4 d5 d6,
          event=d1,
          exclude=exclude,
          nboot=1000,
          modelboot=1,
          npboot=1,
          seed=182161);
30
Normality-Assuming Tests for Event
TSQ        F          NDF  DDF  PVAL
15.025505  3.6957895  4    183  0.0064153

NBOOT
Model-based bootstrap Binder p-value, using 20000 samples with 95% confidence limits on the true bootstrap p-value
BOOTP    LCL        UCL
0.01115  0.0096947  0.0126053
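The confidence limits on the bootstrap p-value can be checked with a normal-approximation binomial interval. The macro's actual formula is not shown in the slides, so this Python sketch rests on that assumption; the function name `bootstrap_p_ci` is hypothetical. Under this assumption the interval agrees with the BOOTP limits above to the printed precision.

```python
import math

def bootstrap_p_ci(p_hat, n_boot, z=1.96):
    """Normal-approximation 95% limits for a proportion estimated from n_boot draws."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_boot)
    return p_hat - half_width, p_hat + half_width

# BOOTP and the number of samples from the output above.
lcl, ucl = bootstrap_p_ci(0.01115, 20000)
print(f"LCL = {lcl:.7f}, UCL = {ucl:.7f}")
```

The half-width shrinks like $1/\sqrt{\text{NBOOT}}$, which is why the macro header suggests increasing NBOOT until the interval is acceptably narrow.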
31
Model-based bootstrap Kramer p-value, using 20000 samples with 95% confidence limits on the true bootstrap p-value
BOOTKP  LCLK       UCLK
0.0609  0.0561373  0.0656627

NBOOT
Model-based bootstrap Sum t p-value, using 20000 samples with 95% confidence limits on the true bootstrap p-value
BOOTTSUMP  LCLSUMT    UCLSUMT
0.0001     -0.000096  0.000296
32
1.55 % of the bootstrap samples had 0 variance

NBOOT
Nonparametric bootstrap Kramer p-value, using 20000 samples with 95% confidence limits on the true bootstrap p-value
BOOTTNP  LCLNP      UCLNP
0.1404   0.1333184  0.1452147
33
Robustness of Bootstrap to Serial Correlation
Recall that the method is essentially a comparison of $Y_0$ to the distribution of $Y_1, \ldots, Y_n$.
If the empirical distribution of $Y_1, \ldots, Y_n$ converges to $F$, then the unconditional null probability of an “event” also converges to $\alpha = F(c_{\alpha/2}) + (1 - F(c_{1-\alpha/2}))$, where $c_q$ denotes the $q$ quantile of $F$.
Such convergence occurs for typical stationary time series processes.
34
Conclusions
We use t, not z, even when n is large. Why? Because t is generally more accurate. We should use bootstrap tests instead of traditional tests for precisely the same reason.
We must account for cross-sectional correlation in the analysis.
The recommended method is our bootstrap with a modification of Kramer’s Z (the model-based sum t method).
Software is available from westfall@ba.ttu.edu