Applications of the bootstrap method to finance. Chin-Ping King.


Population distribution function F and empirical distribution function (EDF) F_n
F(x_1, x_2, ..., x_n), where x = (x_1, x_2, ..., x_n)
F_n(x_1*, x_2*, ..., x_n*), where x* = (x_1*, x_2*, ..., x_n*)
Probability that an element of the population falls in a given set: p
Corresponding proportion under the EDF: p_n
n·p_n ~ Binomial(n, p), so p_n has mean p and variance p(1-p)/n
By the Weak Law of Large Numbers and the Central Limit Theorem:
n^(1/2) (F_n - F) →_d N(0, p(1-p))

Estimation of standard deviation and bias
Estimator of θ: θ' = s(x)
Standard deviation: se = {Σ_{j=1}^n [θ'_j - θ'(.)]^2 / (n-1)}^(1/2), where θ'(.) = Σ_{j=1}^n θ'_j / n
Bias: bias = E[θ'] - θ
Root mean square error of an estimator θ' for θ:
E[(θ' - θ)^2] = se^2 + bias^2, so {E[(θ' - θ)^2]}^(1/2) = se·{1 + (bias/se)^2}^(1/2) ≈ se·{1 + (1/2)(bias/se)^2}
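The RMSE identity above can be checked numerically; the se and bias values below are made-up illustrations, not figures from the slides.

```python
import math

# Illustrative (made-up) standard error and bias of some estimator
se, bias = 0.5, 0.1

# Exact root mean square error: sqrt(se^2 + bias^2)
rmse = math.sqrt(se**2 + bias**2)

# First-order expansion: se * (1 + (1/2) * (bias/se)^2)
approx = se * (1 + 0.5 * (bias / se)**2)

print(rmse, approx)  # the two agree closely when bias is small relative to se
```

The expansion makes the point on the slide concrete: a bias that is small relative to se barely inflates the RMSE.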

Nonparametric bootstrap
The bootstrap algorithm for estimating standard errors (or bias):
1. Select B independent bootstrap samples x*1, x*2, ..., x*B, each consisting of n data points drawn with replacement from x. The total number of distinct bootstrap samples is C(2n-1, n).
2. Evaluate the bootstrap replication corresponding to each bootstrap sample: θ'*(b) = s(x*b), b = 1, 2, ..., B.
3. Estimate the standard error (or bias) by the sample standard deviation (or bias) of the B replications:
se'_B = {Σ_{b=1}^B [θ'*(b) - θ'*(.)]^2 / (B-1)}^(1/2), where θ'*(.) = Σ_{b=1}^B θ'*(b) / B
bias'_B = θ'*(.) - θ'
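The three steps above can be sketched in a few lines; the simulated data, the choice of the sample mean as s(·), and B = 2000 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=2.0, size=50)   # illustrative observed sample, n = 50
B = 2000                                       # number of bootstrap samples

def s(sample):
    """Statistic of interest (here: the sample mean)."""
    return sample.mean()

# Steps 1 and 2: resample with replacement, evaluate the replication on each
reps = np.array([s(rng.choice(x, size=x.size, replace=True)) for _ in range(B)])

# Step 3: sample standard deviation and bias of the B replications
se_B = reps.std(ddof=1)
bias_B = reps.mean() - s(x)
print(se_B, bias_B)
```

For the mean, se_B should land near the textbook value s/sqrt(n), which is a useful sanity check on any bootstrap implementation.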

A schematic diagram of the nonparametric bootstrap
Unknown population F → observed random sample x = (x_1, x_2, ..., x_n) → statistic of interest θ' = s(x)
Empirical distribution F_n → bootstrap sample x* = (x_1*, x_2*, ..., x_n*) → bootstrap replication θ'*(b) = s(x*b)

Parametric bootstrap
The functional form of the population probability distribution F is known, but its parameters are not.
Fit the parameters to the data to obtain a parametric estimate F_par of the population distribution.
Draw B samples of size n from F_par: F_par → x* = (x_1*, x_2*, ..., x_n*)
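A minimal parametric-bootstrap sketch, assuming (purely for illustration) a normal population: fit the mean and standard deviation from the data, then resample from the fitted distribution rather than from x itself.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(5.0, 1.5, size=40)              # illustrative observed sample

# Parametric estimate F_par: a normal with fitted parameters
mu_hat, sigma_hat = x.mean(), x.std(ddof=1)

B = 1000
# Draw bootstrap samples from F_par, not from the empirical distribution
reps = np.array([rng.normal(mu_hat, sigma_hat, size=x.size).mean()
                 for _ in range(B)])
se_param = reps.std(ddof=1)
print(se_param)
```

The only change from the nonparametric version is where the resamples come from; everything downstream (replications, se, bias) is identical.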

Error in bootstrap estimates
m_i = the i-th moment of the bootstrap distribution of θ'
Var(se'_B) = Var(m_2^(1/2)) + E[m_2(Δ + 2)/(4B)]
Δ = m_4/m_2^2 - 3, the kurtosis of the bootstrap distribution of θ'
Var(m_2^(1/2)): sampling variation; it approaches zero as the sample size n approaches infinity
E[m_2(Δ + 2)/(4B)]: resampling variation; it approaches zero as B approaches infinity

Confidence intervals based on bootstrap percentiles
(1-α) percentile interval: [θ'_%low, θ'_%up] = [θ'*(α/2)_B, θ'*(1-α/2)_B]
θ'*(α/2)_B: the 100·(α/2)-th empirical percentile, i.e. the B·(α/2)-th value in the ordered list of the B replications of θ'*
θ'*(1-α/2)_B: the 100·(1-α/2)-th empirical percentile, i.e. the B·(1-α/2)-th value in the ordered list of the B replications of θ'*
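Reading the interval off the ordered replications can be sketched as follows; the data, B, and α are illustrative, and the indexing follows the ordered-list rule above.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=60)              # illustrative sample
B, alpha = 2000, 0.05

reps = np.sort([rng.choice(x, size=x.size, replace=True).mean()
                for _ in range(B)])            # ordered list of replications

lo = reps[int(B * alpha / 2)]                  # ~B*(alpha/2)-th ordered value
up = reps[int(B * (1 - alpha / 2)) - 1]        # ~B*(1-alpha/2)-th ordered value
print(lo, up)
```

In practice `np.quantile(reps, [alpha/2, 1 - alpha/2])` does the same job with better interpolation at the edges.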

Percentile interval lemma
Suppose the transformation ψ' = t(θ') perfectly normalizes the distribution of θ': ψ' ~ N(ψ, c^2) for some standard deviation c. Then the percentile interval based on θ' equals
[t^(-1)(ψ' - z^(1-α/2)·c), t^(-1)(ψ' - z^(α/2)·c)]
Example: θ' = exp(x), x ~ N(0,1), ψ' = t(θ') = log θ'

Coverage performance
Results of 300 confidence-interval realizations for θ' = exp(x):

Method                           % miss left    % miss right
Standard normal interval
Bootstrap percentile interval

Miss left: left endpoint > 1. Miss right: right endpoint < 1.

Transformation-respecting property
The percentile interval for any monotone parameter transformation ψ' = t(θ') is simply the percentile interval for θ' mapped by t:
[ψ'_%low, ψ'_%up] = [t(θ'_%low), t(θ'_%up)]
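This property is easy to verify numerically: because a monotone map preserves ordering, the ordered-list endpoints map through t exactly. The lognormal replications and t = log below are illustrative assumptions, matching the slide's earlier example.

```python
import numpy as np

rng = np.random.default_rng(3)
reps = np.exp(rng.normal(0.0, 1.0, size=2001))   # simulated replications of theta'
reps.sort()

lo_idx, up_idx = 50, 1950                        # ~2.5% and ~97.5% ordered values
theta_int = (reps[lo_idx], reps[up_idx])         # percentile interval for theta'

# Percentile interval computed on the transformed scale psi' = log(theta')
psi_int = (np.log(reps)[lo_idx], np.log(reps)[up_idx])

# The endpoints map exactly through t = log
print(np.log(theta_int[0]) == psi_int[0], np.log(theta_int[1]) == psi_int[1])
```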

Better bootstrap confidence intervals
(1-α) BCa interval: [θ'_low, θ'_up] = [θ'*(α1), θ'*(α2)]
α1 and α2 are obtained by applying the standard normal cumulative distribution function to bias- and acceleration-correction formulas for the bootstrap replications.
The BCa interval is transformation respecting.

Accuracy of bootstrap confidence intervals
For (1-α) coverage, approximate confidence-interval endpoints θ'_low and θ'_up are called first-order accurate if:
Pr(θ ≤ θ'_low) = α/2 + O(n^(-1/2)), Pr(θ ≥ θ'_up) = α/2 + O(n^(-1/2))
and second-order accurate if:
Pr(θ ≤ θ'_low) = α/2 + O(n^(-1)), Pr(θ ≥ θ'_up) = α/2 + O(n^(-1))
Percentile interval: first-order accurate. BCa interval: second-order accurate.

Calibration of confidence-interval points
1. Generate B bootstrap samples x*1, x*2, ..., x*B. For each sample b = 1, 2, ..., B:
   1a) Compute a λ-level confidence-interval point θ'*_λ(b) for a grid of values of λ, where θ'*_λ(b) can be θ'*(b) - z^(1-λ)·se'*(b).
2. For each λ compute p'(λ) = #{θ' ≤ θ'*_λ(b)} / B.
3. Find the value of λ satisfying p'(λ) = α/2.
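The steps above can be sketched as follows, with one deliberate simplification: the per-sample standard error se'*(b) is replaced by a single plug-in value, rather than the nested bootstrap of the full algorithm. Data, B, and the λ grid are illustrative.

```python
from statistics import NormalDist
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, size=50)
theta_hat = x.mean()                             # theta'
B, alpha = 2000, 0.05

# Step 1: bootstrap replications theta'*(b)
reps = np.array([rng.choice(x, size=x.size, replace=True).mean()
                 for _ in range(B)])
se_plug = x.std(ddof=1) / np.sqrt(x.size)        # plug-in se (simplification)

# Steps 1a and 2: p'(lambda) over a grid, with
# theta'*_lambda(b) = theta'*(b) - z_{1-lambda} * se
lambdas = np.linspace(0.001, 0.2, 200)
p = np.array([(theta_hat <= reps - NormalDist().inv_cdf(1 - lam) * se_plug).mean()
              for lam in lambdas])

# Step 3: lambda whose empirical level is closest to alpha/2
lam_star = lambdas[np.abs(p - alpha / 2).argmin()]
print(lam_star)
```

For well-behaved statistics, lam_star should come out near α/2 itself; calibration matters when the nominal and empirical levels disagree.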

Calibration of the percentile interval and the BCa interval
One round of calibration of the percentile interval: second-order accurate
Pr(θ ≤ θ'_low) = α/2 + O(n^(-1)), Pr(θ ≥ θ'_up) = α/2 + O(n^(-1))
One round of calibration of the BCa interval: third-order accurate
Pr(θ ≤ θ'_low) = α/2 + O(n^(-3/2)), Pr(θ ≥ θ'_up) = α/2 + O(n^(-3/2))

Computation of the bootstrap test statistic
1. Draw B samples of size n with replacement from x.
2. Evaluate the test statistic ϕ(·) on each sample: ϕ(x*b), b = 1, 2, ..., B.
3. Approximate the P-value by
P-value = #{ϕ(x*b) ≥ ϕ_obs}/B or P-value = #{ϕ(x*b) ≤ ϕ_obs}/B
where ϕ_obs = ϕ(x) is the observed value of the test statistic.
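A sketch of the three steps for a one-sample test of H0: mean = 0, with a t-like statistic. One assumption beyond the slide: the data are recentered before resampling so that H0 holds in the resampling world, a standard refinement of step 1.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(0.3, 1.0, size=40)              # illustrative data

def phi(sample):
    """t-like test statistic: mean / (sd / sqrt(n))."""
    return sample.mean() / (sample.std(ddof=1) / np.sqrt(sample.size))

phi_obs = phi(x)                               # observed value
x0 = x - x.mean()                              # impose H0: mean = 0 (assumption)

B = 2000
reps = np.array([phi(rng.choice(x0, size=x0.size, replace=True))
                 for _ in range(B)])
p_value = (reps >= phi_obs).mean()             # step 3, upper-tail version
print(p_value)
```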

Asymptotic refinement
Asymptotically normal test statistic ϕ: ϕ →_d N(0, σ^2)
ϕ ~ G_n(u, F), where G_n(u, F) is the exact cumulative distribution: G_n(u, F) = Pr(|ϕ| ≤ u | F)
G_n(u, F) → Φ(u) as n approaches infinity (assume σ = 1), where Φ(u) is the standard normal cumulative distribution

An asymptotic test is based on Φ(u):
Φ(u) - G_n(u, F) = O(n^(-1))
A bootstrap test is based on the bootstrap cumulative distribution G*_n(u):
G*_n(u) - G_n(u, F) = O(n^(-3/2))

Reality check for data snooping
Forecasting model: l_k; benchmark model: l_0; d_k = l_k - l_0
H0: max_{k=1,2,...,n} E(d_k) ≤ 0
Data: 1000 daily closing stock prices of UMC
Benchmark model: random walk with drift
ln P_t = a + ln P_{t-1} + ε_t

Forecasting model: ln P_t = a + Δln P_{t-1} + ε_t, where Δln P_t = ln P_t - ln P_{t-1}
V = (1/B) Σ_{b=1}^B d_1(b)
The statistic V exceeds the critical value given by the corresponding quantile of the bootstrap distribution for V.
The difference is significant, so reject H0: the forecasting model beats the random walk model.
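The comparison can be sketched with simulated loss differentials. Two hedges: a plain iid bootstrap of the differentials stands in for the stationary bootstrap of the full reality-check procedure, and all numbers (the d_k series, B) are made up rather than taken from the UMC data.

```python
import numpy as np

rng = np.random.default_rng(6)
d = rng.normal(0.02, 0.1, size=1000)           # simulated loss differentials d_k

V = d.mean()                                   # observed statistic

# Bootstrap distribution of V under H0: recenter the differentials,
# then resample iid (simplification of the stationary bootstrap)
B = 2000
d0 = d - d.mean()
V_boot = np.array([rng.choice(d0, size=d0.size, replace=True).mean()
                   for _ in range(B)])

crit = np.quantile(V_boot, 0.95)               # 95% bootstrap critical value
p_value = (V_boot >= V).mean()
print(V > crit, p_value)
```

Rejecting H0 here means the forecasting model's average loss advantage over the benchmark is too large to attribute to resampling noise.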

Inference when a nuisance parameter is not identified under the null hypothesis
Threshold autoregressive (TAR) model:
y_t = α_10 + α_11·y_{t-1} + ε_1t,  if y_{t-1} ≤ η
y_t = α_20 + α_21·y_{t-1} + ε_2t,  if y_{t-1} > η
η: threshold value
H0: the time series is linear
H1: the time series follows a TAR process

Data: monthly U.S. dollar/Swedish krona exchange rates from January 1974 to December 1998
The bootstrap P-value leads to rejection of H0: U.S. dollar/Swedish krona exchange rates follow a TAR process

Bootstrap percentile confidence interval

Thanks for listening