Statistical analysis of high-frequency financial data and modeling of financial time series
Baosheng Yuan and Kan Chen
Department of Computational Science, FoS, National University of Singapore
11/30/2018
Outline
  Introduction
  Financial assets and stochastic process
  Statistical tools
  Data analysis
  Non-Gaussian stochastic model
  Analysis of simulation data
  Model test: currency option pricing
  Summary
Introduction (1)
  What is the focus of this talk
    Understand stock price dynamics: data analysis
    Simulate the stock price process: modeling
    Option pricing: application of model simulation
  General properties of stock prices
    Future return is unknown/uncertain
    No known process can describe it
    Larger volatility than a Gaussian distribution
    Return/volatility is clustered, i.e. temporally correlated
Introduction (2)
  Widely used approaches
    Data analysis: return probability distribution
    Modeling: assume an independent stochastic process
    Option pricing: patch-to-match methodology, i.e. match the Black-Scholes option price to the market price by adjusting the implied volatility
  Our approach
    Data analysis: conditional return analysis, etc.
    Modeling: assume prices are correlated
    Option pricing: simulate the underlying stock process directly and evaluate the performance by realized profit
Financial assets & stochastic process
  Riskless: $dS_1(t) = r(t)\,S_1(t)\,dt$
  Risky: $dS_2(t) = \mu(S_2,t)\,dt + \nu(S_2,t)\,dW$
    where dW is an unknown stochastic process; different assumptions on dW lead to different models
    Itô process: when {W_t : t ≥ 0} is assumed to be a Brownian motion
  Return of a financial asset
    Discounted relative price change: R(t) = [S(t) - S(t-Δt)] D(t) / S(t-Δt), where D(t) is a discounting factor, e.g. interest rate
    Logarithmic return: X(t) = ln S(t) - ln S(t-Δt)
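The two return definitions on this slide can be computed from a price series as in the minimal sketch below. The function names and the use of a constant discounting factor are assumptions made for illustration; the slides do not specify how D(t) is obtained.

```python
import math

def log_returns(prices, step=1):
    """Logarithmic returns X(t) = ln S(t) - ln S(t - step)."""
    return [math.log(prices[i]) - math.log(prices[i - step])
            for i in range(step, len(prices))]

def discounted_relative_returns(prices, discount, step=1):
    """R(t) = [S(t) - S(t - step)] * D / S(t - step).

    A constant discounting factor D is assumed here for simplicity.
    """
    return [(prices[i] - prices[i - step]) * discount / prices[i - step]
            for i in range(step, len(prices))]
```

Changing `step` corresponds to changing the time scale Δt when the prices are sampled at a fixed interval.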
Analysis Tools
  Return density estimators
    Histogram
    Kernel estimators
  Statistics: moments of different orders
    Mean, variance, skewness, kurtosis
  Hurst function
    $H(\Delta t) = E\left[\max_{t<s<t+\Delta t} X(s) - \min_{t<s<t+\Delta t} X(s)\right]$
    where E[.] is a mathematical expectation
  Conditional return
Data analysis - Outline
  Global behavior of stock return
  Short-term price trend analysis
    Trend number analysis
    Clustered return analysis
  Conditional return analysis
    Return conditioned on previous return
    Threshold based clustered return
  Hurst exponent analysis
Data analysis (1) Global behavior of return
  Kurtosis: $E[(X-\mu)^4] / \mathrm{Var}[X]^2$
  Skewness: $E[(X-\mu)^3] / \mathrm{Var}[X]^{3/2}$
  Variance: $\mathrm{Var}[X] = E[(X-\mu)^2]$
  Mean: $\mu = E[X]$
  where E[.] is the mathematical expectation
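As a minimal sketch, the four moments can be estimated from a sample of returns as follows. It assumes the standard normalized definitions (skewness as the third central moment over the variance to the power 3/2, kurtosis as the fourth central moment over the squared variance, so that a Gaussian has kurtosis 3) and population rather than bias-corrected estimators; the function name is illustrative.

```python
import math

def moments(xs):
    """Return (mean, variance, skewness, kurtosis) of a sample.

    Kurtosis is not excess kurtosis: a Gaussian sample gives about 3.
    The sample must not be constant (the variance must be positive).
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    sd = math.sqrt(var)
    skew = sum((x - mean) ** 3 for x in xs) / n / sd ** 3
    kurt = sum((x - mean) ** 4 for x in xs) / n / sd ** 4
    return mean, var, skew, kurt
```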
(Statistical) moments of HSI
  Δt          Mean   S.D.     Skewness   Kurtosis
  Gaussian                               ~ 3.0
  2 mins             0.0008   -0.2769    25.6644
  10 mins            0.0023   -0.3849    15.6636
  30 mins            0.0042   -0.4092     8.7850
  60 mins            0.0059   -0.3018     7.3655
  120 mins           0.0086   -0.3858     6.5963
  240 mins           0.0126   -0.5251     5.9744
Data analysis (2) Global behavior of return
  Mean: not a discriminating measure
  Variance ≳ Δt (Gaussian: Var[X] ~ Δt): larger volatility than Gaussian
  Skewness << 0 (Gaussian: symmetric): fatter negative tails (asymmetric)
  Kurtosis >> 3.0 (Gaussian): extreme events have higher probability than under a Gaussian
  => The price process is highly non-Gaussian
Data analysis (3) Short-term price trend analysis
  Clustered return distribution
    Definition: cumulative return over a sequence of returns of the same sign
  Clustered return over trend number
    Trend number: number of time steps in a sequence in which all returns have the same sign
    Average return per time step: clustered return divided by the trend number
  Dependence of clustered return on trend number
    Gaussian: independent
    Real high-frequency data: ?
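The definitions of trend number and clustered return can be illustrated with the sketch below, which splits a return series into runs of equal sign. How zero returns are treated is not stated on the slide; here they are assumed to end the current run and are skipped. The function name is hypothetical.

```python
def clustered_returns(returns):
    """Split returns into runs of equal sign.

    Returns a list of (trend_number, clustered_return) pairs, where
    trend_number is the run length and clustered_return is the sum of
    the returns in the run. Zero returns end the current run and are
    skipped (an assumption, not taken from the slides).
    """
    clusters = []
    run_len, run_sum, run_sign = 0, 0.0, 0
    for x in returns:
        sign = (x > 0) - (x < 0)
        if sign != 0 and sign == run_sign:
            # Same sign as the current run: extend it.
            run_len += 1
            run_sum += x
            continue
        if run_len > 0:
            clusters.append((run_len, run_sum))
        if sign == 0:
            run_len, run_sum, run_sign = 0, 0.0, 0
        else:
            run_len, run_sum, run_sign = 1, x, sign
    if run_len > 0:
        clusters.append((run_len, run_sum))
    return clusters
```

The average return per time step of a cluster is then `clustered_return / trend_number`, which can be grouped by trend number to study the dependence asked about on the slide.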
Data analysis (4) Conditional return analysis
  Return conditioned on previous return
    Definition: return distribution conditioned on the absolute return in the previous period,
      $P\big(X(t) \mid X_1 \le |X(t-\Delta t)| < X_2\big)$
    where X(t) = ln S(t) - ln S(t-Δt), and X_1, X_2 are the left and right boundaries of the previous absolute return
    Plot of distribution
Data analysis (5) Conditional return analysis
  Conditional return distribution
    Highly non-Gaussian for all time steps
    S.D. of conditional return ~ absolute return in the previous period
    Conditional return distributions collapse onto a universal curve
    Conditional returns with different time steps also collapse onto a universal curve: a time-scale-free feature
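One way to check the stated proportionality between the S.D. of the conditional return and the previous absolute return is sketched below. The half-open bin [x1, x2) and the function name are assumptions; the slides only say that the previous absolute return lies between a left and a right boundary.

```python
import math

def conditional_sd(returns, x1, x2):
    """S.D. of X(t) given x1 <= |X(t-1)| < x2.

    Returns None when fewer than two returns satisfy the condition.
    """
    sample = [returns[i] for i in range(1, len(returns))
              if x1 <= abs(returns[i - 1]) < x2]
    if len(sample) < 2:
        return None
    mean = sum(sample) / len(sample)
    return math.sqrt(sum((x - mean) ** 2 for x in sample) / len(sample))
```

Evaluating this over a sequence of bins and plotting the result against the bin centres would show whether the relation is approximately linear for a given data set.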
Data analysis (6) Conditional return analysis
  Threshold based clustered return
    Definition: the length of a price swing from a regional bottom to a regional top. The top and the bottom are defined such that any trend reversal between the regional bottom and top is less than R0
    Distribution:
Data analysis (6) Conditional return analysis
  Threshold based clustered return
    Real data: power-law tail with decay exponent ~ 2.0
    Random walk: decay exponent ~ 2.7
  => Real financial data are highly non-Gaussian and temporally correlated
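The threshold-based clustered return can be extracted with a simple swing detector, sketched below under stated assumptions: the input is a (log) price path, a regional top or bottom is confirmed once the path has moved against it by at least the threshold, and the last unconfirmed swing is dropped. The function name and these conventions are illustrative and not taken from the slides.

```python
def threshold_swings(x, r0):
    """Signed sizes of swings between successive regional extremes.

    A regional top (bottom) is confirmed once the series has fallen
    (risen) by at least r0 from it, so reversals smaller than r0 inside
    a swing are ignored. Positive values are bottom-to-top swings.
    """
    swings = []
    if not x:
        return swings
    hi = lo = x[0]      # running extremes before the first swing is known
    direction = 0       # +1 up-swing, -1 down-swing, 0 undetermined
    start = extreme = x[0]
    for v in x[1:]:
        if direction == 0:
            hi, lo = max(hi, v), min(lo, v)
            if v - lo >= r0:
                direction, start, extreme = 1, lo, v
            elif hi - v >= r0:
                direction, start, extreme = -1, hi, v
        elif direction == 1:
            if v > extreme:
                extreme = v
            elif extreme - v >= r0:
                # Top confirmed: record the up-swing, start a down-swing.
                swings.append(extreme - start)
                start, extreme, direction = extreme, v, -1
        else:
            if v < extreme:
                extreme = v
            elif v - extreme >= r0:
                # Bottom confirmed: record the down-swing, start an up-swing.
                swings.append(extreme - start)
                start, extreme, direction = extreme, v, 1
    return swings
```

A histogram of the absolute swing sizes on log-log axes is one way to examine the power-law tail mentioned above.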
Data analysis (7) Hurst exponent analysis
  Hurst function
    $H(\Delta t) = E\left[\max_{t<s<t+\Delta t} X(s) - \min_{t<s<t+\Delta t} X(s)\right]$
    where E[.] is a mathematical expectation
Data analysis (7) Hurst exponent analysis
  Gaussian: Hurst exponent ~ 0.5
  Real data: Hurst exponent ~ 0.6 for small time scales, approaching 0.5 for larger time scales
  => Volatility correlation is time-scale dependent: the shorter the time scale, the stronger the correlation
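A minimal sketch of how the Hurst function and exponent might be estimated from a path is given below. It assumes that X is a (log) price path, that the expectation is replaced by an average over non-overlapping windows with inclusive end points, and that the exponent is the slope of ln H(Δt) against ln Δt; none of these choices is stated on the slides.

```python
import math

def hurst_function(x, dt):
    """Average of max(X) - min(X) over non-overlapping windows of dt steps.

    Requires len(x) > dt so that at least one window exists.
    """
    ranges = []
    for start in range(0, len(x) - dt, dt):
        window = x[start:start + dt + 1]   # dt steps span dt + 1 points
        ranges.append(max(window) - min(window))
    return sum(ranges) / len(ranges)

def hurst_exponent(x, dts):
    """Least-squares slope of ln H(dt) against ln dt."""
    lx = [math.log(dt) for dt in dts]
    ly = [math.log(hurst_function(x, dt)) for dt in dts]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den
```

Restricting `dts` to short or long windows gives a scale-dependent estimate, which is how a statement such as "about 0.6 at small time scales" could be checked.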
Data analysis - summary
  Highly non-Gaussian:
    Variance ≳ Δt: larger volatility than Gaussian
    Skewness << 0 (Gaussian: symmetric): fatter negative tails (asymmetric)
    Kurtosis >> 3.0 (Gaussian): extreme events have higher probability than under a Gaussian
  Highly temporally correlated
    S.D. of conditional return ~ absolute return in the previous period: the larger the volatility now, the larger the volatility next
    Volatility correlation decreases as the time scale increases
Non-Gaussian stochastic model (1)
  Assumption: returns are correlated
  Model dynamics: $X(t+1) = X(t) + r(t) \pm \delta(t)$
    where r(t) is the intrinsic growth rate at time t and $\delta(t) = \delta_0 \gamma^{n(t)}$ is the magnitude of the price (return) change due to the trend
    $n(t+1) = n(t) + 1$ if the price moves with the trend
    $n(t+1) = n(t) - 1$ if the price reverses the trend
Non-Gaussian stochastic model (2)
  Interest rate: applying a risk-neutral measure in a binary tree, we have
    $r(t) = r_0 - \ln\{P(t)\,e^{\delta_t} + (1-P(t))\,e^{-\delta_r}\}$ if $X(t) - X(t-1) > 0$
    $r(t) = r_0 - \ln\{P(t)\,e^{-\delta_t} + (1-P(t))\,e^{\delta_r}\}$ otherwise
    $\delta_t = \delta_0 \gamma^{n(t)+1}$; $\delta_r = \delta_0 \gamma^{n(t)-1}$   (1)
    where P(t) is the probability that the trend continues, and the subscripts t and r stand for trend and reversal
  Volatility reversal:
    $P(t) = P_0 - \alpha\,(n(t)-n_0)/(n(t)+n_0)$ if $n(t) > n_0$, and $P(t) = P_0$ otherwise; $P_0 = 1/(1+\gamma^2)$   (2)
    where γ is a volatility basis, r_0 is the risk-free interest rate, α is a constant factor that keeps the volatility from overshooting, and n_0 is the mean of the trend number n(t)
Non-Gaussian stochastic model (3)
  Return mean-reversal:
    $\delta_x(t) = \beta\,(S(t)-S_0)/(S(t)+S_0)$   (3)
    where β is a constant factor for adjusting the magnitude of the return, S(t) is the stock price at time t, and S_0 is the "mean" of the price
  Model with volatility-/mean-reversal features:
    $X(t+1) = X(t) + r(t) + \delta(t)$   (4)
    where δ(t) is determined by:
      $+\delta_0 (1-\delta_x(t))\,\gamma^{n(t)+1}$ if $X(t) - X(t-1) \ge 0$ and $R(t) < P(t)$
      $-\delta_0 (1+\delta_x(t))\,\gamma^{n(t)-1}$ if $X(t) - X(t-1) \ge 0$ and $R(t) \ge P(t)$
      $-\delta_0 (1+\delta_x(t))\,\gamma^{n(t)+1}$ if $X(t) - X(t-1) < 0$ and $R(t) < P(t)$
      $+\delta_0 (1-\delta_x(t))\,\gamma^{n(t)-1}$ if $X(t) - X(t-1) < 0$ and $R(t) \ge P(t)$
    where R(t) ∈ [0,1) is a random number drawn at each step
  The model is fully described by Eq. (1)-(4).
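Eq. (1)-(4) can be turned into a simulation roughly as sketched below. Several points are assumptions rather than statements from the slides: X is treated as the log price (so S = exp(X)), R(t) is drawn uniformly from [0, 1), the move before the first step is taken to be upward, and n(t) is left unbounded. All function and parameter names are illustrative.

```python
import math
import random

def trend_probability(n, n0, alpha, gamma):
    """Eq. (2): probability that the current trend continues."""
    p0 = 1.0 / (1.0 + gamma ** 2)
    if n > n0:
        return p0 - alpha * (n - n0) / (n + n0)
    return p0

def step(x, n, up, u, s0, n0, delta0, gamma, alpha, beta, r0):
    """One update of Eq. (4).

    x is the log price, n the trend number, up the direction of the
    previous move, and u a uniform draw in [0, 1). Returns the new
    (x, n, up).
    """
    p = trend_probability(n, n0, alpha, gamma)
    d_trend = delta0 * gamma ** (n + 1)
    d_rev = delta0 * gamma ** (n - 1)
    sign = 1.0 if up else -1.0
    # Eq. (1): drift that offsets the expected exponential move.
    r = r0 - math.log(p * math.exp(sign * d_trend)
                      + (1.0 - p) * math.exp(-sign * d_rev))
    s = math.exp(x)
    dx = beta * (s - s0) / (s + s0)          # Eq. (3)
    if u < p:                                # trend continues
        new_up, new_n, size = up, n + 1, d_trend
    else:                                    # trend reverses
        new_up, new_n, size = not up, n - 1, d_rev
    # Upward moves are damped and downward moves amplified when S > S0.
    delta = size * (1.0 - dx) if new_up else -size * (1.0 + dx)
    return x + r + delta, new_n, new_up

def simulate(steps, x0, n_init, rng, **params):
    """Simulate a log-price path; the initial direction is assumed upward."""
    x, n, up = x0, n_init, True
    path = [x]
    for _ in range(steps):
        x, n, up = step(x, n, up, rng.random(), **params)
        path.append(x)
    return path
```

For example, `simulate(1000, 0.0, 2, random.Random(1), s0=1.0, n0=2, delta0=0.01, gamma=1.1, alpha=0.1, beta=0.05, r0=0.0)` returns a path of 1001 log prices; the parameter values here are arbitrary and not calibrated to any data.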
Analysis of simulation data
  Simulated stock price
  Global return distribution
  Conditional return distribution
Model test: currency option pricing (1)
  European options
    $C(K,T) = E_Q[e^{-(r-r_f)T} (S_T - K)^+]$
    $P(K,T) = E_Q[e^{-(r-r_f)T} (K - S_T)^+]$
    where K is the strike price; S_T is the stock price at maturity T; r and r_f are the interest rates for the domestic and foreign currencies respectively; E_Q[.] is an expectation under the risk-neutral measure
  The parameters:
    Initial trend number n(0): estimated by simulation
    Mean trend number n_0: estimated by simulation
    α, β and γ (> 1) are determined by experiment
    δ_0 is proportional to the volatility of the return to be simulated
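Given a set of simulated prices at maturity, the two expectations can be approximated by sample averages, as in the following sketch. The discount factor is taken exactly as written on the slide, exp(-(r - r_f) T); the function name and the plain Monte Carlo average are assumptions.

```python
import math

def european_prices(terminal_prices, strike, r, rf, maturity):
    """Monte Carlo estimates of C(K,T) and P(K,T) from simulated S_T values."""
    discount = math.exp(-(r - rf) * maturity)
    m = len(terminal_prices)
    call = discount * sum(max(s - strike, 0.0) for s in terminal_prices) / m
    put = discount * sum(max(strike - s, 0.0) for s in terminal_prices) / m
    return call, put
```

The terminal prices would come from repeated simulations of the price process under the risk-neutral drift, each run to maturity T.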
Model test: currency option pricing (2)
  Trading strategy: buy the option only if the simulated price is higher than the market price
  Currency option pricing results (profit/price):
    Option          Our model   BLS model
    GBP Call+Put    0.5872      -0.2069
    SWF Call+Put    0.1543      -0.1745
    DMK Call        0.5499      -0.0110
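The trading rule can be expressed as a short evaluation loop. The sketch below assumes that one unit of each option is bought at the market price whenever the model price exceeds it, that the profit is the realized payoff minus the price paid, and that profit/price is the total profit divided by the total price paid; the slides do not define the ratio, so this reading and the function name are assumptions.

```python
def profit_over_price(quotes):
    """Realized profit per unit of premium paid.

    quotes is an iterable of (model_price, market_price, realized_payoff).
    Returns None if the rule never triggers a purchase.
    """
    paid = 0.0
    profit = 0.0
    for model_price, market_price, payoff in quotes:
        if model_price > market_price:   # buy only when the model price is higher
            paid += market_price
            profit += payoff - market_price
    return profit / paid if paid > 0 else None
```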
Summary (1)
  Financial time series:
    Characterized by "transient" and "recurrent" dynamics according to the Hurst exponent
    Returns are more volatile than Gaussian
      Variance > Δt (small time scales): more volatile
      Kurtosis >> 3 (the Gaussian value): extreme returns have larger probability than in a Gaussian process
      Skewness < 0 (Gaussian: 0): large positive returns move towards Gaussian faster than large negative returns do
    Returns are temporally correlated
      Future volatility is proportional to the current one (universal curve of conditional return distributions)
      Correlation depends on time scale: the shorter the time scale, the stronger the correlation
Summary (2)
  Unique features of our model
    Assumes a non-Gaussian and correlated stochastic process
    Incorporates short-time price trend and long-time mean-reversal
    Price dynamics are easy to simulate, and the model structure is simple and intuitive
    Captures important statistics observed in real data
    Can be used directly in option pricing without using the implied volatility approach
Thank You!