Factor Model Statistical Arbitrage


Factor Model Statistical Arbitrage
A standard model for the dynamics of a stock price is: [equation missing]
This model can be enhanced by expanding the noise term: [equation missing]
where the F terms are risk factors associated with the market.
In discrete time: [equation missing]
Assume that [conditions missing], and that F and the idiosyncratic noise term are independent.

Covariance of Log Returns
If we have n observations and p factors: [equation missing]
Or in matrix form: [equation missing]
Using: [equation missing]
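The formulas on this slide are not preserved. As a rough sketch of what a factor model implies for the covariance of returns, the snippet below assumes (this is an assumption, not something stated in the slides) unit-variance, mutually uncorrelated factors and idiosyncratic noise that is uncorrelated across stocks. Under those assumptions the covariance is the loadings matrix times its transpose plus a diagonal matrix of idiosyncratic variances. The names `loadings` and `idio_var` are illustrative.

```python
def implied_covariance(loadings, idio_var):
    """Covariance of returns implied by R_i = sum_j loadings[i][j] * F_j + eps_i.

    Assumes (hypothetically) unit-variance, mutually uncorrelated factors and
    idiosyncratic noise that is uncorrelated across stocks and with the factors.
    """
    n = len(loadings)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Common-factor part: dot product of the two stocks' loadings.
            cov[i][k] = sum(a * b for a, b in zip(loadings[i], loadings[k]))
        cov[i][i] += idio_var[i]   # idiosyncratic variance on the diagonal only
    return cov

# Two stocks, one factor (all numbers are made up for illustration).
B = [[1.0], [0.5]]
psi = [0.04, 0.09]
cov = implied_covariance(B, psi)
```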

Principal Component Analysis
Spectral decomposition of the covariance matrix: $\Sigma = \sum_{i} \lambda_i e_i e_i^{T}$, where $(\lambda_i, e_i)$ are the eigenvalue, eigenvector pairs.
Noise Reduction:
We can approximate the model with a limited set of m eigenvectors, or principal components.
Using the eigenvectors with the largest eigenvalues adds the components that contribute most to the variance in the data.
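The noise-reduction idea can be sketched in a few lines. The example below is a minimal pure-Python illustration, not the authors' implementation: it finds the m largest eigenpairs of a small symmetric positive semi-definite matrix by power iteration with deflation, then rebuilds the matrix from those pairs only. The function names and the toy matrix are assumptions; in practice a linear algebra library would be used.

```python
import math

def top_eigenpairs(matrix, m, iters=500):
    """Return the m largest (eigenvalue, eigenvector) pairs of a symmetric
    positive semi-definite matrix using power iteration with deflation."""
    n = len(matrix)
    a = [row[:] for row in matrix]            # work on a copy
    pairs = []
    for _ in range(m):
        v = [float(i + 1) for i in range(n)]  # arbitrary non-zero start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:                   # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                a[i][j] -= lam * v[i] * v[j]
    return pairs

def low_rank_approximation(pairs, n):
    """Rebuild an n x n matrix from the retained eigenpairs only."""
    approx = [[0.0] * n for _ in range(n)]
    for lam, v in pairs:
        for i in range(n):
            for j in range(n):
                approx[i][j] += lam * v[i] * v[j]
    return approx

cov = [[2.0, 1.0], [1.0, 2.0]]   # toy covariance with eigenvalues 3 and 1
pairs = top_eigenpairs(cov, m=1)
approx = low_rank_approximation(pairs, n=2)
```

Keeping only the first pair here reproduces the part of the toy matrix explained by its dominant direction; the discarded remainder is what the slide treats as noise.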

Stability of Principal Components
Comparison of the stability/evolution of the PCA:
A 30-day initial data sample was moved forward one day at a time.
The 10 largest eigenvectors were compared to those of the first sample using the dot product.
Two Subtle Problems:
1. The eigenvectors returned by PCA may point in the opposite direction (sign-flipped) relative to the first set.
2. Since the eigenvectors are given in descending order, a change in the relative magnitude of any components may swap their positions.
Therefore, comparisons must be made carefully.
Results:
Eigenvectors are relatively stable over time. After the first 10 eigenvectors they become more unstable.
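One way to handle the two problems listed above is to match vectors by the absolute value of the dot product and then fix the sign. The sketch below is an assumption about how such a comparison could be done, not the procedure used in the slides; it uses a simple greedy matching and expects at least as many current vectors as reference vectors.

```python
def align_eigenvectors(reference, current):
    """Match each reference eigenvector to the most similar current one,
    ignoring sign, then flip signs so that dot products are non-negative.

    Greedy matching on |dot product|; a simplification chosen for this sketch.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    unused = list(range(len(current)))
    aligned, similarity = [], []
    for ref in reference:
        # Pick the unused current vector closest in direction, up to sign.
        best = max(unused, key=lambda k: abs(dot(ref, current[k])))
        unused.remove(best)
        d = dot(ref, current[best])
        sign = -1.0 if d < 0 else 1.0
        aligned.append([sign * x for x in current[best]])
        similarity.append(abs(d))
    return aligned, similarity

# Toy example: the second window returns the same vectors, swapped and sign-flipped.
ref = [[1.0, 0.0], [0.0, 1.0]]
cur = [[0.0, -1.0], [-1.0, 0.0]]
aligned, similarity = align_eigenvectors(ref, cur)
```

A similarity close to 1 means the component is stable between windows; values well below 1 indicate that the direction has changed.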

Statistical Distance vs Time of Day
Mahalanobis Distance:
The distance of a data point from the center of the distribution, scaled by the distribution's covariance.
Procedure:
The training set was 100 days of 15-minute log return data.
The distance of the next 10 data points was calculated.
The training set was then shifted forward and the next 10 points measured.
The data was sorted by time of day to find the times of day that generated the most outliers.
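For reference, the Mahalanobis distance of a point x from a sample with mean m and covariance S is the square root of (x - m)' S^{-1} (x - m). The following pure-Python sketch computes it for a small sample; the function name and the toy data are illustrative, and the sample covariance is assumed to be non-singular.

```python
import math

def mahalanobis(point, sample):
    """Mahalanobis distance of `point` from the mean of `sample`
    (a list of equal-length observation vectors)."""
    n, d = len(sample), len(point)
    mean = [sum(row[j] for row in sample) / n for j in range(d)]
    # Sample covariance matrix (denominator n - 1).
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in sample) / (n - 1)
            for j in range(d)] for i in range(d)]
    diff = [point[j] - mean[j] for j in range(d)]
    # Solve cov * z = diff by Gaussian elimination with partial pivoting.
    aug = [cov[i][:] + [diff[i]] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, d))
        z[i] = (aug[i][d] - tail) / aug[i][i]
    return math.sqrt(sum(diff[i] * z[i] for i in range(d)))

training = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]  # toy training set
distance = mahalanobis([3.0, 1.0], training)
```

In the procedure described on the slide, `training` would be the 100-day window of 15-minute log returns and `point` each of the next 10 observations.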

Distance of New Test Data from the Training Data
Mahalanobis Distance
Conclusion: We can separate the market into two distinct time periods where the returns are generated by two different processes.

Generation of Residuals
Partial Least Squares:
If X is the data set and Y is the component to be regressed on the data, then PCA analyzes [expression missing] and PLS analyzes [expression missing].
PLS finds the matrix information associated with the first eigenvector, subtracts this information from the covariance matrix, then finds the information for the second eigenvector, and so on.
Procedure:
Test data: 100-day sample of 15-minute log returns on 500 stocks.
Predict the next 10 points of data using PLS with the 9 largest eigenvectors.
The test data window is then moved forward.
Results:
Measure of fit

PLS First 45 Minutes of Market Removed

PLS First 45 Minutes of the Market

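Calibrating OU Process: Problem Setup
Need to estimate κ, μ and σ in the OU process equation: $dX_t = \kappa(\mu - X_t)\,dt + \sigma\,dB_t$
The discrete form of the solution of the SDE can be written as: $X_{t+\Delta} = X_t e^{-\kappa\Delta} + \mu\,(1 - e^{-\kappa\Delta}) + \sigma\sqrt{\frac{1 - e^{-2\kappa\Delta}}{2\kappa}}\;\epsilon_t$, with $\epsilon_t \sim N(0, 1)$
κ: coefficient of mean reversion
∆: discretization time step
μ: long term mean of the residuals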

Calibrating OU Process: OLS and MLE
Least Squares:
Basic idea: Fit parameters by minimizing the sum of squared error terms.
Maximum Likelihood Estimation:
Basic idea: Find parameters by maximizing the log-likelihood of the data.
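A minimal sketch of the least-squares route, assuming the usual exact discretization of the OU process as an AR(1) regression x[t+1] = a + b x[t] + error, with b = exp(-κ∆), a = μ(1 - b) and error variance σ²(1 - b²)/(2κ). The function name is hypothetical, and the slides' own fitting code is not shown in the transcript.

```python
import math

def calibrate_ou_ols(x, dt):
    """Estimate (kappa, mu, sigma) of an OU process from equally spaced
    observations x by regressing x[t+1] on x[t]."""
    prev, nxt = x[:-1], x[1:]
    n = len(prev)
    mean_p, mean_n = sum(prev) / n, sum(nxt) / n
    sxx = sum((p - mean_p) ** 2 for p in prev)
    sxy = sum((p - mean_p) * (q - mean_n) for p, q in zip(prev, nxt))
    b = sxy / sxx                      # slope, equals exp(-kappa * dt)
    a = mean_n - b * mean_p            # intercept, equals mu * (1 - b)
    resid = [q - (a + b * p) for p, q in zip(prev, nxt)]
    resid_var = sum(r * r for r in resid) / (n - 2)
    kappa = -math.log(b) / dt          # only valid when 0 < b < 1
    mu = a / (1.0 - b)
    sigma = math.sqrt(resid_var * 2.0 * kappa / (1.0 - b * b))
    return kappa, mu, sigma

# Noise-free sanity check: a path that follows x[t+1] = 0.5 + 0.9 * x[t] exactly.
path = [0.0]
for _ in range(50):
    path.append(0.5 + 0.9 * path[-1])
kappa, mu, sigma = calibrate_ou_ols(path, dt=1.0)
```

With Gaussian errors, maximizing the conditional likelihood of this AR(1) model gives the same slope and intercept as least squares, which is consistent with the two methods producing similar results.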

Main Issue
OLS and MLE tend to produce similar results. However, MLE is known for overestimating the mean reversion speed κ.
Example: Johnson, Thomas. “Approximating Optimal Trading Strategies Under Parameter Uncertainty: A Monte Carlo Approach”. Kellogg Business School. 2009.
Main idea: MLE typically overestimates the mean reversion speed and, as a result, underestimates the noise σ.
The paper compares a filtering trading strategy to MLE. Filtering outperforms MLE every time.
Reason: Boguslavsky, Boguslavskaya. “Arbitrage Under Power”. February 2009.
The MLE model suggests overly aggressive positions that can quickly lead the trader to bankruptcy.

Kalman Filtering
Idea: a mathematical method that uses noisy measurements to produce estimates that tend to be closer to the true value of the variable of interest.
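To make the idea concrete, here is a minimal scalar Kalman filter. It assumes a random-walk state observed with additive noise, which is a deliberate simplification; the slides apply filtering to the OU parameter estimation problem, and that setup is not reproduced here. All names and numbers are illustrative.

```python
def kalman_filter_1d(measurements, x0, p0, process_var, meas_var):
    """Scalar Kalman filter for a random-walk state observed with noise.

    State:       x[t] = x[t-1] + w,  Var(w) = process_var
    Measurement: z[t] = x[t] + v,    Var(v) = meas_var
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var            # predict: uncertainty grows
        gain = p / (p + meas_var)      # Kalman gain
        x = x + gain * (z - x)         # update with the innovation
        p = (1.0 - gain) * p           # uncertainty shrinks after the update
        estimates.append(x)
    return estimates

est = kalman_filter_1d([1.0, 1.0, 1.0], x0=0.0, p0=1.0,
                       process_var=0.0, meas_var=1.0)
```

Each estimate is a weighted average of the previous estimate and the new measurement, with the weight set by the relative sizes of the state uncertainty and the measurement noise.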

Comparison of Estimation Methods
Parameter estimation by Kalman filtering produces more accurate estimates of the OU process parameters than either MLE or OLS.
Major disadvantage of the EM algorithm: it might take a long time to converge and is computationally intensive for large window sizes.
Solution: Use MLE/OLS to produce initial guesses, then use EM to refine the estimates.

Optimal Trading of the Residuals-1
Implement the Boguslavsky/Boguslavskaya strategy described in “Optimal Arbitrage Trading” (2003).
O-U process: [equation missing]
Conditional distribution: [equation missing]
Utility function: [equation missing]
Normalization process: [equation missing]
Let α be the control variable and W the wealth at time t: [equation missing]
Value function: [equation missing]

Optimal Trading of the Residuals-2
Solve for the optimal control parameter using the HJB equation: [equation missing]
Reduces to the PDE: [equation missing]
Solution: [equation missing]
Let τ be the time left for trading, [equation missing]
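The slide's formulas are not preserved, and the general solution is not reconstructed here. As a hedged illustration of the kind of result involved, the block below works out only the log-utility special case, assuming that γ = 0 denotes log utility (the usual power-utility convention), that the residual follows an OU process, and that wealth changes by the position α times the change in the residual. The notation (B for the Brownian motion, α* for the optimal position) is an assumption.

```latex
% Assumed dynamics (illustrative notation):
dX_t = \kappa(\mu - X_t)\,dt + \sigma\,dB_t, \qquad dW_t = \alpha_t\,dX_t

% Ito's lemma applied to log wealth:
d\log W_t = \frac{\alpha_t}{W_t}\,\kappa(\mu - X_t)\,dt
          - \frac{1}{2}\,\frac{\alpha_t^{2}}{W_t^{2}}\,\sigma^{2}\,dt
          + \frac{\alpha_t}{W_t}\,\sigma\,dB_t

% Maximising the drift term over \alpha_t gives the myopic position:
\alpha_t^{*} = W_t\,\frac{\kappa(\mu - X_t)}{\sigma^{2}}
```

In this special case the position is proportional to current wealth and to the distance of the residual from its mean, and it does not depend on the time left for trading τ.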

Results on EvA residuals
∆ ~ 1 min, γ = -0.5, initial wealth = 100,000
Figure panels: Cumulative Wealth; Optimal Trading Position
Peak ~ 4,300,000
End ~ 3,700,000

Results on Our residuals using EvA’s data-XOM
∆ ~ 15 min, initial wealth = 100,000
Cumulative Wealth, γ = 0: Peak ~ 530,000, End ~ 490,000
Cumulative Wealth, γ = -0.5: Peak ~ 520,000, End ~ 450,000

Incorporating TC-Separate Fund Allocation
All wealth curves will lie between the red and green curves.
Blue curve (no fixed cost): peak = 530,000, end = 490,000
Green curve (10 × fixed cost): peak = 470,000, end = 420,000
Red curve: 1 × fixed cost

Trading Residuals in Practice
Look at historical 15-minute data for ~500 stocks using a 100-day sliding window.
For every stock i at time t:
  Generate a partial least squares representation with 10 components, using the returns of the remaining 499 stocks over the last 100 days (sliding window).
  Generate a residual return by removing the PLS approximation from the stock return.
  Generate the residual-replicating portfolio weights Pi = [-β1 -β2 … -βi-1 1 -βi+1 … -βn]
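The weight vector Pi can be built directly from the regression coefficients. The sketch below assumes the PLS fit has already been expressed as one coefficient per stock (`betas`), which the slides do not spell out; the function name and numbers are illustrative.

```python
def residual_portfolio_weights(i, betas):
    """Weights of the portfolio replicating stock i's residual:
    +1 in stock i and -beta_j in every other stock j.

    `betas[j]` is the coefficient of stock i's return on stock j;
    the entry at index i is ignored.
    """
    return [1.0 if j == i else -b for j, b in enumerate(betas)]

weights = residual_portfolio_weights(1, [0.2, 0.0, 0.5])
# weights == [-0.2, 1.0, -0.5]
```

Holding these weights is long the stock and short its fitted approximation, so the portfolio return equals the residual return.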

Available Data at Time t
Stock returns vector R(t)
Residual returns vector Rresidue(t)
Residual means vector μresidue(t)
Residual standard deviations vector σresidue(t)
Residual replication matrix P(t), where Pij(t) is the weight of the jth stock in the portfolio replicating the ith residual
If we have a residual positions vector V(t), the final investment portfolio will be V(t)P(t)
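The product V(t)P(t) is a row vector times a matrix: each residual position is spread over the stocks in its replicating portfolio, and overlapping stock positions net out. A small sketch with made-up numbers (the function name is hypothetical):

```python
def final_portfolio(v, p):
    """Row vector v (residual positions) times matrix p (replication weights).

    p[i][j] is the weight of stock j in the portfolio replicating residual i,
    so result[j] is the net position in stock j.
    """
    n_stocks = len(p[0])
    return [sum(v[i] * p[i][j] for i in range(len(v))) for j in range(n_stocks)]

P = [[1.0, -0.5],
     [-0.25, 1.0]]
V = [100.0, -40.0]      # long residual 0, short residual 1 (made-up sizes)
holdings = final_portfolio(V, P)
```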

The Trading Strategy
Evaluate the market every 15 minutes to look for strong deviations of residuals from their means.
Enter positions that exceed an entry threshold.
Leave positions that cross the exit threshold.
Allocate money in a defined percentage, equally across all open opportunities, subject to a minimum cash position percentage.
The dynamic rebalancing of the portfolio is based on the log-optimal portfolio growth strategy of volatility pumping.
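The entry and exit rules can be sketched as a small state machine on the standardized residual. This is an illustration under assumptions, not the author's code: the thresholds are hypothetical, expressed in standard deviations, and symmetric for long and short positions, whereas the slides use separate long and short thresholds. Money allocation and rebalancing are left out.

```python
def update_position(position, residual, mean, std, enter=2.0, exit_=0.5):
    """Return the new position (-1 short, 0 flat, +1 long) for one residual.

    `enter` and `exit_` are hypothetical thresholds in standard deviations.
    """
    z = (residual - mean) / std
    if position == 0:
        if z <= -enter:
            return 1       # residual far below its mean: go long
        if z >= enter:
            return -1      # residual far above its mean: go short
        return 0
    if position == 1 and z >= -exit_:
        return 0           # long position has reverted enough: close
    if position == -1 and z <= exit_:
        return 0           # short position has reverted enough: close
    return position

path = [0.1, -2.3, -1.0, -0.4, 2.5, 0.2]   # made-up residuals (mean 0, std 1)
pos, history = 0, []
for r in path:
    pos = update_position(pos, r, mean=0.0, std=1.0)
    history.append(pos)
```

Using an exit threshold closer to the mean than the entry threshold avoids opening and closing the same position on small fluctuations.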

The Secret Sauce: Trading Parameters
Long enter threshold, short enter threshold
Long exit threshold, short exit threshold
Minimum cash percentage
Maximum single position percentage
The trading algorithm is robust to the trading parameters (at least as far as I tested!).
Divided the data set into a training period and a testing period, used the MATLAB Optimization Toolbox to find the parameters that maximize the Sharpe ratio on the training period, and applied the resulting parameters to the testing period.
This procedure can be applied continuously to periodically recalibrate the trading parameters.
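The train-then-test calibration can be outlined as follows. This sketch swaps the MATLAB optimizer mentioned above for a plain search over a list of candidates, and it assumes a user-supplied `backtest(params, data)` function that returns per-period returns; both choices, and all names, are illustrative rather than taken from the slides.

```python
import math

def sharpe_ratio(returns):
    """Mean over sample standard deviation of per-period returns
    (no risk-free rate, no annualisation)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var)

def pick_parameters(candidates, backtest, train_data):
    """Return the candidate whose training-period returns have the highest
    Sharpe ratio. `backtest(params, data)` is a hypothetical user-supplied
    function that returns a list of per-period returns."""
    return max(candidates, key=lambda p: sharpe_ratio(backtest(p, train_data)))
```

The chosen parameters would then be passed to the same backtest on the testing period, and the whole step repeated as the window moves forward.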