© APT 2006 ICA And Hedge Fund Returns Dr. Andrew Robinson APT Program Trading Techniques and Financial Models for Hedge Funds June 27th, 2007

© APT 2006 Overview of Talk
- Preliminaries – Why Use a Factor Model?
- Model Specification and Its Consequences
- ICA vs. PCA
- Simulation-Based Tests
- Sample Analysis
- Conclusions

© APT 2006 Factor Models: Motivation
- In practice, factor models are employed across all asset classes. Why?
- Fundamental issue: lack of sufficient and reliable data – 'the curse of dimensionality'.
- Other motivations:
  - Increased flexibility in scenario analysis / risk attribution
  - Performance analysis
  - Stress testing
  - Computational efficiency
  - Parsimonious framework for advanced analysis
  - Derivative pricing
  - Risk

© APT 2006 Long-short optimization on TOPIX. Advantages of factor models for VaR:
- Short-history securities
- Best-of-breed stress testing
- Superior correlation forecasts, and…

What's wrong with Historical VaR? Predicted 99% VaR vs. realized violations:

            99% VaR (Historical)   99% VaR (APT)   % of violations (Historical VaR)   % of violations (APT VaR)
            …                      -0.58%          6.15%                              0.00%
            …                      -0.76%          20.00%                             0.00%
            …                      -0.85%          27.69%                             0.00%
            …                      -0.51%          9.23%                              0.00%
            …                      -1.51%          43.94%                             0.00%
AVERAGE     -0.16%                 -0.84%          21.40%                             0.00%

© APT 2006 Theoretical Interpretation. Consider the standard expression for the efficient frontier (working in units of unit variance for simplicity), and then the same expression written in terms of the eigenvectors of the covariance matrix.
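As a hedged sketch of the standard expressions the slide refers to (assuming mean-variance optimal weights w for expected excess returns μ and covariance Σ):

    w^{*} \propto \Sigma^{-1}\mu, \qquad
    \Sigma = E\Lambda E^{\top} \;\Rightarrow\;
    w^{*} \propto E\Lambda^{-1}E^{\top}\mu = \sum_{k}\frac{e_{k}^{\top}\mu}{\lambda_{k}}\,e_{k}, \qquad
    \mu^{\top}\Sigma^{-1}\mu = \sum_{k}\frac{(e_{k}^{\top}\mu)^{2}}{\lambda_{k}}

In the eigenvector basis the frontier decomposes across uncorrelated directions, with low-variance eigenvectors carrying non-zero expected return dominating the optimal portfolio – one reason why getting the statistical factors right matters.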

© APT 2006 Model Specification
- Consider the following returns-generating process (a hedged reconstruction is sketched below).
- Suppose that, instead of observing the true latent variables, we observe another variable – a noisy proxy.
- The effect on the estimated loadings depends on the assumptions made for the dynamics of the error term, but in any case:
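The slide's own equations are not preserved; a hedged reconstruction of the generic setup (the symbols r_t, B, f_t, u_t are illustrative, not necessarily the presenter's notation):

    r_t = \alpha + B f_t + \varepsilon_t, \qquad \tilde f_t = f_t + u_t

Regressing r_t on the observed proxy \tilde f_t rather than on the true f_t yields loadings \tilde B \neq B in general; the size and direction of the distortion depend on the joint dynamics of u_t with f_t and \varepsilon_t.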

© APT 2006 The Effects of Measurement Error
- Suppose (in general things will be more complicated) that the measurement error is uncorrelated with the true latent variables.
- Consider the asymptotic estimates of the loadings; in this case, one can show that:
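A hedged sketch of the standard attenuation result for a single factor with variance \sigma_f^2 and independent measurement error with variance \sigma_u^2:

    \operatorname{plim}\,\hat\beta \;=\; \beta\,\frac{\sigma_f^2}{\sigma_f^2 + \sigma_u^2}

i.e. the estimated loading is biased towards zero, and the risk attributed to the factor is understated accordingly.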

© APT 2006 The Effect of Specification Error. If a proxy is employed or a variable is dropped from a multivariate regression, all betas will be affected. There is a close relationship between omitted-variables bias and measurement error: compare, for simplicity, the vectors of biases in the cases of a single dropped and a single noisy regressor.
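The definition following "where" is lost; one standard way to write the comparison, which may differ from the slide's notation, is as follows. With retained regressors X, problem regressor x_k with true coefficient \beta_k, \delta the coefficient vector from projecting x_k on X, \sigma_*^2 the residual variance of that projection, and measurement-error variance \sigma_u^2:

    \text{omitted: } \operatorname{plim}(\hat\beta_X - \beta_X) = \delta\,\beta_k,
    \qquad
    \text{noisy: } \operatorname{plim}(\hat\beta_X - \beta_X) = \delta\,\beta_k\,\frac{\sigma_u^2}{\sigma_*^2 + \sigma_u^2}

A noisy regressor therefore behaves like a partially omitted one: the bias interpolates between zero (no noise) and the full omitted-variables bias (noise swamping the signal).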

© APT 2006 Extracting "True" Drivers: The Arbitrage Pricing Theory. But what are the "true" drivers of security market returns? Traditional statistical models employ techniques based on, for example, Principal Components Analysis or Factor Analysis. In this case, the latent variables have no clear interpretation. Can we interpret them?

© APT 2006 Risk Attribution
- We have argued that there are some advantages to employing a purely statistical framework, including the ability to avoid specification error and flexibility for the incorporation of new asset classes / data.
- But what about interpretation? Is there a way to express the results in terms of "real-world" variables?
- There are many methodologies; here we mention three:
  - the standard "position-based" approach (and refinements) based on marginal risks,
  - transformation methods based on time-series data,
  - transformation methods based on fundamental data.

© APT 2006 Coordinate transformations

© APT 2006 Systematic Risk Breakdown

© APT 2006 Ambiguities in Principal Components? The eigenvalue decomposition of the covariance matrix gives the "whitening" transformation, but under any orthogonal transformation we have:
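In symbols (a hedged reconstruction of the slide's formulas):

    \Sigma = E\Lambda E^{\top}, \qquad z = \Lambda^{-1/2}E^{\top}x \;\Rightarrow\; \operatorname{Cov}(z) = I,
    \qquad
    \operatorname{Cov}(Uz) = U I U^{\top} = I \;\text{ for any orthogonal } U

Second moments therefore determine the components only up to an arbitrary rotation – PCA and whitening alone cannot resolve this ambiguity.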

© APT 2006 The “Cocktail Party” Problem

© APT 2006 Blind Source Separation Can we extract signals based only on the fact that we know they have certain properties – fat tails (kurtosis, higher moments), autocorrelation? Yes: Principal Components Analysis, Independent Components Analysis, Second-Order Blind Identification. Back and Weigend (1997) were the first to apply ICA to financial data, as a useful adjunct to PCA. Recent analyses (e.g. Korizis et al. (2007)) have begun to reconsider the application of modified, non-independent factorisation algorithms that take time-series dynamics into account.

© APT 2006 Independent Component Analysis A technique employed for tackling the so-called 'cocktail party problem': we search for a set of latent variables for which the joint probability distribution 'factorises'. Alternatively, we could formulate this in terms of the moments of the latent variables…
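In symbols, the independence requirement (a minimal restatement, not the slide's own equations):

    p(s_1,\dots,s_n) = \prod_{i=1}^{n} p_i(s_i),
    \qquad
    E[g(s_i)\,h(s_j)] = E[g(s_i)]\,E[h(s_j)] \;\text{ for all } i \neq j \text{ and suitable } g, h

which is far stronger than mere decorrelation, E[s_i s_j] = E[s_i]\,E[s_j].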

© APT 2006 ICA By Non-Gaussianity Can we employ the central limit theorem to our advantage? Consider sums of independent GARCH(1,1) processes: in the slide's example, series with kurtosis κ = 11.3 and κ = 13.8 sum to a series with kurtosis κ = 6.7 – mixing pushes the result towards Gaussianity, so maximising non-Gaussianity should help undo the mixing.

© APT 2006 ICs via Kurtosis? First step: pre-processing, or "whitening", to generate z. Then consider linear combinations s of z such that some measure of non-Gaussianity, namely f(s), is maximized, and optimize by gradient ascent for each signal…
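The slide's own update rule is not shown in the transcript; a minimal Python sketch of one kurtosis-based extraction scheme (a FastICA-style fixed-point / deflation iteration on whitened data, with all function names illustrative rather than the presenter's code):

```python
import numpy as np

def whiten(x):
    """Centre and whiten the observed signals (rows = observations)."""
    x = x - x.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
    return x @ eigvec @ np.diag(eigval ** -0.5)   # identity covariance

def ica_by_kurtosis(x, n_iter=200, tol=1e-8, seed=0):
    """Extract components one at a time by maximising |kurtosis| of w'z,
    using the fixed-point update w <- E[z (w'z)^3] - 3w on whitened data."""
    rng = np.random.default_rng(seed)
    z = whiten(x)
    n = z.shape[1]
    W = np.zeros((n, n))
    for k in range(n):
        w = rng.standard_normal(n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            s = z @ w
            w_new = (z * s[:, None] ** 3).mean(axis=0) - 3.0 * w
            # deflation: stay orthogonal to the components already found
            w_new -= W[:k].T @ (W[:k] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[k] = w
    return z @ W.T, W   # estimated sources and unmixing matrix (on whitened data)
```

Because sample kurtosis is sensitive to outliers, negentropy-based contrast functions (next slide) are usually preferred in practice.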

© APT 2006 ICs via Negentropy Consider the entropy of a random variable. It is a well-known fact that, for a given variance, entropy is maximal for a Gaussian variable; as such, we maximize the "negentropy". Statistically speaking, negentropy provides "optimal" discrimination and detection of non-Gaussianity.
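For reference, the standard definitions (assuming a density p, and y_gauss a Gaussian variable with the same covariance as y):

    H(y) = -\int p(y)\,\log p(y)\,dy,
    \qquad
    J(y) = H(y_{\text{gauss}}) - H(y) \;\ge\; 0

with J(y) = 0 only for Gaussian y. In practice J is approximated by expressions of the form J(y) \approx \sum_i k_i\,\big(E[G_i(y)] - E[G_i(\nu)]\big)^2 with \nu standard normal (see Hyvärinen et al., 2001).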

© APT 2006 Simple GARCH Test Case We generate 1000 returns employing two GARCH(1,1) processes, with their conditional variances defined by the usual recursions, and apply mixing with a random matrix M to generate the observed signals. Can ICA extract the original signals?
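A minimal Python sketch of this experiment; the GARCH parameters, the random mixing matrix and the use of scikit-learn's FastICA are illustrative choices, since the slide's own specification is not preserved:

```python
import numpy as np
from sklearn.decomposition import FastICA

def simulate_garch(T, omega, alpha, beta, seed):
    """Simulate a GARCH(1,1) return series r_t = sqrt(h_t) * e_t,
    with h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1}."""
    rng = np.random.default_rng(seed)
    r = np.zeros(T)
    h = omega / (1.0 - alpha - beta)          # start at the unconditional variance
    for t in range(T):
        r[t] = np.sqrt(h) * rng.standard_normal()
        h = omega + alpha * r[t] ** 2 + beta * h
    return r

T = 1000
# illustrative parameter choices (the slide's actual values are not given)
sources = np.column_stack([
    simulate_garch(T, omega=0.05, alpha=0.10, beta=0.85, seed=1),
    simulate_garch(T, omega=0.10, alpha=0.20, beta=0.75, seed=2),
])

rng = np.random.default_rng(42)
M = rng.standard_normal((2, 2))               # random mixing matrix
observed = sources @ M.T                       # observed signals x = M s

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(observed)

# sign and scale are arbitrary, so compare via absolute correlations:
corr = np.abs(np.corrcoef(sources.T, recovered.T)[:2, 2:])
print(np.round(corr, 3))   # near-permutation matrix => signals recovered
```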

© APT 2006 Return Distributions of Signals

© APT 2006 Distributions of Independent Components

© APT 2006 Scatter Plot

© APT 2006 GARCH Tests Extend this analysis to accommodate multiple GARCH series: try five independent GARCH time series and one Gaussian, and apply random mixing to generate the observed signals. Panels: sources (latent) and signals (observed).

© APT 2006 GARCH Tests Compare the extracted and original signals; amplitudes and signs are arbitrary. Panels: sources (latent) and components (extracted).

© APT 2006 GARCH Tests Compare the correlations between the sources (G1–G6) and the recovered components (IC1–IC6): table of pairwise correlations.

© APT 2006 Common Feature Extraction Generate six time series, with the g_i defined as before (one possible construction is sketched below).
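The slide's construction equation is not preserved; one construction consistent with the noise levels N = 0.1 to N = 1.0 used on the following slides (this is an assumption, not the presenter's specification) is to mix the three GARCH drivers g_i and add Gaussian noise of amplitude N:

    x_j(t) = \sum_{i=1}^{3} A_{ji}\, g_i(t) + N\,\epsilon_j(t), \qquad j = 1,\dots,6, \quad \epsilon_j(t) \sim \mathcal{N}(0,1)

The question is then how large N can become before the common GARCH features can no longer be recovered by ICA.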

© APT 2006 Common Feature Extraction Correlations between the underlying drivers g1–g3 and the extracted components IC1–IC6, for noise levels N = 0.1 and N = 0.25.

© APT 2006 Common Feature Extraction Correlations between the underlying drivers g1–g3 and the extracted components IC1–IC6, for noise levels N = 0.5 and N = 1.0.

© APT 2006 Interpretation of the Mixing Matrix Interpret the mixing matrix A for the case N = 0.5.

© APT 2006 ICA-Based Clustering Very recently, there have been proposals for ICA-based clustering. Yu & Wu (2006) employ a heuristic approach based on thresholded elements of the mixing matrix. Instead, we employ the L1 norm: L1 preserves sensitivity to factor rotation! Combine with standard clustering algorithms (k-means, etc.).
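A minimal Python sketch of how such a clustering step could be wired together; using the L1 norm to scale each asset's row of the mixing matrix, and all function and parameter names, are illustrative assumptions rather than the presenter's implementation:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA

def cluster_by_mixing_matrix(returns, n_components, n_clusters, seed=0):
    """Cluster assets by the rows of the ICA mixing matrix, scaled by their L1 norm.
    `returns` is a (T x n_assets) array of return series."""
    ica = FastICA(n_components=n_components, random_state=seed)
    ica.fit(returns)
    A = ica.mixing_                                  # (n_assets x n_components)
    A_l1 = A / np.abs(A).sum(axis=1, keepdims=True)  # L1-scale each asset's loadings
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(A_l1)
    return labels, A_l1
```

Row-wise L1 scaling keeps the relative pattern of loadings while damping differences in overall scale, which is one possible reading of the remark that the L1 norm preserves sensitivity to factor rotation.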

© APT 2006 Interpretation of the Mixing Matrix via Clustering Recovered clusters: GARCH #1, GARCH #2, GARCH #3.

© APT 2006 Reality Check Dataset We can confirm the results on a standard "reality check" dataset: whether by linkage-based techniques or by k-means, we obtain the "correct" clusters, without employing time-series information (in particular autocorrelation), and with a reduction in dimensionality…

© APT 2006 Does ICA work in all cases? Try regular signals without excess kurtosis – such as sinusoids. Panels: sources (latent) and signals (observed).

© APT 2006 Time Series-Based Methods Compare results for two different methods of signal separation. ICA-based techniques do not, by default, employ time-series information; techniques such as SOBI (Second-Order Blind Identification) do, however (see Korizis et al. for a related study). Panels: ICA and SOBI (AR(2)).
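For intuition, a single-lag simplification of the second-order idea (the AMUSE algorithm); SOBI proper jointly diagonalises lagged covariances over several lags, so this Python sketch is only an illustration, not the method used in the talk:

```python
import numpy as np

def amuse(x, lag=1):
    """Single-lag second-order blind separation (AMUSE).
    x : (T x n) array of observed signals."""
    x = x - x.mean(axis=0)
    # 1. whiten the observations
    d, E = np.linalg.eigh(np.cov(x, rowvar=False))
    z = x @ E @ np.diag(d ** -0.5)
    # 2. symmetrised lagged covariance of the whitened data
    c_lag = (z[:-lag].T @ z[lag:]) / (len(z) - lag)
    c_sym = 0.5 * (c_lag + c_lag.T)
    # 3. its eigenvectors supply the rotation that whitening left undetermined
    _, U = np.linalg.eigh(c_sym)
    return z @ U   # estimated sources, ordered by lagged autocovariance
```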

© APT 2006 ICA and HFRX Index Returns As a first attempt, we can apply ICA to the returns of the HFRX indices, looking for dependence on particular components (or combinations of them). Panels: loadings and time series (Mar 2000 to Feb 2007).

© APT 2006 ICA-Based Clustering

© APT 2006 AutoCorrelation in HFRX Index Returns Do we see autocorrelation in the returns – is there a time structure, or are they temporally independent? ICA does not consider temporal relations. Panels: HF autocorrelations and IC autocorrelations.

© APT 2006 Second-Order Blind Analysis What happens if we explicitly consider autocorrelations? Results for joint diagonalization of order-2 serial correlations… Panels: loadings and time series (Mar 2000 to Feb 2007).

© APT 2006 SOBI-Based Clustering

© APT 2006 AutoCorrelation in HFRX Index Returns SOBI performs a joint decomposition based on the autocorrelation structure of the candidate signals. Panels: HF autocorrelations and SOBI autocorrelations.

© APT 2006 Conclusions
- Correct specification of factor models is vital
- Interpretation of statistical factors and higher-order effects
- Latent variable estimation and study:
  - Blind source separation
  - Non-Gaussianity
  - Time-series models
- Future directions:
  - Out-of-sample tests of higher moments
  - Simulation / risk modelling

© APT 2006 References
A. Hyvärinen, J. Karhunen and E. Oja, Independent Component Analysis, Wiley, New York (2001).
A.D. Back and A.S. Weigend, "A first application of independent component analysis to extracting structure from stock returns", International Journal of Neural Systems, Vol. 8, No. 4, Aug. 1997.
A. Belouchrani et al., "A blind source separation technique based on second order statistics", IEEE Transactions on Signal Processing, Vol. 45, No. 2 (1997).
H.C. Wu and P.L.H. Yu, "A robust and scalable clustering model for time series via independent component analysis", International Journal of Systems Science, Vol. 37, No. 13, October 2006.
Korizis et al., "Smooth Component Extraction From a Set of Financial Data Mixtures", Proceedings of the Fourth IASTED International Conference on Signal Processing, Pattern Recognition, and Applications (2007).
E. Keogh and T. Folias, "The UCR Time Series Data Mining Archive", University of California, Riverside CA – Computer Science & Engineering Department, 2002.