Published by Bethany White. Modified over 9 years ago.
1
M. A. Miceli “SVD and LS” - Stats in the Château - August 31 - September 4 2009 1 SVD and LS M.A. Miceli University of Rome I Stats in the Château Jouy-en-Josas August 31 - September 4 2009
2
Motivations
Problems of high dimensionality in estimation:
– Rank < actual dimension of the data sets: inverse problems
– Thresholds for accepting variables ease on every dimension as the number of variables/dimensions increases (e.g. Wald test).
How the SVD helps in extracting robust correlations between dependent and independent variables: automatic choice of “model”.
Why
Some evidence in predicting US CPI indexes
Some issues about normalizations
3
Motivations
Given a simultaneous linear system of equations:
1. Collapsing the dimensionality of the system to its minimum rank = min [rank(Y), rank(X)].
2. Advantages of SVD w.r.t. Principal Components: PC requires a square matrix, e.g. the autocorrelation matrix, and ranks the dimensions within that single matrix; SVD ranks the correlations between the X and Y dimensions.
3. Discretionary possibility of getting rid of some dimensions believed to be negligible: we are interested in getting rid of those dimensions that can be generated by a totally random system of the same dimensions (Marcenko-Pastur conditions adapted to a rectangular matrix).
4
Definition of SVD of a matrix product
SVD definition
Having two matrices, X and Y, one can write the SVD of their product X’Y.
If T << max(M,N)? No problems.
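A minimal sketch of the definition, assuming X is T×M and Y is T×N (the dimensions used in the CPI example later in the talk) and using NumPy on synthetic data. T is deliberately chosen smaller than M to illustrate that the SVD of X’Y exists regardless; the variable names are illustrative, not the author's.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, N = 50, 77, 9          # T < M on purpose: fewer observations than regressors
X = rng.standard_normal((T, M))
Y = rng.standard_normal((T, N))

C = X.T @ Y                  # M x N cross-product matrix X'Y
U, s, Vt = np.linalg.svd(C, full_matrices=False)   # U: M x N, s: N values, Vt: N x N

# The three factors reproduce X'Y.
assert np.allclose(U @ np.diag(s) @ Vt, C)

# Equivalently, the rotated variables X U and Y V have a diagonal cross-product.
assert np.allclose((X @ U).T @ (Y @ Vt.T), np.diag(s))
```

The second check is the sense in which the SVD pairs linear combinations of the X columns with linear combinations of the Y columns, ranked by singular value.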
5
Diagonalizing the LS estimator
Consider regressing every column y of Y on the set of explanatory variables X.
We diagonalize both matrices, (X’X) and (X’Y):
– X’X (square, symmetric)
– X’Y (rectangular)
– NB. For a symmetric positive semi-definite matrix such as X’X, the SVD IS the same as the diagonalisation.
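The formulas of this slide can be sketched as follows, assuming the standard multivariate LS setup with X of size T×M, Y of size T×N and X’X invertible. The factor names follow the labels that appear on the later slides (Uxx, Dxx, Uxy, Sxy, Vxy); the exact notation of the original slide is an assumption.

```latex
\begin{aligned}
Y &= X B + E, \qquad \hat{B} = (X'X)^{-1} X'Y,\\
X'X &= U_{xx} D_{xx} U_{xx}', \qquad X'Y = U_{xy} S_{xy} V_{xy}',\\
\hat{B} &= U_{xx} D_{xx}^{-1} U_{xx}' \, U_{xy} S_{xy} V_{xy}'.
\end{aligned}
```

The last line follows by substituting the two factorisations into the estimator and using the orthogonality of U_{xx}, which gives (X'X)^{-1} = U_{xx} D_{xx}^{-1} U_{xx}'.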
7
SVD of the covariance matrix (diagram: X’Y decomposed into Uxy, Sxy with a zero block, and Vxy)
8
SVD mapping from column basis to row basis (diagram: X’Y, Vxy, Uxy, Sxy with a zero block)
9
SVD: splitting the product X’Y (diagram: Y Vxy, the Y linear combinations; X Uxy, the X linear combinations; Sxy)
10
Adding diagonalisation of both X and Y matrices
11
Returning to the original variables (diagram: Y, X, Uxx, Uxy, Inv(Dxx), Sxy, Vxy’, Vyy’)
Replacing the old “B”: any advantage?
We may cancel factors: any criterion?
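One way to picture "cancelling factors" is a rank-k version of the LS estimator that keeps only the k largest singular triplets of X’Y. The sketch below works on raw synthetic data and assumes X’X is invertible; `truncated_ls` is a hypothetical helper written for illustration, not the author's code, and it does not implement the factor-based Models II and III discussed later.

```python
import numpy as np

def truncated_ls(X, Y, k):
    """LS coefficients built from the k largest singular triplets of X'Y.

    Hypothetical helper for illustration; assumes X'X is invertible.
    """
    U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    C_k = (U[:, :k] * s[:k]) @ Vt[:k, :]      # rank-k approximation of X'Y
    return np.linalg.solve(X.T @ X, C_k)      # (X'X)^{-1} C_k

rng = np.random.default_rng(1)
T, M, N = 200, 10, 4
X = rng.standard_normal((T, M))
Y = X @ rng.standard_normal((M, N)) + rng.standard_normal((T, N))

B_full = truncated_ls(X, Y, k=N)               # all factors kept
B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]   # ordinary LS for comparison
B_2 = truncated_ls(X, Y, k=2)                  # only two factors kept
```

With all N factors kept the result coincides with ordinary LS, so any gain has to come from the choice of k, which is where a selection criterion is needed.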
12
RMT
1. Marcenko-Pastur conditions give the density of the singular values and the limits of their interval for square matrices. Bouchaud, Miceli et al. (2005) derive them for rectangular matrices.
2. We run exactly the same experiment many times with purely randomly generated matrices: the limits and densities reproduce the theory.
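The random-matrix experiment in point 2 can be sketched as a small Monte Carlo: draw independent X and Y many times and collect the singular values of their cross-product. The normalisation used here (columns standardised, cross-product divided by T) and the Gaussian draws are assumptions for illustration; the talk may normalise differently, so the numbers are not meant to match the theoretical limits quoted on the slides.

```python
import numpy as np

def standardize(A):
    """Centre each column and scale it to unit sample variance."""
    return (A - A.mean(axis=0)) / A.std(axis=0)

def null_singular_values(T, M, N, n_draws=200, seed=0):
    """Singular values of X'Y/T for independent Gaussian X (T x M) and Y (T x N)."""
    rng = np.random.default_rng(seed)
    out = np.empty((n_draws, min(M, N)))
    for i in range(n_draws):
        X = standardize(rng.standard_normal((T, M)))
        Y = standardize(rng.standard_normal((T, N)))
        out[i] = np.linalg.svd(X.T @ Y / T, compute_uv=False)
    return out

sv = null_singular_values(T=100, M=77, N=9)
upper = np.quantile(sv[:, 0], 0.99)   # empirical 99% bound for the largest singular value
```

Singular values of the real data that exceed `upper` would then be candidates for genuine X-Y correlation, while those inside the random band could be discarded.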
13
Marcenko-Pastur limits and density
14
RMT
1. Density and limits do change depending on whether we use raw or already diagonalized data.
2. Is this “double diagonalization” worthwhile? Singular values are homogeneous of degree 0 (HD0) in standardization; eigenvectors are NOT.
15
Diagonalized “LS estimator”
We may approach the same problem in different ways:
1. raw data
2. normalized factors
3. non-normalized factors
“Unfortunately”, 3. works best. Why? … Is it because factor normalization changes the ranking of the SVD singular values, and this eventually affects the factor selection? NO! Answer at the end …. Very disturbing
16
Example: Forecasting US CPI Indexes
Time series are month-on-month (mom) % changes:
Y := 9 CPI indexes, Aug 83 – Apr 07
X := 77 macroeconomic series, Nov 83 – Apr 07, including 3 lags of the Ys.
T = 282, N = 9, M = 77, rolling window W = 100 (or another length). n = N/W, m = M/W.
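As a quick check of the ratios defined on this slide, assuming W = 100 and a window that advances one month at a time (the step size is an assumption, not stated in the talk):

```python
T, N, M, W = 282, 9, 77, 100

n = N / W                  # ratio of dependent variables to window length: 0.09
m = M / W                  # ratio of explanatory variables to window length: 0.77
n_windows = T - W + 1      # number of rolling windows with a one-month step: 183

# Start and end (exclusive) row index of each window.
windows = [(start, start + W) for start in range(n_windows)]
```

With m = 0.77 the number of regressors is close to the number of observations in each window, which is the high-dimensional regime the talk is concerned with.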
17
CPIs
18
Xs
19
Estimation by Model III
21
Singular values: Model I – Randomly generated DATA
23
Singular values for SVD on raw and random DATA
25
Estimation by Model II
Factors are divided by their own eigenvalue
26
Singular values: Model II – Data NORMALIZED FACTORS
lambda max = 0.934, lambda min = 0.608
27
Singular values: Model II – Randomly generated NORMALIZED FACTORS
lambda max = 0.934, lambda min = 0.608
Randomly generated singular values don’t look very different ….
30
Singular values for SVD on raw and random FACTORS
31
Let’s see estimations by Model III
32
P&L Model III - Factors on raw data
33
P&L Model III - CPI Indexes (Model of Non Normalized Factors) – In sample
Panels: with ALL SVD factors; with 2 SVD factors
34
Let’s see estimations by Model II (normalized factors)
35
P&L Model II (Normalized factors) - Factors
36
P&L Model II (Normalized factors) – CPIs
37
Example of CPI_comdty estimation
Panels: normalized factors; non-normalized factors
38
OUT OF SAMPLE
Estimation on t = 1, …, 120
Forecast at fixed coefficients for t = 121, …, 282
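The out-of-sample design above can be sketched as follows: coefficients are estimated once on the first 120 observations and then held fixed for the remaining 162. The data are synthetic, plain LS stands in for the SVD-based models of the talk, and RMSE is used as an illustrative error measure; none of these choices are the author's.

```python
import numpy as np

rng = np.random.default_rng(2)
T, M, N, T_in = 282, 5, 3, 120
X = rng.standard_normal((T, M))
Y = X @ rng.standard_normal((M, N)) + 0.5 * rng.standard_normal((T, N))

X_in, Y_in = X[:T_in], Y[:T_in]          # estimation sample, t = 1..120
X_out, Y_out = X[T_in:], Y[T_in:]        # forecast sample, t = 121..282

B = np.linalg.lstsq(X_in, Y_in, rcond=None)[0]        # estimated once, then frozen
Y_hat = X_out @ B                                      # forecasts at fixed coefficients
rmse = np.sqrt(((Y_out - Y_hat) ** 2).mean(axis=0))    # one value per column of Y
```

Replacing the `lstsq` call with an estimator that keeps only a few SVD factors would give the "all factors" versus "2 factors only" comparison shown on the forecast slides.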
39
P&L: Factors (Model II)
40
Forecast on CPIs
Panels: all factors; 2 factors only
Easier to predict: 1. medical care (since stable), 2. commodities (oil), 3. transport
41
Forecasts on CPIs: Comdty
42
Conclusions 1
43
Conclusions on the example