METHODS OF TRANSFORMING NON-POSITIVE DEFINITE CORRELATION MATRICES
Katarzyna Wojtaszek, student number 1118676
CROSS
I will try to answer the following questions:
- How can I estimate a correlation matrix when I have data?
- What can I do if the matrices are non-PD?
  - Shrinking method
  - Eigenvalues method
  - Vines method
- How can we calculate distances between the original and transformed matrices?
- Which method is the best? (comparison)
- Conclusions
How can I estimate a correlation matrix if I have data?
I can estimate the correlation matrix from data as follows:
1. I can estimate each off-diagonal element separately.
2. I can also estimate from the whole data set together, with i = 1,…,s; j = 1,…,n.
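As a small illustration of why estimating each off-diagonal element separately can go wrong, the sketch below computes every entry from only the rows in which both variables are observed. The function names `pearson` and `pairwise_corr`, the use of `None` for a missing value, the Pearson estimator and the toy data are all assumptions made for this example; the estimator used in the project may differ.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_corr(data):
    """Estimate each off-diagonal entry separately (hypothetical helper).

    data: list of rows; None marks a missing observation.
    Entry (i, j) uses only the rows where columns i and j are both present.
    """
    n = len(data[0])
    r = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rows = [(row[i], row[j]) for row in data
                    if row[i] is not None and row[j] is not None]
            xs, ys = zip(*rows)
            r[i][j] = r[j][i] = pearson(xs, ys)
    return r

# Toy data (assumed): three variables, five observations, some missing.
data = [
    [1.0, 2.0, None],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 3.0],
    [4.0, None, 2.0],
    [None, 1.0, 5.0],
]
R = pairwise_corr(data)

# Determinant of a 3x3 matrix with unit diagonal.
det = (1 + 2 * R[0][1] * R[0][2] * R[1][2]
       - R[0][1] ** 2 - R[0][2] ** 2 - R[1][2] ** 2)
print(R)
print(det < 0)  # a negative determinant means R is not PD
```

Because every entry is based on a different subset of rows, the result is symmetric with unit diagonal, yet for this toy data its determinant is negative, so it is only a pseudo correlation matrix.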
What can I do when the matrices are non-PD?
We can transform these matrices into PD correlation matrices using one of the following methods:
- Shrinking method
- Eigenvalues method
- Vines method
How can we calculate distances between the original and transformed matrices?
There are many ways to calculate the distance between two matrices. In my project I used the formula:
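One common way to measure the distance between two matrices is the Frobenius norm of their difference. The sketch below uses it purely as an assumption for illustration; it is not necessarily the formula used in the project, and the name `frobenius_distance` and the two small matrices are hypothetical.

```python
import math

def frobenius_distance(a, b):
    """Square root of the sum of squared entry-wise differences.

    Assumed distance for illustration; the project's formula may differ.
    """
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

R = [[1.0, 0.9], [0.9, 1.0]]   # "original" matrix (toy values)
T = [[1.0, 0.5], [0.5, 1.0]]   # "transformed" matrix (toy values)
d = frobenius_distance(R, T)
print(d)
```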
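1. SHRINKING METHOD
Linear shrinking
Assumptions:
- $R_{n \times n}$ is a given non-PD pseudo correlation matrix
- $R^*_{n \times n}$ is an arbitrary correlation matrix
Define: $\tilde{R}(\alpha) = R + \alpha (R^* - R)$, $\alpha \in [0,1]$. $\tilde{R}(\alpha)$ is a pseudo correlation matrix.
Idea: find the smallest $\alpha$ such that $\tilde{R}(\alpha)$ is PD.
Since R is non-PD, the smallest eigenvalue $\lambda$ of R is negative, so we have to choose $\alpha$ such that the smallest eigenvalue $\lambda(\alpha)$ of $\tilde{R}(\alpha)$ is positive. Hence, with $\lambda^*$ the smallest eigenvalue of $R^*$:
$\lambda(\alpha) \ge \lambda + \alpha (\lambda^* - \lambda)$
and $\lambda(\alpha) > 0$ if $\alpha > -\lambda / (\lambda^* - \lambda)$.
So we obtain a PD matrix $\tilde{R}(\alpha)$ from the given non-PD matrix R.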
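The search for the smallest shrinking weight can be sketched as follows. This is a minimal sketch under two assumptions that are not taken from the slides: the target matrix is the identity, so that R(alpha) = R + alpha (I - R), and the smallest alpha is located by bisection with a Cholesky test for positive definiteness instead of the eigenvalue bound. The names `is_pd`, `shrink` and `smallest_alpha` and the example matrix are hypothetical.

```python
import math

def is_pd(a):
    """Return True if the symmetric matrix a is positive definite.

    Attempts a Cholesky factorisation, which succeeds exactly for PD input.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True

def shrink(r, alpha):
    """R(alpha) = R + alpha * (I - R): off-diagonals scaled by (1 - alpha)."""
    n = len(r)
    return [[1.0 if i == j else (1.0 - alpha) * r[i][j] for j in range(n)]
            for i in range(n)]

def smallest_alpha(r, tol=1e-9):
    """Bisection for the smallest alpha in [0, 1] that makes R(alpha) PD."""
    lo, hi = 0.0, 1.0            # R(0) = R is non-PD, R(1) = I is PD
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_pd(shrink(r, mid)):
            hi = mid
        else:
            lo = mid
    return hi                    # always a value that passed the PD test

# Assumed example: a pseudo correlation matrix with eigenvalues 1.9, 1.9, -0.8.
R = [[1.0, 0.9, -0.9],
     [0.9, 1.0, 0.9],
     [-0.9, 0.9, 1.0]]
alpha = smallest_alpha(R)
R_pd = shrink(R, alpha)
print(alpha)
```

For this matrix the smallest eigenvalue is -0.8 and the identity has smallest eigenvalue 1, so the bound requires alpha > 0.8 / 1.8, about 0.444, which the bisection reproduces up to the tolerance.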
Non-linear shrinking
Assumption: $R_{n \times n}$ is a given non-PD pseudo correlation matrix.
Procedure: where f is a strictly increasing odd function with f(0) = 0 and >0.
I considered the following four functions:
Comparison of the linear and non-linear shrinking methods
(Figure. Labels: $R_{n \times n}$, set of PD matrices, linear shrinking, $I_n$.)
2. THE EIGENVALUE METHOD
Assumptions:
- $R_{n \times n}$ is a non-PD pseudo correlation matrix
- P is an orthogonal matrix such that $R = P D P^T$
- D is the diagonal matrix with the eigenvalues of R on the diagonal
- $\delta > 0$ is some constant
Idea: replace the negative values in the matrix D by $\delta$. We obtain:
$R^* = P D^* P^T$
$\tilde{R} = \Delta R^* \Delta$
where $\Delta$ is a diagonal matrix with diagonal elements equal to $1/\sqrt{r^*_{ii}}$ for i = 1,2,…,n.
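The eigenvalue replacement can be sketched in a few lines. The block below is an illustration under stated assumptions, not the Matlab code of the project: the eigen-decomposition uses a plain cyclic Jacobi iteration so that the example needs no external library, the constant is set to 1e-4, and the result is rescaled to unit diagonal by dividing entry (i, j) by the square roots of the i-th and j-th diagonal entries. The names `jacobi_eigh` and `eigenvalue_method` are hypothetical.

```python
import math

def jacobi_eigh(a, sweeps=100, tol=1e-22):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, P) with the eigenvectors in the columns of P,
    so that a = P diag(eigenvalues) P^T up to rounding.
    """
    n = len(a)
    a = [row[:] for row in a]
    vec = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # rotate columns p and q
                    x, y = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * x - s * y, s * x + c * y
                for k in range(n):      # rotate rows p and q
                    x, y = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * x - s * y, s * x + c * y
                for k in range(n):      # accumulate the eigenvectors
                    x, y = vec[k][p], vec[k][q]
                    vec[k][p], vec[k][q] = c * x - s * y, s * x + c * y
    return [a[i][i] for i in range(n)], vec

def eigenvalue_method(r, delta=1e-4):
    """Replace negative eigenvalues by delta, then rescale to unit diagonal.

    delta = 1e-4 is an assumed value for the small positive constant.
    """
    n = len(r)
    lam, p = jacobi_eigh(r)
    lam = [delta if x < 0.0 else x for x in lam]
    r_star = [[sum(p[i][k] * lam[k] * p[j][k] for k in range(n))
               for j in range(n)] for i in range(n)]
    d = [1.0 / math.sqrt(r_star[i][i]) for i in range(n)]
    return [[d[i] * r_star[i][j] * d[j] for j in range(n)] for i in range(n)]

# Assumed example: a pseudo correlation matrix with eigenvalues 1.9, 1.9, -0.8.
R = [[1.0, 0.9, -0.9],
     [0.9, 1.0, 0.9],
     [-0.9, 0.9, 1.0]]
R_new = eigenvalue_method(R)
print(R_new)
```

For this example the repaired off-diagonal entries come out close to 0.5 in absolute value, with the signs of the original entries preserved.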
3. VINES METHOD
Assumptions: $R_{n \times n}$ is a pseudo correlation matrix.
Idea: first we have to check whether our matrix is PD.
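If some partial correlation $\rho_{ij;kL} \notin (-1,1)$, we change its value to $V(\rho_{ij;kL}) \in (-1,1)$ and recalculate the partial correlations using:
$V(\rho_{ij;L}) = V(\rho_{ij;kL}) \sqrt{(1-\rho_{ik;L}^2)(1-\rho_{jk;L}^2)} + \rho_{ik;L}\,\rho_{jk;L}$
We obtain a new matrix, which we have to check again.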
Example
Let us say that we have a matrix $R_{4 \times 4}$. It is very useful to draw a graphical model (figure: graph on the nodes 1, 2, 4, 3).
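The vine check is easiest to follow in the smallest non-trivial case, three variables, where the vine has the correlations rho_12 and rho_23 in its first tree and the single partial correlation rho_{13;2} in its second tree. The sketch below uses the standard recursive formula for partial correlations; the helper names, the clamping value 0.99 and the input values are assumptions for illustration, and it is assumed that rho_12 and rho_23 already lie in (-1, 1). It does not reproduce the exact update rule of the project.

```python
import math

def partial_13_given_2(r12, r23, r13):
    """Partial correlation rho_{13;2} from the recursive formula."""
    return (r13 - r12 * r23) / math.sqrt((1 - r12 ** 2) * (1 - r23 ** 2))

def repair_three(r12, r23, r13, bound=0.99):
    """Vine-style repair for three variables (hypothetical helper).

    If rho_{13;2} lies outside (-1, 1) it is moved to +/- bound, an
    assumed choice, and rho_13 is recomputed from the changed value.
    Returns the (possibly changed) rho_13 and rho_{13;2}.
    """
    p = partial_13_given_2(r12, r23, r13)
    if abs(p) >= 1.0:
        p = math.copysign(bound, p)
        r13 = p * math.sqrt((1 - r12 ** 2) * (1 - r23 ** 2)) + r12 * r23
    return r13, p

# Assumed input: rho_12 = rho_23 = 0.9 and rho_13 = -0.9.
r13_new, p_new = repair_three(0.9, 0.9, -0.9)
print(r13_new, p_new)
```

With these inputs the partial correlation rho_{13;2} equals -9, so the original matrix is not PD; moving it to -0.99 gives rho_13 = 0.6219, and every partial correlation on the vine then lies in (-1, 1).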
Which method is the best? Comparison.
Using Matlab, I randomly generated 500 non-PD matrices, transformed them with each method, and calculated the average distance between the original non-PD matrices and the resulting PD matrices. The table shows my results.

Method \ n       3        4        5        6        7        8        9        10
Lin. shrinking   2.7868   4.371    6.7233   9.8977   14.0027  18.4047  23.7102  29.6013
Shrinking f1     0.1388   0.4028   1.1251   2.5161   4.3623   6.76     9.8484   13.8416
Shrinking f2     0.2756   0.9696   2.382    4.6464   8.1327   11.4816  16.3835  20.5501
Shrinking f3     0.1441   0.4589   1.1432   2.5153   4.4483   6.9127   10.176   13.7543
Shrinking f4     0.4091   1.4379   3.3365   5.7357   8.6839   11.7034  15.686   18.9959
Eigenvalues      0.0861   0.2039   0.451    0.913    1.5799   2.3263   3.3845   4.7033
Vines            0.2285   1.2999   3.3251   6.6395   11.3295  17.813   24.7021  34.4963
ILLUSTRATION: average distance
Conclusions:
- Linear shrinking performs very badly because it shrinks all elements by the same relative amount.
- The eigenvalues method is fast and gives very good results regardless of the matrix dimension.
- For the non-linear shrinking method, the best choices of the projection function are $f_1$ and $f_3$.