Correlations and Copulas
Measures of Dependence: Risk has two parts, the individual risks and the dependence structure between them. Measures of dependence include correlation, the rank correlation coefficient, tail dependence, and association.
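Correlation and Covariance: The coefficient of correlation between two variables $X$ and $Y$ is defined as $\rho = \dfrac{E(XY) - E(X)E(Y)}{SD(X)\,SD(Y)}$, where $SD(\cdot)$ denotes standard deviation. The covariance is $E(XY) - E(X)E(Y)$.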
Independence: $X$ and $Y$ are independent if knowledge of one does not affect the probability distribution of the other, that is, $f(Y \mid X = x) = f(Y)$ for all $x$, where $f(\cdot)$ denotes the probability density function.
Correlation Pitfalls: A correlation of 0 is not equivalent to independence. If $(X, Y)$ are jointly normal, $\mathrm{Corr}(X, Y) = 0$ implies that $X$ and $Y$ are independent. In general this is not true: even perfectly related random variables can have zero correlation.
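One illustration of this pitfall, offered as a sketch and not as the example from the original slide: take $X$ standard normal and $Y = X^2$. $Y$ is completely determined by $X$, yet $\mathrm{Cov}(X, Y) = E(X^3) = 0$. The helper name `sample_corr`, the seed, and the sample size are assumptions made for this illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def sample_corr(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# X is standard normal; Y = X**2 is a deterministic function of X.
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]
ys = [x * x for x in xs]

rho = sample_corr(xs, ys)
print(f"sample correlation of X and X^2: {rho:.4f}")  # close to zero
```

The printed value should be near zero even though knowing $X$ pins down $Y$ exactly.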
Types of Dependence: [Figure: three panels (a), (b), and (c), each plotting E(Y) against X.]
Correlation Pitfalls (cont.): Correlation is invariant under strictly increasing linear transformations, but not under general nonlinear transformations. For example, two log-normal random variables have a different correlation from the underlying normal random variables. A small correlation does not imply a small degree of dependence.
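The log-normal case can be checked with a closed form. The sketch below assumes $(X_1, X_2)$ is bivariate normal with correlation $\rho$ and standard deviations $\sigma_1, \sigma_2$, and uses the standard result $\mathrm{Corr}(e^{X_1}, e^{X_2}) = \dfrac{e^{\rho\sigma_1\sigma_2} - 1}{\sqrt{(e^{\sigma_1^2} - 1)(e^{\sigma_2^2} - 1)}}$. The function name and the parameter values are chosen for illustration only.

```python
import math


def lognormal_corr(rho, sigma1, sigma2):
    """Correlation of exp(X1) and exp(X2) when (X1, X2) is bivariate normal
    with correlation rho and standard deviations sigma1 and sigma2."""
    num = math.exp(rho * sigma1 * sigma2) - 1.0
    den = math.sqrt((math.exp(sigma1 ** 2) - 1.0) * (math.exp(sigma2 ** 2) - 1.0))
    return num / den


rho_normal = 0.5  # illustrative correlation of the underlying normals
rho_lognormal = lognormal_corr(rho_normal, 1.0, 1.0)
print(f"normal: {rho_normal:.4f}, log-normal: {rho_lognormal:.4f}")
```

With unit standard deviations the log-normal correlation comes out noticeably lower than the 0.5 of the underlying normals, which is the point of the example.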
Stylized Facts of Correlations: Correlations cluster, so periods of high (low) correlation are likely to be followed by periods of high (low) correlation. Correlations are also asymmetric and co-move with volatility: high volatility in falling markets goes hand in hand with a strong increase in correlation, but this is not the case for rising markets. This reduces the opportunities for diversification in stock-market declines.
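Monitoring Correlation Between Two Variables X and Y: Define $x_i = (X_i - X_{i-1})/X_{i-1}$ and $y_i = (Y_i - Y_{i-1})/Y_{i-1}$. Also define $\mathrm{var}_{x,n}$, the daily variance of $X$ calculated on day $n-1$; $\mathrm{var}_{y,n}$, the daily variance of $Y$ calculated on day $n-1$; and $\mathrm{cov}_n$, the covariance calculated on day $n-1$. The correlation is $\mathrm{cov}_n / \sqrt{\mathrm{var}_{x,n}\,\mathrm{var}_{y,n}}$.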
Covariance: The covariance on day $n$ is $E(x_n y_n) - E(x_n)E(y_n)$. It is usually approximated as $E(x_n y_n)$.
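Monitoring Correlation continued: EWMA: $\mathrm{cov}_n = \lambda\,\mathrm{cov}_{n-1} + (1 - \lambda)\,x_{n-1} y_{n-1}$. GARCH(1,1): $\mathrm{cov}_n = \omega + \alpha\,x_{n-1} y_{n-1} + \beta\,\mathrm{cov}_{n-1}$.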
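A minimal sketch of one EWMA update, assuming the usual recursion $\mathrm{cov}_n = \lambda\,\mathrm{cov}_{n-1} + (1-\lambda)\,x_{n-1}y_{n-1}$ and the same recursion for each variance (with $x_{n-1}^2$ or $y_{n-1}^2$ in place of the cross product). The value of $\lambda$, the previous volatilities and correlation, and the latest returns are hypothetical inputs chosen for illustration.

```python
import math


def ewma_update(prev, x, y, lam):
    """One EWMA step: lam * previous estimate + (1 - lam) * x * y."""
    return lam * prev + (1.0 - lam) * x * y


lam = 0.95                    # hypothetical decay parameter
vol_x, vol_y = 0.01, 0.02     # hypothetical daily volatilities on day n-1
rho_prev = 0.6                # hypothetical correlation estimate on day n-1
x, y = 0.005, 0.025           # hypothetical proportional changes on day n-1

var_x = ewma_update(vol_x ** 2, x, x, lam)
var_y = ewma_update(vol_y ** 2, y, y, lam)
cov = ewma_update(rho_prev * vol_x * vol_y, x, y, lam)

# Correlation is the covariance scaled by the two standard deviations.
rho_new = cov / math.sqrt(var_x * var_y)
print(f"updated correlation: {rho_new:.4f}")
```

Using the same $\lambda$ for the variances and the covariance is a deliberate choice here; it is the simplest way to keep the updated estimates mutually consistent.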
Correlation for Multivariate Case: If $X$ is $m$-dimensional and $Y$ is $n$-dimensional, then $\mathrm{Cov}(X, Y)$ is the $m \times n$ matrix with entries $\mathrm{Cov}(X_i, Y_j) = E(X_i Y_j) - E(X_i)E(Y_j)$. When $Y = X$, $\mathrm{Cov}(X, X)$ is called the covariance matrix of $X$.
Positive Semidefinite Condition: A variance-covariance matrix $\Omega$ is internally consistent if the positive-semidefinite condition $w^{\mathsf{T}} \Omega\, w \ge 0$ holds for all vectors $w$.
Example: The variance-covariance matrix in this example is not internally consistent: when $w = (1, 1, -1)$, the positive-semidefinite condition is not satisfied.
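The check can be sketched numerically. The matrix below is an assumption chosen for illustration (the slide's own matrix did not survive extraction): the first two variables are uncorrelated with each other, yet each has correlation 0.9 with the third. The function name `quadratic_form` is likewise hypothetical.

```python
def quadratic_form(omega, w):
    """Return w^T * omega * w for a square matrix given as nested lists."""
    n = len(w)
    return sum(w[i] * omega[i][j] * w[j] for i in range(n) for j in range(n))


# Illustrative matrix (an assumption, not taken from the slide).
omega = [
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.9],
    [0.9, 0.9, 1.0],
]
w = [1.0, 1.0, -1.0]

q = quadratic_form(omega, w)
print(f"w^T Omega w = {q:.2f}")  # negative, so Omega is not positive semidefinite
```

A single vector with a negative quadratic form is enough to show that a matrix is not positive semidefinite; proving that a matrix is consistent requires a check over all $w$, for instance via its eigenvalues.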
Correlation as a Measure of Dependence: Correlation fully determines the dependence structure for normal distributions and, more generally, elliptical distributions, but it fails to do so outside this class. Even within this class correlation has to be handled with care: a correlation of zero for multivariate normally distributed random variables implies independence, whereas a correlation of zero for, say, t-distributed random variables does not.
Multivariate Normal Distribution: The multivariate normal distribution is fairly easy to handle. A variance-covariance matrix defines the variances of, and correlations between, the variables. To be internally consistent, a variance-covariance matrix must be positive semidefinite.
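Bivariate Normal PDF: The probability density function of a bivariate normal distribution is $f(x, y) = \dfrac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho_{xy}^2}} \exp\!\left\{ -\dfrac{1}{2(1 - \rho_{xy}^2)} \left[ \dfrac{(x - \mu_X)^2}{\sigma_X^2} - \dfrac{2\rho_{xy}(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \dfrac{(y - \mu_Y)^2}{\sigma_Y^2} \right] \right\}$, where $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$ are the means and standard deviations of $X$ and $Y$ and $\rho_{xy}$ is their coefficient of correlation.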
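X and Y Bivariate Normal: Conditional on the value of $X$, $Y$ is normal with mean $\mu_Y + \rho_{xy}\,\sigma_Y\,\dfrac{X - \mu_X}{\sigma_X}$ and standard deviation $\sigma_Y \sqrt{1 - \rho_{xy}^2}$, where $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$ are the unconditional means and standard deviations of $X$ and $Y$, and $\rho_{xy}$ is the coefficient of correlation between $X$ and $Y$.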
Generating Random Samples for Monte Carlo Simulation: In Excel, =NORMSINV(RAND()) gives a random sample from a standard normal distribution. For a multivariate normal distribution, a method known as Cholesky decomposition can be used to generate random samples.
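A minimal sketch of the Cholesky approach, assuming zero means and unit variances so that the covariance matrix is a correlation matrix. The idea is to factor the matrix as $L L^{\mathsf{T}}$ with $L$ lower triangular and set each sample to $L z$, where $z$ is a vector of independent standard normals. The 3×3 matrix, the function names, and the seed are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def sample_mvn(L, rng):
    """One zero-mean multivariate normal draw: L times independent normals."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]


# Hypothetical correlation matrix used only for illustration.
corr = [
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
L = cholesky(corr)
rng = random.Random(42)
samples = [sample_mvn(L, rng) for _ in range(50_000)]

# With zero means and unit variances, the mean product estimates the correlation.
est_01 = sum(s[0] * s[1] for s in samples) / len(samples)
print(f"estimated correlation between variables 0 and 1: {est_01:.3f}")
```

The square root in `cholesky` fails for a matrix that is not positive definite, which is one practical way an internally inconsistent correlation matrix shows up.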
Bivariate Normal PDF independence
Bivariate Normal PDF dependence
Factor Models: When there are $N$ variables $V_i$ ($i = 1, 2, \ldots, N$) in a multivariate normal distribution, there are $N(N-1)/2$ correlations. We can reduce the number of correlation parameters that have to be estimated with a factor model.
One-Factor Model continued: If the $U_i$ have standard normal distributions we can set $U_i = a_i F + \sqrt{1 - a_i^2}\, Z_i$, where the common factor $F$ and the idiosyncratic components $Z_i$ have independent standard normal distributions. The correlation between $U_i$ and $U_j$ ($i \ne j$) is $a_i a_j$.
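The one-factor construction can be checked by simulation. This sketch assumes $U_i = a_i F + \sqrt{1 - a_i^2}\,Z_i$ with $F$ and the $Z_i$ independent standard normals; the loadings in `a`, the seed, and the sample size are hypothetical.

```python
import math
import random

rng = random.Random(1)
a = [0.8, 0.5, 0.3]  # hypothetical factor loadings a_i


def draw(a, rng):
    """One joint draw of (U_1, ..., U_N) from the one-factor model."""
    f = rng.gauss(0.0, 1.0)  # common factor F
    return [ai * f + math.sqrt(1.0 - ai * ai) * rng.gauss(0.0, 1.0) for ai in a]


draws = [draw(a, rng) for _ in range(100_000)]
n = len(draws)

# Each U_i has zero mean and unit variance, so E[U_i U_j] is the correlation.
est_01 = sum(u[0] * u[1] for u in draws) / n
var_0 = sum(u[0] ** 2 for u in draws) / n
print(f"estimated corr(U_1, U_2): {est_01:.3f}  (a_1 * a_2 = {a[0] * a[1]:.3f})")
```

Only $N$ loadings are needed here in place of $N(N-1)/2$ separate correlations, which is the reduction in parameters that motivates the factor model.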
Copulas: The copula function, a powerful concept for aggregating risks, was introduced in finance by Embrechts, McNeil, and Straumann [1999, 2000]. A copula is a function that links univariate marginal distributions to the full multivariate distribution. This function is the joint distribution function of $N$ standard uniform random variables.
Copulas: The dependence relationship between two random variables $X$ and $Y$ is obscured by the marginal densities of $X$ and $Y$. One can think of the copula density as the density that filters out the marginal information from the joint distribution of $X$ and $Y$. To describe, study, and measure statistical dependence between $X$ and $Y$, one may study the copula density. Conversely, to build a joint distribution for two random variables $X \sim G(\cdot)$ and $Y \sim H(\cdot)$, one may first construct the copula on $[0,1]^2$ and then apply the inverse transformations $G^{-1}(\cdot)$ and $H^{-1}(\cdot)$.
Cumulative Distribution Function Theorem: Let $X$ be a continuous random variable with distribution function $F(\cdot)$, and let $Y = F(X)$ be a transformation of $X$. Then $Y$ is uniformly distributed on $[0,1]$.
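A quick numerical check of this theorem, sketched for one hypothetical choice of distribution: if $X$ is exponential with rate `rate`, then $F(x) = 1 - e^{-\text{rate}\cdot x}$ and $F(X)$ should look uniform on $[0,1]$, with mean $1/2$ and variance $1/12$. The rate, seed, and sample size are assumptions.

```python
import math
import random

rng = random.Random(7)
rate = 2.0  # hypothetical rate of the exponential distribution


def exp_cdf(x, rate):
    """Distribution function of the exponential distribution."""
    return 1.0 - math.exp(-rate * x)


xs = [rng.expovariate(rate) for _ in range(100_000)]
us = [exp_cdf(x, rate) for x in xs]  # Y = F(X)

mean_u = sum(us) / len(us)
var_u = sum((u - mean_u) ** 2 for u in us) / len(us)
print(f"mean {mean_u:.4f} (uniform: 0.5), variance {var_u:.4f} (uniform: {1 / 12:.4f})")
```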
Sklar’s (1959) Theorem, the Bivariate Case: Let $X$ and $Y$ be continuous random variables with $X \sim G(\cdot)$ and $Y \sim H(\cdot)$, where $G(\cdot)$ and $H(\cdot)$ are cumulative distribution functions (cdfs). Map $X$ into $U = G(X)$; then $U$ has a uniform distribution on $[0,1]$. This mapping is called the probability integral transformation (e.g. Nelsen (1999)). Any bivariate joint distribution of $(X, Y)$ can be transformed to a bivariate copula via $(U, V) = (G(X), H(Y))$ (Sklar (1959)). Thus, a bivariate copula is a bivariate distribution with uniform marginal distributions (marginals).
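Copula Mathematical Definition: An $n$-dimensional copula $C$ is a function $C: [0,1]^n \to [0,1]$ that is a cumulative distribution function with uniform marginals. The condition that $C$ is a distribution function leads to the following properties. First, as cdfs are always increasing, $C(u_1, \ldots, u_n)$ is increasing in each component $u_i$. Second, the marginal for component $i$ is obtained by setting $u_j = 1$ for all $j \ne i$, and it must be uniformly distributed: $C(1, \ldots, 1, u_i, 1, \ldots, 1) = u_i$. Third, for $a_i < b_i$ the probability $\Pr(U_1 \in [a_1, b_1], \ldots, U_n \in [a_n, b_n])$ must be non-negative, which gives the rectangle inequality $\sum_{i_1=1}^{2} \cdots \sum_{i_n=1}^{2} (-1)^{i_1 + \cdots + i_n}\, C(u_{1,i_1}, \ldots, u_{n,i_n}) \ge 0$, where $u_{j,1} = a_j$ and $u_{j,2} = b_j$.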
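An Example: Let $S_i$ be the value of stock $i$ and let $V_{pf}$ be the value of a portfolio. The 5% Value-at-Risk of the portfolio is a quantile of the distribution of changes in $V_{pf}$, so it depends on the joint distribution of the stock values. Gaussian copulas have been used to model the dependence between $(S_1, S_2, \ldots, S_n)$.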
Copulas Derived from Distributions: Typical multivariate distributions describe important dependence structures, and copulas can be derived from them. The multivariate normal distribution leads to the Gaussian copula. The multivariate Student t-distribution leads to the t-copula.
Gaussian Copula Models: Suppose we wish to define a correlation structure between two variables $V_1$ and $V_2$ that do not have normal distributions. We transform $V_1$ to a new variable $U_1$ that has a standard normal distribution on a “percentile-to-percentile” basis, and we transform $V_2$ to a new variable $U_2$ in the same way. $U_1$ and $U_2$ are assumed to have a bivariate normal distribution.
The Correlation Structure Between the V’s is Defined by that Between the U’s: [Figure: one-to-one mappings from $V_1$ to $U_1$ and from $V_2$ to $U_2$; the correlation assumption applies to $U_1$ and $U_2$.]
Example (page 211): [Figure: panels for $V_1$ and $V_2$.]
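V1 Mapping to U1: [Table with columns $V_1$, percentile (probability), and $U_1$; the entries are not recoverable.] Use the Excel function NORMINV to get the values of $U_1$.
V2 Mapping to U2: [Table with columns $V_2$, percentile (probability), and $U_2$; the entries are not recoverable.] Use the Excel function NORMINV to get the values of $U_2$.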
Example of Calculation of Joint Cumulative Distribution: The probability that $V_1$ and $V_2$ are both less than 0.2 is the probability that $U_1 < -0.84$ and $U_2 < -1.41$. When the copula correlation is 0.5, this is $M(-0.84, -1.41, 0.5)$, where $M$ is the cumulative distribution function for the bivariate normal distribution.
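One way to evaluate $M(a, b, \rho)$ without a statistics package is to condition on $U_1$: $M(a, b, \rho) = \int_{-\infty}^{a} \varphi(x)\, N\!\left(\dfrac{b - \rho x}{\sqrt{1 - \rho^2}}\right) dx$, where $\varphi$ and $N$ are the standard normal density and distribution function. The sketch below integrates this with Simpson's rule; the function name, the lower truncation at $-8$, and the step count are assumptions made for illustration.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def bivariate_normal_cdf(a, b, rho, lower=-8.0, steps=2000):
    """P(U1 < a, U2 < b) for standard bivariate normals with correlation rho.

    Integrates pdf(x) * P(U2 < b | U1 = x) over x from `lower` to a using
    composite Simpson's rule (steps must be even).
    """
    s = math.sqrt(1.0 - rho * rho)

    def g(x):
        return STD.pdf(x) * STD.cdf((b - rho * x) / s)

    h = (a - lower) / steps
    total = g(lower) + g(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(lower + k * h)
    return total * h / 3.0


p = bivariate_normal_cdf(-0.84, -1.41, 0.5)
p_indep = STD.cdf(-0.84) * STD.cdf(-1.41)  # value if the copula correlation were 0
print(f"M(-0.84, -1.41, 0.5) = {p:.4f}; under independence: {p_indep:.4f}")
```

With a positive copula correlation the joint probability is larger than the product of the two marginal probabilities, which is what the comparison line is meant to show.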
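Gaussian Copula – algebraic relationship: Let $G_1$ and $G_2$ be the cumulative marginal probability distributions of $V_1$ and $V_2$. Map $V_1 = v_1$ to $U_1 = u_1$ so that $G_1(v_1) = N(u_1)$, and map $V_2 = v_2$ to $U_2 = u_2$ so that $G_2(v_2) = N(u_2)$, where $N$ is the cumulative standard normal distribution function. Equivalently, $u_i = N^{-1}[G_i(v_i)]$ and $v_i = G_i^{-1}[N(u_i)]$.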
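Gaussian Copula – algebraic relationship: $U_1$ and $U_2$ are assumed to be bivariate normal. The two-dimensional Gaussian copula is $C_\rho(u_1, u_2) = \Phi_\Sigma\!\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2)\right)$, where $u_1, u_2 \in [0,1]$ are the copula arguments, $\Sigma$ is the $2 \times 2$ matrix with 1 on the diagonal and the correlation coefficient $\rho$ otherwise, $\Phi^{-1}$ is the inverse of the standard normal cdf, and $\Phi_\Sigma$ denotes the cdf of a bivariate normal distribution with zero mean and covariance matrix $\Sigma$. This representation is equivalent to $C_\rho(u_1, u_2) = \displaystyle\int_{-\infty}^{\Phi^{-1}(u_1)} \int_{-\infty}^{\Phi^{-1}(u_2)} \frac{1}{2\pi\sqrt{1 - \rho^2}} \exp\!\left(-\frac{s^2 - 2\rho s t + t^2}{2(1 - \rho^2)}\right) dt\, ds$.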
Bivariate Normal Copula independence
Bivariate Normal Copula dependence
5000 Random Samples from the Bivariate Normal
5000 Random Samples from the Bivariate Student t
Multivariate Gaussian Copula: We can similarly define a correlation structure between $V_1, V_2, \ldots, V_n$. We transform each variable $V_i$ to a new variable $U_i$ that has a standard normal distribution on a “percentile-to-percentile” basis. The $U$’s are assumed to have a multivariate normal distribution.
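The percentile-to-percentile mapping can also be run in reverse to sample from a Gaussian copula: draw correlated standard normals $U_i$, convert each to its percentile $N(U_i)$, and push that percentile through the inverse marginal distribution of $V_i$. The sketch below does this for two variables; the exponential marginals, their rates, the copula correlation, and the function names are all hypothetical choices for illustration.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()
rng = random.Random(3)


def exp_inv_cdf(p, rate):
    """Inverse cdf of an exponential distribution (hypothetical marginal)."""
    p = min(p, 1.0 - 1e-12)  # guard against log(0) in the extreme upper tail
    return -math.log(1.0 - p) / rate


def sample_pair(rho, rate1, rate2, rng):
    """One draw of (V1, V2) with exponential marginals and a Gaussian copula."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    u1 = z1
    u2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2  # corr(u1, u2) = rho
    # Map each normal to its percentile, then to the same percentile of V_i.
    return exp_inv_cdf(STD.cdf(u1), rate1), exp_inv_cdf(STD.cdf(u2), rate2)


pairs = [sample_pair(0.5, 1.0, 2.0, rng) for _ in range(50_000)]
n = len(pairs)
mean_v1 = sum(v1 for v1, _ in pairs) / n
mean_v2 = sum(v2 for _, v2 in pairs) / n
cov_v = sum(v1 * v2 for v1, v2 in pairs) / n - mean_v1 * mean_v2
print(f"mean V1 {mean_v1:.3f}, mean V2 {mean_v2:.3f}, covariance {cov_v:.3f}")
```

The marginals keep their own means (1 and 0.5 for these rates) while the copula correlation controls how the two variables move together.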
Factor Copula Model: In a factor copula model the correlation structure between the $U$’s is generated by assuming one or more factors.
Credit Default Correlation: The credit default correlation between two companies is a measure of their tendency to default at about the same time. Default correlation is important in risk management when analyzing the benefits of credit risk diversification. It is also important in the valuation of some credit derivatives.
Model for Loan Portfolio: We map the time to default for company $i$, $T_i$, to a new variable $U_i$ and assume $U_i = aF + \sqrt{1 - a^2}\, Z_i$, where $F$ and the $Z_i$ have independent standard normal distributions. The copula correlation is $\rho = a^2$. Define $Q_i$ as the cumulative probability distribution of $T_i$. Then $\mathrm{Prob}(U_i < U) = \mathrm{Prob}(T_i < T)$ when $N(U) = Q_i(T)$.
Analysis: To analyze the model we calculate the probability that, conditional on the value of $F$, $U_i$ is less than some value $U$: $\mathrm{Prob}(U_i < U \mid F) = N\!\left[\dfrac{U - aF}{\sqrt{1 - a^2}}\right]$. This is the same as the probability that $T_i$ is less than $T$, where $T$ and $U$ are the same percentiles of their distributions, so it is also $\mathrm{Prob}(T_i < T \mid F)$.
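Analysis (cont.): This leads to $\mathrm{Prob}(T_i < T \mid F) = N\!\left[\dfrac{N^{-1}(PD) - \sqrt{\rho}\, F}{\sqrt{1 - \rho}}\right]$, where $PD = Q_i(T)$ is the probability of default in time $T$ and $\rho = a^2$ is the copula correlation.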
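The Model continued: The worst-case default rate for the portfolio for a time horizon of $T$ and a confidence limit of $X$ is $\mathrm{WCDR}(T, X) = N\!\left[\dfrac{N^{-1}(PD) + \sqrt{\rho}\, N^{-1}(X)}{\sqrt{1 - \rho}}\right]$. The VaR for this time horizon and confidence limit is $\mathrm{VaR}(T, X) = L \times (1 - R) \times \mathrm{WCDR}(T, X)$, where $L$ is the loan principal and $R$ is the recovery rate.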
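A short sketch of the calculation, assuming the one-factor Gaussian copula result $\mathrm{WCDR}(T, X) = N\!\left[\dfrac{N^{-1}(PD) + \sqrt{\rho}\,N^{-1}(X)}{\sqrt{1 - \rho}}\right]$ and $\mathrm{VaR} = L \times (1 - R) \times \mathrm{WCDR}$. The function names and every input value (principal, recovery rate, default probability, copula correlation, confidence level) are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def wcdr(p_default, rho, x):
    """Worst-case default rate at confidence level x for copula correlation rho."""
    num = STD.inv_cdf(p_default) + math.sqrt(rho) * STD.inv_cdf(x)
    return STD.cdf(num / math.sqrt(1.0 - rho))


def credit_var(principal, recovery, p_default, rho, x):
    """Credit VaR: principal * (1 - recovery rate) * worst-case default rate."""
    return principal * (1.0 - recovery) * wcdr(p_default, rho, x)


# Hypothetical inputs: 100 (millions) of loans, 60% recovery, 2% default
# probability over the horizon, copula correlation 0.1, 99.9% confidence.
rate = wcdr(0.02, 0.1, 0.999)
var = credit_var(100.0, 0.6, 0.02, 0.1, 0.999)
print(f"WCDR = {rate:.3f}, VaR = {var:.2f}")
```

The worst-case default rate rises with the copula correlation: with $\rho = 0$ the formula collapses to $PD$ itself, because defaults are then independent and a large portfolio experiences the average default rate.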
Appendix 1: Sampling from Bivariate Normal Distribution
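A sketch of the usual two-variable construction, assuming zero means, unit variances, and correlation $\rho$. Here $z_1$ and $z_2$ are independent standard normal samples and $\varepsilon_1$, $\varepsilon_2$ are the correlated samples; the symbols are chosen for this illustration and are not taken from the slide.

```latex
\varepsilon_1 = z_1, \qquad
\varepsilon_2 = \rho\, z_1 + \sqrt{1 - \rho^2}\; z_2
```

Both $\varepsilon_1$ and $\varepsilon_2$ have unit variance, and $E(\varepsilon_1 \varepsilon_2) = \rho$, so the pair has the required correlation. This is the two-dimensional case of the Cholesky decomposition.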
Appendix 2: Sampling from Bivariate t Distribution
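One common construction, sketched here under stated assumptions and not necessarily the procedure on the original slide: sample a correlated standard normal pair, sample an independent chi-square variable $W$ with `df` degrees of freedom, and scale both normals by $\sqrt{\mathrm{df}/W}$. Sharing the same $W$ across both components is what produces the joint tail behavior of the bivariate t. The correlation, degrees of freedom, and seed are hypothetical.

```python
import math
import random

rng = random.Random(11)


def sample_bivariate_t(rho, df, rng):
    """One draw from a bivariate Student t with df degrees of freedom."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Chi-square with df degrees of freedom is Gamma(shape=df/2, scale=2).
    w = rng.gammavariate(df / 2.0, 2.0)
    scale = math.sqrt(df / w)  # the same scale is applied to both components
    return z1 * scale, z2 * scale


df = 10  # hypothetical degrees of freedom
t_samples = [sample_bivariate_t(0.5, df, rng) for _ in range(100_000)]

# A t variable with df > 2 has variance df / (df - 2).
var_1 = sum(t1 * t1 for t1, _ in t_samples) / len(t_samples)
print(f"sample variance {var_1:.3f}, theoretical {df / (df - 2):.3f}")
```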
Appendix 3: Gaussian Copula with Student t Distribution. Sample $U_1$ and $U_2$ from a bivariate normal distribution with the given correlation $\rho$. Convert each sample into a variable with a Student t-distribution on a percentile-to-percentile basis. Suppose that $U_1$ is in cell C1. The Excel function TINV gives a “two-tail” inverse of the t-distribution and always returns a positive value, so the lower half of the distribution needs a minus sign. An Excel instruction for determining $V_1$ is therefore =IF(NORMSDIST(C1)<0.5,-TINV(2*NORMSDIST(C1),df),TINV(2*(1-NORMSDIST(C1)),df)), where df stands for the degrees-of-freedom parameter.