Presentation is loading. Please wait.

Presentation is loading. Please wait.

Generating Correlated Random Variables Kriss Harris Senior Statistician

Similar presentations


Presentation on theme: "Generating Correlated Random Variables Kriss Harris Senior Statistician"— Presentation transcript:

1 Generating Correlated Random Variables Kriss Harris Senior Statistician Kriss.5.Harris@gsk.com

2 Why? I was producing graphs for a SAS Graphics Training Course that will be rolled out soon, and I wanted to control the correlation between the variables. 2

3 Previous Method 3 Use Excel to fill down and then generate another column that was fairly correlated

4 Generating Correlated Random Variables using the SAS Datastep data bivariate_final; mean1=0; *mean for y1; mean2=10; *mean for y2; sig1=2; *SD for y1; sig2=5; *SD for y2; rho=0.90; *Correlation between y1 and y2; do i = 1 to 100; r1 = rannor(1245); r2 = rannor(2923); y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2- sig2**2*rho**2)*r2; output; end; run; 4

5 Generating Correlated Random Variables using the SAS Datastep data bivariate_final; mean1=0; *mean for y1; mean2=10; *mean for y2; sig1=2; *SD for y1; sig2=5; *SD for y2; rho=0.90; *Correlation between y1 and y2; do i = 1 to 100; r1 = rannor(1245); r2 = rannor(2923); y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2- sig2**2*rho**2)*r2; output; end; run; 5

6 Generating Correlated Random Variables using the SAS Datastep data bivariate_final; mean1=0; *mean for y1; mean2=10; *mean for y2; sig1=2; *SD for y1; sig2=5; *SD for y2; rho=0.90; *Correlation between y1 and y2; do i = 1 to 100; r1 = rannor(1245); r2 = rannor(2923); y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2- sig2**2*rho**2)*r2; output; end; run; 6

7 Generating Correlated Random Variables using the SAS Datastep data bivariate_final; mean1=0; *mean for y1; mean2=10; *mean for y2; sig1=2; *SD for y1; sig2=5; *SD for y2; rho=0.90; *Correlation between y1 and y2; do i = 1 to 100; r1 = rannor(1245); r2 = rannor(2923); y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2- sig2**2*rho**2)*r2; output; end; run; 7

8 Generating Correlated Random Variables using the SAS Datastep data bivariate_final; mean1=0; *mean for y1; mean2=10; *mean for y2; sig1=2; *SD for y1; sig2=5; *SD for y2; rho=0.90; *Correlation between y1 and y2; do i = 1 to 100; r1 = rannor(1245); r2 = rannor(2923); y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2- sig2**2*rho**2)*r2; output; end; run; 8

9 Generating Correlated Random Variables using the SAS Datastep data bivariate_final; mean1=0; *mean for y1; mean2=10; *mean for y2; sig1=2; *SD for y1; sig2=5; *SD for y2; rho=0.90; *Correlation between y1 and y2; do i = 1 to 100; r1 = rannor(1245); r2 = rannor(2923); y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2- sig2**2*rho**2)*r2; output; end; run; 9

10 Y and x for different correlation coefficients 10

11 Generating Correlated Random Variables using Proc IML To generate more than 2 correlated random variables than it’s easier to use the Cholesky decomposition method in Proc IML. IML = Interactive Matrix Language 11

12 Generating Correlated Random Variables using Proc IML proc iml; use bivariate_final; read all var {r1} into x3; read all var {r2} into x4; read all var {mean1} into mean1; read all var {mean2} into mean2; x={ 4 9, 9 25}; /* C */ mattrib x rowname=(rows [1:2 ]) colname=(cols [1:2]); Cholesky_decomp = root(x); /* U */ matrix_con = x3||x4; mean = mean1||mean2; final_simulated = mean + matrix_con * Cholesky_decomp; /*RC*/ varnames = {y3 y4}; create Cholesky_correlation from final_simulated (|colname = varnames|); append from final_simulated; quit; 12 Use is similar to set. Reading in the simulated data and the means

13 Generating Correlated Random Variables using Proc IML proc iml; use bivariate_final; read all var {r1} into x3; read all var {r2} into x4; read all var {mean1} into mean1; read all var {mean2} into mean2; x={ 4 9, 9 25}; /* C */ mattrib x rowname=(rows [1:2 ]) colname=(cols [1:2]); Cholesky_decomp = root(x); /* U */ matrix_con = x3||x4; mean = mean1||mean2; final_simulated = mean + matrix_con * Cholesky_decomp; /*RC*/ varnames = {y3 y4}; create Cholesky_correlation from final_simulated (|colname = varnames|); append from final_simulated; quit; 13 Variance covariance matrix

14 Generating Correlated Random Variables using Proc IML proc iml; use bivariate_final; read all var {r1} into x3; read all var {r2} into x4; read all var {mean1} into mean1; read all var {mean2} into mean2; x={ 4 9, 9 25}; /* C */ mattrib x rowname=(rows [1:2 ]) colname=(cols [1:2]); Cholesky_decomp = root(x); /* U */ matrix_con = x3||x4; mean = mean1||mean2; final_simulated = mean + matrix_con * Cholesky_decomp; /*RC*/ varnames = {y3 y4}; create Cholesky_correlation from final_simulated (|colname = varnames|); append from final_simulated; quit; 14 Applying Cholesky’s decompositon

15 Generating Correlated Random Variables using Proc IML proc iml; use bivariate_final; read all var {r1} into x3; read all var {r2} into x4; read all var {mean1} into mean1; read all var {mean2} into mean2; x={ 4 9, 9 25}; /* C */ mattrib x rowname=(rows [1:2 ]) colname=(cols [1:2]); Cholesky_decomp = root(x); /* U */ matrix_con = x3||x4; mean = mean1||mean2; final_simulated = mean + matrix_con * Cholesky_decomp; /*RC*/ varnames = {y3 y4}; create Cholesky_correlation from final_simulated (|colname = varnames|); append from final_simulated; quit; 15 Concatenating the variables

16 Generating Correlated Random Variables using Proc IML proc iml; use bivariate_final; read all var {r1} into x3; read all var {r2} into x4; read all var {mean1} into mean1; read all var {mean2} into mean2; x={ 4 9, 9 25}; /* C */ mattrib x rowname=(rows [1:2 ]) colname=(cols [1:2]); Cholesky_decomp = root(x); /* U */ matrix_con = x3||x4; mean = mean1||mean2; final_simulated = mean + matrix_con * Cholesky_decomp; /*RC*/ varnames = {y3 y4}; create Cholesky_correlation from final_simulated (|colname = varnames|); append from final_simulated; quit; 16 Correlated Variables

17 Generating Correlated Random Variables using Proc IML proc iml; use bivariate_final; read all var {r1} into x3; read all var {r2} into x4; read all var {mean1} into mean1; read all var {mean2} into mean2; x={ 4 9, 9 25}; /* C */ mattrib x rowname=(rows [1:2 ]) colname=(cols [1:2]); Cholesky_decomp = root(x); /* U */ matrix_con = x3||x4; mean = mean1||mean2; final_simulated = mean + matrix_con * Cholesky_decomp; /*RC*/ varnames = {y3 y4}; create Cholesky_correlation from final_simulated (|colname = varnames|); append from final_simulated; quit; 17 Outputting the variables

18 References Generating Multivariate Normal Data by using Proc IML Generating Multivariate Normal Data by using Proc IML Lingling Han, University of Georgia, Athens, GA 18

19 Appendix Correlation Coefficient = 19

20 R Code - Generating Correlated Random Variables mean1 = 0 mean2 = 10 sig1 = 2 sig2 = 5 rho = 0.9 r1 = rnorm(100, 0, 1) r2 = rnorm(100, 0, 1) y1 = mean1 + sig1*r1; y2 = mean2 + rho*sig2*r1+sqrt(sig2**2-sig2**2*rho**2)*r2; 20

21 R Code - Generating Correlated Random Variables mean1 = 0 mean2 = 10 sig1 = 2 sig2 = 5 rho = 0.9 r1 = rnorm(100, 0, 1) r2 = rnorm(100, 0, 1) y1 = mean1 + sig1*r1 y2 = mean2 + rho*sig2*r1+sqrt(sig2**2-sig2**2*rho**2)*r2 21

22 R Code - Generating Correlated Random Variables using Matrices C = matrix(c(4, 9, 9, 25), nrow = 2, ncol = 2) cholc = chol(C) R = matrix(c(r1,r2), nrow = 100, ncol = 2, byrow = F) mean = matrix(c(mean1,mean2), nrow = 100, ncol = 2, byrow = T) RC = mean + R %*% cholc 22 Use previous values of r1 and r2


Download ppt "Generating Correlated Random Variables Kriss Harris Senior Statistician"

Similar presentations


Ads by Google