EMIS 7300 SYSTEMS ANALYSIS METHODS FALL 2005

Session 2 Outline Part 1: Correlation and Independence. Part 2: Confidence Intervals. Part 3: Hypothesis Testing. Part 4: Linear Regression.

Today's Topics Bivariate random variables: statistical independence; marginal random variables; conditional random variables; correlation and covariance. Multivariate distributions: random vectors; correlation and covariance matrices. Transformations: transformations of a random variable; transformations of bivariate random variables; transformations of multivariate random variables.

Bivariate Data A common experimental procedure is to control one variable (input) and measure another variable (output). The values of the "input" variable are denoted $x_i$ and the values of the "output" variable $y_i$. An xy-plot of the data points is referred to as a scatter diagram if the data ($x_i$ and/or $y_i$) are random. From the scatter diagram a general data trend may be observed that suggests an empirical model. Fitting the data to this model is known as regression analysis. When the appropriate empirical model is a line, the procedure is called simple linear regression.

Bivariate Data (cont.) [Table: data points, with columns n, $x_i$, $y_i$.]

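Bivariate Data (cont.) The line fit equation is $\hat{y} = b_0 + b_1 x$, where $b_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}$ and $b_0 = \bar{y} - b_1\bar{x}$, with $\bar{x} = \frac{1}{n}\sum_{i=1}^{n}x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n}y_i$.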

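The slope $b_1$ of the linear regression is related to the sample correlation coefficient $r = \frac{\sum_{i}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i}(x_i-\bar{x})^2\,\sum_{i}(y_i-\bar{y})^2}}$ by $b_1 = r\sqrt{\sum_{i}(y_i-\bar{y})^2 / \sum_{i}(x_i-\bar{x})^2}$. The calculation for r can be rewritten as $r = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{\sqrt{\left[n\sum_i x_i^2 - \left(\sum_i x_i\right)^2\right]\left[n\sum_i y_i^2 - \left(\sum_i y_i\right)^2\right]}}$.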

r has no units. The value of r is bounded, $|r| \le 1$. $r = 1 \Rightarrow$ the line fits the data perfectly. $0 < r \le 1 \Rightarrow$ the line has a positive slope. $r = 0 \Rightarrow$ there is no line fit. $0 > r \ge -1 \Rightarrow$ the line has a negative slope. $r = -1 \Rightarrow$ the line fits the data perfectly.
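As a rough illustration of the line fit and of r, the sketch below applies the standard least-squares formulas to a small made-up data set. The function name `fit_line` and the data values are assumptions chosen for the example, not taken from the slides.

```python
import math

def fit_line(xs, ys):
    """Return (intercept b0, slope b1, sample correlation r) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered sums of squares and cross products
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    r = s_xy / math.sqrt(s_xx * s_yy)
    return b0, b1, r

# Hypothetical data lying exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
b0, b1, r = fit_line(xs, ys)
print(b0, b1, r)
```

Because the made-up points lie exactly on a line with positive slope, the fit recovers that line and r equals 1 up to rounding.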

Simple Linear Regression

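Bivariate Random Variables Consider the case of two random variables X and Y. The joint CDF is denoted $F_{X,Y}(x,y) = P(X \le x, Y \le y)$. The joint PDF is defined via the joint CDF, $F_{X,Y}(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f_{X,Y}(u,v)\,dv\,du$, where $f_{X,Y}(x,y) = \frac{\partial^2 F_{X,Y}(x,y)}{\partial x\,\partial y}$. Expected value: $E\{g(X,Y)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y)\,f_{X,Y}(x,y)\,dx\,dy$.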

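Statistical Independence X and Y are statistically independent if and only if $F_{X,Y}(x,y) = F_X(x)\,F_Y(y)$ or, equivalently, $f_{X,Y}(x,y) = f_X(x)\,f_Y(y)$. Statistical independence has an effect on the expected value of separable functions of joint random variables: $E\{g(X)\,h(Y)\} = E\{g(X)\}\,E\{h(Y)\}$.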

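Marginal Random Variables It is often of interest to find the individual CDFs and PDFs when two random variables are not statistically independent. These are known as the marginal CDF and marginal PDF. Marginal CDFs are straightforward, $F_X(x) = F_{X,Y}(x,\infty)$ and $F_Y(y) = F_{X,Y}(\infty,y)$. Marginal PDFs are found by "integrating out" y or x, $f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy$ and $f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx$.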

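Conditional Random Variables Conditional CDFs and PDFs can be defined, $f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$ and $F_{X|Y}(x|y) = \int_{-\infty}^{x} f_{X|Y}(u|y)\,du$. Rewriting the conditional PDF for X given Y, $f_{X|Y}(x|y) = \frac{f_{Y|X}(y|x)\,f_X(x)}{f_Y(y)}$. This is just ____________________ for random variables! A similar equation holds for Y given X.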

Marginal Random Variables (cont.) Consistent results are obtained if X and Y are independent, $F_X(x) = F_{X,Y}(x,\infty) = F_X(x)\,F_Y(\infty) = F_X(x)$. Find the marginal PDFs for $f_{X,Y}(x,y) = 2$ when $0 < x < y < 1$ and $f_{X,Y}(x,y) = 0$ everywhere else. Are X and Y independent? [Figure: the region 0 < x < y < 1 in the x-y plane.]
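One way to check an answer to the exercise above is numerically. The sketch below integrates the joint PDF with a simple midpoint rule; the helper names (`integrate`, `marginal_x`, `marginal_y`) and the test point (0.25, 0.5) are assumptions made for this illustration.

```python
def joint_pdf(x, y):
    # f_{X,Y}(x, y) = 2 on the triangle 0 < x < y < 1, and 0 elsewhere
    return 2.0 if 0.0 < x < y < 1.0 else 0.0

def integrate(f, a, b, n=2000):
    # Midpoint rule on [a, b] with n equal subintervals
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def marginal_x(x):
    # "Integrate out" y
    return integrate(lambda y: joint_pdf(x, y), 0.0, 1.0)

def marginal_y(y):
    # "Integrate out" x
    return integrate(lambda x: joint_pdf(x, y), 0.0, 1.0)

x0, y0 = 0.25, 0.5
fx = marginal_x(x0)        # compare with the analytic result 2(1 - x0)
fy = marginal_y(y0)        # compare with the analytic result 2*y0
joint = joint_pdf(x0, y0)
print(fx, fy, fx * fy, joint)
```

At this point the product of the marginals differs from the joint PDF, so X and Y are not independent.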

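Conditional Random Variables (cont.) The definitions of conditional CDFs and PDFs are consistent when X and Y are statistically independent: $f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{f_X(x)\,f_Y(y)}{f_Y(y)} = f_X(x)$ and, likewise, $F_{X|Y}(x|y) = F_X(x)$.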

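Bivariate Gaussian Random Variables Let X and Y be jointly Gaussian, but not necessarily independent, random variables. The joint PDF is $f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho_{xy}^2}} \exp\left\{-\frac{1}{2(1-\rho_{xy}^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} - \frac{2\rho_{xy}(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right\}$. Note: $\rho_{xy}$ is the correlation coefficient of X and Y, and the expression requires $|\rho_{xy}| < 1$.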

Bivariate Gaussian Random Variables (cont.) The marginal distributions are $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$.

Bivariate Gaussian Random Variables (cont.) Consider the case that the Gaussian variables are uncorrelated, that is, $\rho_{xy} = 0$. The joint PDF is then $f_{X,Y}(x,y) = \frac{1}{\sigma_x\sqrt{2\pi}}e^{-(x-\mu_x)^2/(2\sigma_x^2)} \cdot \frac{1}{\sigma_y\sqrt{2\pi}}e^{-(y-\mu_y)^2/(2\sigma_y^2)} = f_X(x)\,f_Y(y)$. Thus, uncorrelated jointly Gaussian random variables are independent Gaussian random variables. This is a very important exception to the rule that uncorrelated random variables are not necessarily independent.

Correlation and Covariance The correlation between two joint random variables X and Y is defined as E{XY}. The covariance is defined as $\mathrm{cov}(X,Y) = E\{(X - \mu_x)(Y - \mu_y)\} = E\{XY\} - \mu_x\mu_y = \sigma_{xy}$, where $\mu_x$ and $\mu_y$ are the means of X and Y, respectively. X and Y are uncorrelated if and only if cov(X,Y) = 0. An equivalent condition is that X and Y are uncorrelated if and only if E{XY} = E{X}E{Y}. This is not the same as independence! Two random variables X and Y are said to be orthogonal if and only if E{XY} = 0. This is not the same as uncorrelated!

Correlation and Covariance (cont.) Independent random variables are always uncorrelated: $\mathrm{cov}(X,Y) = E\{XY\} - \mu_x\mu_y = E\{X\}E\{Y\} - \mu_x\mu_y = 0$. The reverse is generally not true. [Figure: diagram labeled "Uncorrelated RVs" and "Independent RVs".]

Correlation and Covariance (cont.) The correlation coefficient (normalized covariance) is $\rho_{xy} = \frac{\mathrm{cov}(X,Y)}{\sigma_x\sigma_y} = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$. The correlation coefficient is bounded, $-1 \le \rho_{xy} \le +1$. $\rho_{xy} = 0$ if X and Y are uncorrelated. $\rho_{xy} = 1$ means that X and Y are perfectly correlated. $\rho_{xy} = -1$ means that X and Y are perfectly anti-correlated.

Correlation and Covariance (cont.) Although covariance describes a linear relationship between variables (if it exists), it does not give an indication of non-linear relationships between variables. The distribution in the figure shows a clear relationship between the random variables X and Y, but the covariance is zero! [Figure: joint distribution of X and Y plotted in the x-y plane, with point probabilities 0.04 (×4), 0.02 (×2), 0.05 (×4), 0.20, 0.05 (×4), 0.02 (×2), 0.04 (×4).]
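A small discrete case makes the same point: zero covariance does not imply independence. The sketch below does not use the distribution from the slide's figure. It assumes a hypothetical joint PMF in which X is uniform on {-1, 0, 1} and Y = X², and the helper names are illustrative.

```python
from fractions import Fraction

# Hypothetical joint PMF: X uniform on {-1, 0, 1} and Y = X^2 (fully dependent)
pmf = {(-1, 1): Fraction(1, 3), (0, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}

def expect(g):
    """Expected value of g(X, Y) under the joint PMF."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

def marginal(index):
    """Marginal PMF of X (index 0) or Y (index 1)."""
    out = {}
    for point, p in pmf.items():
        out[point[index]] = out.get(point[index], 0) + p
    return out

px, py = marginal(0), marginal(1)
# Independence requires the joint PMF to equal the product of the marginals everywhere
independent = all(pmf.get((x, y), 0) == px[x] * py[y] for x in px for y in py)
print(cov_xy, independent)
```

Exact fractions are used so that the zero covariance is exact rather than a rounding artifact.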

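Multivariate Distributions When more than two random variables are considered, the various distributions and densities are termed multivariate. Joint CDF: $F_{X_1,X_2,\ldots,X_n}(x_1,x_2,\ldots,x_n)$. Joint PDF: $f_{X_1,X_2,\ldots,X_n}(x_1,x_2,\ldots,x_n)$. Conditional CDF: $F_{X_1,\ldots,X_k|X_{k+1},\ldots,X_n}(x_1,\ldots,x_k|x_{k+1},\ldots,x_n)$. Conditional PDF: $f_{X_1,\ldots,X_k|X_{k+1},\ldots,X_n}(x_1,\ldots,x_k|x_{k+1},\ldots,x_n) = \frac{f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)}{f_{X_{k+1},\ldots,X_n}(x_{k+1},\ldots,x_n)}$.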

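Multivariate Distributions (cont.) Marginal PDF: $f_{X_1,\ldots,X_k}(x_1,\ldots,x_k) = \int\cdots\int f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)\,dx_{k+1}\cdots dx_n$. Expectation: $E\{g(X_1,\ldots,X_n)\} = \int\cdots\int g(x_1,\ldots,x_n)\,f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)\,dx_1\cdots dx_n$. Independence: $f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = \prod_{i=1}^{n} f_{X_i}(x_i)$.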

Random Vectors Using vector notation is just as useful for random variables as it is in other engineering disciplines. Consider the random vector $\mathbf{X} = [X_1\ X_2\ \cdots\ X_N]$. Define the "vector PDF" $f_{\mathbf{X}}(\mathbf{x}) = f_{X_1,\ldots,X_N}(x_1,\ldots,x_N)$. The CDF, marginal, and conditionals are similar.

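Correlation Matrix Let $\mathbf{X}$ be a 1×N random vector and $\mathbf{Y}$ be a 1×M random vector. Then the N×M correlation matrix, $R_{XY}$, is $R_{XY} = E\{\mathbf{X}^T\mathbf{Y}\}$, whose (i, j) element is $E\{X_i Y_j\}$. $R_X = E\{\mathbf{X}^T\mathbf{X}\}$ is known as the autocorrelation matrix.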

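Covariance Matrix Let $\mathbf{X}$ be a 1×N random vector and $\mathbf{Y}$ be a 1×M random vector. Then the covariance matrix is $C_{XY} = E\{(\mathbf{X}-\boldsymbol{\mu}_X)^T(\mathbf{Y}-\boldsymbol{\mu}_Y)\}$, where $\boldsymbol{\mu}_X$ and $\boldsymbol{\mu}_Y$ are the vector means of $\mathbf{X}$ and $\mathbf{Y}$, respectively. It is often more useful or more natural to write $C_{XY} = R_{XY} - \boldsymbol{\mu}_X^T\boldsymbol{\mu}_Y$, with $R_{XY} = E\{\mathbf{X}^T\mathbf{Y}\}$.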

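Covariance Matrix (cont.) More interesting is the autocovariance matrix, $C_X = E\{(\mathbf{X}-\boldsymbol{\mu}_X)^T(\mathbf{X}-\boldsymbol{\mu}_X)\} = \begin{pmatrix}\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N}\\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{N1} & \sigma_{N2} & \cdots & \sigma_N^2\end{pmatrix}$. The autocovariance matrix is symmetric because $\sigma_{ij} = \sigma_{ji}$. It is often more useful or more natural to write $\sigma_{ij} = \rho_{ij}\sigma_i\sigma_j$, where $\rho_{ij}$ is the correlation coefficient of $X_i$ and $X_j$.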

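Covariance Matrix (cont.) Autocovariance matrix for uncorrelated random variables ($\rho_{ij} = 0$ for $i \ne j$): $C_X = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2)$. Covariance matrix for perfectly correlated random variables ($\rho_{ij} = 1$): the (i, j) element of $C_X$ is $\sigma_i\sigma_j$.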

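Covariance Matrix (cont.) Consider a random variable Y which is the weighted sum of N independent random variables $X_1, \ldots, X_N$, $Y = \sum_{i=1}^{N} a_i X_i$. The mean of Y is straightforward, $\mu_Y = \sum_{i=1}^{N} a_i\mu_i$. The variance is also straightforward, $\sigma_Y^2 = \sum_{i=1}^{N} a_i^2\sigma_i^2$.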

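Covariance Matrix (cont.) For the weighted sum $Y = \sum_{i} a_i X_i$ with weight vector $\mathbf{a} = [a_1\ \cdots\ a_N]$: if the $X_i$ are uncorrelated with different variances, then $\sigma_Y^2 = \sum_{i=1}^{N} a_i^2\sigma_i^2$. If the $X_i$ are correlated with different variances, then $\sigma_Y^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} a_i a_j\sigma_{ij} = \mathbf{a}\,C_X\,\mathbf{a}^T$.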
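The variance of a weighted sum can be sketched as a double sum over the covariance matrix. The weights and the two covariance matrices below are made-up values for illustration, and `weighted_sum_variance` is a hypothetical helper name.

```python
def weighted_sum_variance(weights, cov):
    """Variance of Y = sum_i a_i X_i, computed as sum_i sum_j a_i a_j cov[i][j]."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

a = [1.0, 2.0]                          # hypothetical weights
cov_uncorr = [[4.0, 0.0], [0.0, 9.0]]   # uncorrelated, variances 4 and 9
cov_corr = [[4.0, 3.0], [3.0, 9.0]]     # same variances, correlation coefficient 0.5

var_uncorr = weighted_sum_variance(a, cov_uncorr)
var_corr = weighted_sum_variance(a, cov_corr)
print(var_uncorr, var_corr)
```

In the uncorrelated case only the diagonal terms contribute. The off-diagonal terms add 2·a₁·a₂·σ₁₂ in the correlated case.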

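Covariance Matrix (cont.) Let $\mathbf{Y}$ and $\mathbf{b}$ be M × 1 vectors, $A$ be an M × N matrix, and $\mathbf{X}$ be an N × 1 vector. Then $\mathbf{Y} = A\mathbf{X} + \mathbf{b}$ has the statistics $\boldsymbol{\mu}_Y = A\boldsymbol{\mu}_X + \mathbf{b}$ and $C_Y = A\,C_X A^T$. Usually it is easy to generate $\mathbf{X}$ as uncorrelated random variables with unit variances ($C_X$ = identity matrix). To generate $\mathbf{Y}$ with a desired autocovariance, find the "square root" of $C_Y = AA^T$ using eigenvector decomposition: with $C_Y = E\Lambda E^T$, take $A = E\Lambda^{1/2}$.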
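The generation procedure can be sketched for the 2 × 2 case, where the eigen-decomposition has a closed form. The target matrix, the offset, the sample count, and the helper names `sqrt_factor_2x2` and `sample_y` are all assumptions for this example. A numerical library would normally handle larger matrices.

```python
import math
import random

def sqrt_factor_2x2(c):
    """Return A with A A^T = c for a symmetric positive-definite 2x2 matrix c.

    Uses the eigen-decomposition c = E diag(lam) E^T, so A = E diag(sqrt(lam)).
    """
    a, b, d = c[0][0], c[0][1], c[1][1]
    if b == 0.0:
        # Already diagonal: the eigenvectors are the coordinate axes.
        return [[math.sqrt(a), 0.0], [0.0, math.sqrt(d)]]
    mid = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    columns = []
    for lam in (mid + rad, mid - rad):
        # (lam - d, b) is an eigenvector of [[a, b], [b, d]] for eigenvalue lam.
        norm = math.hypot(lam - d, b)
        scale = math.sqrt(lam) / norm
        columns.append((scale * (lam - d), scale * b))
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]

def sample_y(a_mat, offset, rng):
    """Draw Y = A X + b with X two independent unit-variance Gaussians."""
    x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    return tuple(a_mat[i][0] * x[0] + a_mat[i][1] * x[1] + offset[i]
                 for i in range(2))

c_y = [[4.0, 3.0], [3.0, 9.0]]   # hypothetical target autocovariance
b_vec = (1.0, -2.0)              # hypothetical mean offset
a_mat = sqrt_factor_2x2(c_y)

rng = random.Random(0)
samples = [sample_y(a_mat, b_vec, rng) for _ in range(20000)]
n = len(samples)
mean = [sum(s[i] for s in samples) / n for i in range(2)]
# Unbiased sample covariance matrix
sample_cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
               for j in range(2)] for i in range(2)]
print(mean, sample_cov)
```

The sample mean and sample covariance should land near the offset and the target matrix, with the usual sampling error.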

Covariance Matrix (cont.) Covariance matrix for uncorrelated variables. Covariance matrix after rotation $\Rightarrow$ rotation for uncorrelated! How to compute a sample correlation / covariance.

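Gaussian Vector Let the elements of the 1×N random vector $\mathbf{X}$ be jointly Gaussian. The PDF in vector notation is $f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{N/2}\,|C_X|^{1/2}}\exp\left\{-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_X)\,C_X^{-1}(\mathbf{x}-\boldsymbol{\mu}_X)^T\right\}$, where $|C_X|$ is the determinant of $C_X$, $\boldsymbol{\mu}_X$ is the mean, and $C_X$ is the autocovariance matrix of $\mathbf{X}$. If the elements are independent / uncorrelated (equivalent for Gaussian only!) the inverse is trivial, $C_X^{-1} = \mathrm{diag}(1/\sigma_1^2, \ldots, 1/\sigma_N^2)$.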

Transformation of RVs Use the inverse transformation to show that the PDF of Z = X + Y is a convolution. Use the inverse transformation to show how to use a uniform RV for generating other RVs. See 232 in Papoulis book.

Transformations of Random Variables Many of the continuous random variables from the previous session were defined as non-linear functions of other random variables, e.g., a chi-square random variable is the result of squaring a zero-mean Gaussian random variable. Here is how NOT to transform a random variable. Let X ~ exponential with mean $\lambda$, i.e., $f_X(x) = \frac{1}{\lambda}e^{-x/\lambda}$ for $x \ge 0$. Define $Y = \sqrt{X}$ and substitute $X = Y^2$ into $f_X(x)$: $f_Y(y) = \frac{1}{\lambda}e^{-y^2/\lambda}$. But Y should be Rayleigh, $f_Y(y) = \frac{2y}{\lambda}e^{-y^2/\lambda}$!

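Transformations of Random Variables (cont.) The reason the "obvious" procedure failed is that the PDF has no meaning outside of an integral! The correct procedure is to transform the CDF and then compute its derivative to get the transformed PDF. Let $Y = g(X) \Leftrightarrow X = g^{-1}(Y)$ be a one-to-one "mapping"; then $f_Y(y) = f_X\!\left(g^{-1}(y)\right)\left|\frac{d\,g^{-1}(y)}{dy}\right|$. For X ~ exponential with mean $\lambda$ and $X = Y^2$, $f_Y(y) = \frac{1}{\lambda}e^{-y^2/\lambda}\cdot 2y = \frac{2y}{\lambda}e^{-y^2/\lambda}$ for $y \ge 0$. The scaling factor $\lambda$ looks different from the Rayleigh PDF $\frac{y}{\beta}e^{-y^2/(2\beta)}$; the two agree with $\beta = \lambda/2$.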
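A quick numerical sketch of why the derivative term matters is shown below. It assumes an exponential X with mean `lam = 2.0` (an illustrative value) and Y = √X. The midpoint-rule `integrate` helper and the sample size are likewise choices made for this example.

```python
import math
import random

lam = 2.0  # assumed mean of the exponential X (illustrative value)

def naive_pdf(y):
    # "Obvious" but wrong: substitute x = y^2 into f_X without the derivative term
    return math.exp(-y * y / lam) / lam

def correct_pdf(y):
    # f_Y(y) = f_X(y^2) * |d(y^2)/dy| = (2y/lam) exp(-y^2/lam)
    return naive_pdf(y) * 2.0 * y

def integrate(f, a, b, n=20000):
    # Midpoint rule on [a, b]
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

naive_area = integrate(naive_pdf, 0.0, 20.0)      # not 1 in general
correct_area = integrate(correct_pdf, 0.0, 20.0)  # should be close to 1

# Monte Carlo check of the transformed CDF, F_Y(y) = 1 - exp(-y^2/lam)
rng = random.Random(0)
ys = [math.sqrt(rng.expovariate(1.0 / lam)) for _ in range(50000)]
y0 = 1.0
empirical_cdf = sum(y <= y0 for y in ys) / len(ys)
analytic_cdf = 1.0 - math.exp(-y0 * y0 / lam)
print(naive_area, correct_area, empirical_cdf, analytic_cdf)
```

Note that `random.expovariate` takes the rate, so the mean `lam` enters as `1.0 / lam`.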

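Transformations of Random Variables (cont.) Why is a one-to-one mapping important? Let $X \sim N(0, \sigma_x^2)$ and apply the transformation $Y = X^2$ (Y should be chi-square): $f_Y(y) = f_X(\sqrt{y})\left|\frac{d\sqrt{y}}{dy}\right| = \frac{1}{2\sigma_x\sqrt{2\pi y}}\,e^{-y/(2\sigma_x^2)}$ for $y > 0$. The above "PDF" does not integrate to 1! Instead, it integrates to ½. What went wrong? Two points of X map to each value of Y.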

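Transformations of Random Variables (cont.) In general, a mapping of X to Y with a function Y = g(X) must be analyzed by dividing g(X) into N monotonic regions (roots) and then summing the PDF contributions from each region, $f_Y(y) = \sum_{i=1}^{N} f_X\!\left(g_i^{-1}(y)\right)\left|\frac{d\,g_i^{-1}(y)}{dy}\right|$. The transformation $Y = X^2$ has two monotonic regions, $X < 0$ and $X \ge 0$ (the equality belongs on the right), so for $X \sim N(0, \sigma_x^2)$ the result is $f_Y(y) = \frac{1}{\sigma_x\sqrt{2\pi y}}\,e^{-y/(2\sigma_x^2)}$ for $y > 0$.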
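The two-region sum for Y = X² can be checked against the CDF directly, since P(Y ≤ y) = P(|X| ≤ √y). The sketch below assumes a zero-mean Gaussian X with `sigma = 1.3` (an illustrative value) and compares a finite-difference slope of the CDF with the one-branch and two-branch PDFs. The function names are made up for the example.

```python
import math

sigma = 1.3  # assumed standard deviation of X (illustrative value)

def gauss_pdf(x):
    return math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def one_branch_pdf(y):
    # Uses only the root x = +sqrt(y); this misses half of the probability
    root = math.sqrt(y)
    return gauss_pdf(root) / (2.0 * root)

def two_branch_pdf(y):
    # Sums the contributions of both roots, x = +sqrt(y) and x = -sqrt(y)
    root = math.sqrt(y)
    return (gauss_pdf(root) + gauss_pdf(-root)) / (2.0 * root)

def cdf_y(y):
    # P(X^2 <= y) = P(|X| <= sqrt(y)) for a zero-mean Gaussian X
    return math.erf(math.sqrt(y) / (sigma * math.sqrt(2.0)))

y0, h = 1.5, 1e-5
slope = (cdf_y(y0 + h) - cdf_y(y0 - h)) / (2.0 * h)  # central difference of the CDF
print(slope, two_branch_pdf(y0), one_branch_pdf(y0))
```

The slope of the CDF matches the two-branch PDF, while the one-branch expression is exactly half of it.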

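Transformations of Bivariate Random Variables The process is identical to that for a random variable except that the derivative operation is replaced with the Jacobian. Let $Y_1 = g_1(X_1, X_2)$ and $Y_2 = g_2(X_1, X_2)$ be a one-to-one mapping. The joint PDF $f_{Y_1,Y_2}(y_1,y_2)$ is found with $f_{Y_1,Y_2}(y_1,y_2) = f_{X_1,X_2}(x_1,x_2)\,|J|$, evaluated at $x_1 = g_1^{-1}(y_1,y_2)$ and $x_2 = g_2^{-1}(y_1,y_2)$, where $J = \det\begin{pmatrix}\partial x_1/\partial y_1 & \partial x_1/\partial y_2\\ \partial x_2/\partial y_1 & \partial x_2/\partial y_2\end{pmatrix}$.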

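Transformations of Bivariate Random Variables (cont.) Example: Let $X_1$ and $X_2$ be zero-mean, independent Gaussian random variables with equal variances $\sigma_x^2$. Compute the PDF $f_{R,\Theta}(r,\theta)$ of the polar transform $R = \sqrt{X_1^2 + X_2^2}$, $\Theta = \tan^{-1}(X_2/X_1)$, i.e., $X_1 = R\cos\Theta$ and $X_2 = R\sin\Theta$. First, note that this transform is one-to-one. Second, the PDF $f_{X_1,X_2}(x_1,x_2)$ is $\frac{1}{2\pi\sigma_x^2}\,e^{-(x_1^2+x_2^2)/(2\sigma_x^2)}$. Third, the Jacobian is $J = \det\begin{pmatrix}\cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta\end{pmatrix} = r$.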

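Transformations of Bivariate Random Variables (cont.) Substituting, $f_{R,\Theta}(r,\theta) = \frac{r}{2\pi\sigma_x^2}\,e^{-r^2/(2\sigma_x^2)} = \left[\frac{r}{\sigma_x^2}\,e^{-r^2/(2\sigma_x^2)}\right]\cdot\left[\frac{1}{2\pi}\right]$. Thus R ~ Rayleigh with parameter $\beta = \sigma_x^2$ and $\Theta$ ~ uniform $[0, 2\pi]$. Moreover, R and $\Theta$ are statistically independent.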

Transformations of Bivariate Random Variables (cont.) The process of transformation of random variables has several important and useful results. A random variable U ~ uniform [0,1] can be transformed to any other PDF $f_X(x)$ with the transform $X = F_X^{-1}(U)$. Exponential with mean $\lambda$: $X = -\lambda\ln(1 - U)$. Rayleigh with CDF $1 - e^{-r^2/(2\beta)}$: $R = \sqrt{-2\beta\ln(1 - U)}$. The only limitation is being able to invert $F_X(x)$. A pair of independent, zero-mean, unit-variance Gaussian random variables can be generated from $X_1 = R\cos(\Theta)$ and $X_2 = R\sin(\Theta)$, where R is Rayleigh ($\beta = 1$) and $\Theta$ is uniform $[0, 2\pi]$.
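The inverse-CDF recipe and the polar construction of a Gaussian pair can be sketched as follows. The parameter names `lam` (exponential mean) and `beta` (Rayleigh parameter, with CDF 1 − exp(−r²/(2β))) are assumptions for this illustration, as are the function names and the sample size.

```python
import math
import random

def exponential_from_uniform(u, lam):
    # Inverse of F(x) = 1 - exp(-x/lam)
    return -lam * math.log(1.0 - u)

def rayleigh_from_uniform(u, beta):
    # Inverse of F(r) = 1 - exp(-r^2 / (2*beta))
    return math.sqrt(-2.0 * beta * math.log(1.0 - u))

def gaussian_pair(rng):
    # R Rayleigh (beta = 1) and Theta uniform on [0, 2*pi), mapped to Cartesian
    r = rayleigh_from_uniform(rng.random(), 1.0)
    theta = 2.0 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

rng = random.Random(1)
pairs = [gaussian_pair(rng) for _ in range(50000)]
n = len(pairs)
mean1 = sum(p[0] for p in pairs) / n
mean2 = sum(p[1] for p in pairs) / n
var1 = sum((p[0] - mean1) ** 2 for p in pairs) / (n - 1)
var2 = sum((p[1] - mean2) ** 2 for p in pairs) / (n - 1)
cov12 = sum((p[0] - mean1) * (p[1] - mean2) for p in pairs) / (n - 1)
print(mean1, mean2, var1, var2, cov12)
```

`rng.random()` returns values in [0, 1), so `1.0 - u` stays strictly positive and the logarithm is always defined.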

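Transformations of Bivariate Random Variables (cont.) Let $X_1$ and $X_2$ be independent random variables and define $Y = X_1 + X_2$ and $W = X_1$. The transformation is one-to-one, with inverse $X_1 = W$, $X_2 = Y - W$. The Jacobian is $J = \det\begin{pmatrix}0 & 1\\ 1 & -1\end{pmatrix} = -1$, so $|J| = 1$. Thus $f_{Y,W}(y,w) = f_{X_1}(w)\,f_{X_2}(y-w)$. Integrating over w gives $f_Y(y) = \int_{-\infty}^{\infty} f_{X_1}(w)\,f_{X_2}(y-w)\,dw$, the convolution of the two PDFs.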
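The convolution result can be tried numerically. The sketch below assumes both X1 and X2 are uniform on [0, 1], a choice made only for this example, in which case the sum has the well-known triangular PDF on [0, 2]. The helper name `pdf_of_sum` is hypothetical.

```python
def uniform_pdf(x):
    # PDF of a uniform random variable on [0, 1]
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def pdf_of_sum(y, n=1000):
    # f_Y(y) = integral of f_X1(w) * f_X2(y - w) dw, by the midpoint rule
    # over [0, 1], the support of f_X1
    h = 1.0 / n
    return h * sum(uniform_pdf((k + 0.5) * h) * uniform_pdf(y - (k + 0.5) * h)
                   for k in range(n))

values = {y: pdf_of_sum(y) for y in (0.5, 1.0, 1.5)}
print(values)
```

The values rise linearly to a peak at y = 1 and fall off symmetrically, as expected for the triangular density.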

Transformations of Bivariate Random Variables (cont.) Let $X_1$ and $X_2$ be random variables and define $Y = X_1 / X_2$ and $W = X_2$. The transformation is one-to-one, with inverse $X_1 = YW$, $X_2 = W$. The Jacobian is $J = \det\begin{pmatrix}w & y\\ 0 & 1\end{pmatrix} = w$. Thus $f_{Y,W}(y,w) = |w|\,f_{X_1,X_2}(yw,w)$ and $f_Y(y) = \int_{-\infty}^{\infty} |w|\,f_{X_1,X_2}(yw,w)\,dw$.

EMIS7300 Fall 2005 Copyright © Dr. John Lipp

Homework Mandatory (answers in the back of the book):


