The Multivariate Normal Distribution, Part I BMTRY 726 5/17/2018
Univariate Normal
Recall the density function of the univariate normal:
f(x) = (1/(σ√(2π))) exp[ -(x-μ)² / (2σ²) ], -∞ < x < ∞
We can rewrite this as
f(x) = (1/(σ√(2π))) exp[ -(1/2) ((x-μ)/σ)² ]
Multivariate Normal Distribution
We denote the MVN distribution as X ~ N_p(μ, Σ).
What is the density function of X?
f(x) = (2π)^{-p/2} |Σ|^{-1/2} exp[ -(1/2) (x-μ)'Σ⁻¹(x-μ) ]
Multivariate Normal
Note, the density does not exist if Σ is not positive definite: in that case |Σ| = 0 and Σ⁻¹ does not exist.
We will assume that Σ is positive definite for most of the MVN methods we discuss.
Multivariate Density Function
If we assume that Σ is positive definite, then
(x-μ)'Σ⁻¹(x-μ)
is the square of the generalized distance from x to μ. It is also called the:
Squared statistical distance of x to μ
Squared Mahalanobis distance of x to μ
Squared standardized distance of x to μ
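As a quick numerical sketch (in Python with NumPy; the μ, Σ, and x values below are made up for illustration and are not from the slides), the squared Mahalanobis distance is just the quadratic form above:

```python
import numpy as np

# Illustrative (hypothetical) parameters, not from the slides
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([2.0, 1.0])

# Squared Mahalanobis distance: (x - mu)' Sigma^{-1} (x - mu)
diff = x - mu
d2 = diff @ np.linalg.solve(Sigma, diff)   # solve avoids forming Sigma^{-1}
print(round(d2, 4))                        # 2.2857 (= 16/7 for these values)
```

Using `np.linalg.solve` rather than explicitly inverting Σ is the standard numerically stable choice.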
Why Multivariate Normal
The MVN distribution makes a good choice in statistics for several reasons:
Mathematical simplicity
Multivariate central limit theorem
Many naturally occurring phenomena approximately exhibit this distribution
Bivariate Normal Example
Consider samples from X ~ N_2(μ, Σ).
Let's write out the joint distribution of x1 and x2.
Bivariate Normal Example
Joint distribution of x1 and x2: write
Σ = [ σ1²     ρσ1σ2 ]
    [ ρσ1σ2   σ2²   ]
so that |Σ| = σ1²σ2²(1-ρ²) and
Σ⁻¹ = 1/(σ1²σ2²(1-ρ²)) [ σ2²     -ρσ1σ2 ]
                        [ -ρσ1σ2  σ1²    ]
Bivariate Normal Example
This yields the joint distribution of x1 and x2 in the form
f(x1, x2) = [1/(2πσ1σ2√(1-ρ²))] exp{ -1/(2(1-ρ²)) [ ((x1-μ1)/σ1)² - 2ρ((x1-μ1)/σ1)((x2-μ2)/σ2) + ((x2-μ2)/σ2)² ] }
Bivariate Normal Example
The density is a function of μ1, μ2, σ1, σ2, and ρ.
The density is well defined if -1 < ρ < 1.
If ρ = 0, the joint density factors into the product of the two marginal densities, so X1 and X2 are independent.
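A small numerical check of the ρ = 0 factorization (sketched in Python; the evaluation point and parameter values are illustrative, not from the slides):

```python
import numpy as np

# Bivariate normal density written directly from the formula on the slide
def bvn_pdf(x1, x2, m1, m2, s1, s2, rho):
    z = ((x1-m1)/s1)**2 - 2*rho*((x1-m1)/s1)*((x2-m2)/s2) + ((x2-m2)/s2)**2
    c = 1.0 / (2*np.pi*s1*s2*np.sqrt(1 - rho**2))
    return c * np.exp(-z / (2*(1 - rho**2)))

# Univariate normal density
def norm_pdf(x, m, s):
    return np.exp(-((x-m)/s)**2 / 2) / (s*np.sqrt(2*np.pi))

# With rho = 0, the joint density equals the product of the marginals
f_joint = bvn_pdf(0.3, -1.2, m1=0, m2=1, s1=1.0, s2=2.0, rho=0.0)
f_prod  = norm_pdf(0.3, 0, 1.0) * norm_pdf(-1.2, 1, 2.0)
print(np.isclose(f_joint, f_prod))   # True: the joint density factors
```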
Examples of Bivariate Normal
R Code to Play Around with MVN
If you want to vary parameters and see what changes…
> library(MASS)
> S <- matrix(c(1, 0.5, 0.5, 1), 2)
> bivn <- mvrnorm(100000, mu = c(0, 0), Sigma = S)
>
> # now we do a kernel density estimate
> bivn.kde <- kde2d(bivn[,1], bivn[,2], n = 50)
> persp(bivn.kde, phi = 45, theta = 30, xlab = "X1", ylab = "X2", zlab = "phi")
> # alternatively, use a "fancy" perspective
> persp(bivn.kde, phi = 45, theta = 30, shade = 0.2, border = NA)
Contours of constant density
What if we take a slice of this bivariate distribution at a constant height? i.e., the set of points x for which f(x) equals some constant.
Contours of constant density
The density is constant for all points x for which
(x-μ)'Σ⁻¹(x-μ) = c²
This is an equation for an ellipse centered at μ.
Contours of constant density What happens when
Bivariate Normal Example Let’s look at an example of the bivariate normal when we vary some of the parameters…
Examples
[Figure: four contour plots of bivariate normal densities, each with axes X1 (horizontal) and X2 (vertical)]
Contours of constant density What about when
Contours of constant density
How do we find the axes of the ellipse?
Axes are in the direction of the eigenvectors of Σ⁻¹
Axis lengths are proportional to the reciprocals of the square roots of the eigenvalues of Σ⁻¹
We can get these from Σ (and avoid calculating Σ⁻¹): Σ⁻¹ has the same eigenvectors as Σ, and its eigenvalues are the reciprocals 1/λ_i.
Let's look at this for the bivariate case... We must find the eigenvalues and eigenvectors of Σ.
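The slides derive the bivariate case symbolically; as a numerical companion (Python/NumPy, with an illustrative Σ not taken from the slides), the axes fall out of the eigen-decomposition of Σ:

```python
import numpy as np

# Illustrative covariance matrix (hypothetical values)
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

lam, E = np.linalg.eigh(Sigma)    # eigh returns eigenvalues in ascending order
lam = lam[::-1]                   # reorder so lambda1 >= lambda2
E = E[:, ::-1]                    # columns are the matching eigenvectors

print(lam)                        # [3. 1.] for this Sigma
# The ellipse {x : x' Sigma^{-1} x = c^2} has axes along the columns of E,
# with half-lengths c*sqrt(lambda_i).
```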
Eigenvalues of Σ: solve the characteristic equation |Σ - λI| = 0. For the bivariate case this is
(σ11 - λ)(σ22 - λ) - σ12² = 0
which gives
λ1, λ2 = [ (σ11 + σ22) ± √((σ11 - σ22)² + 4σ12²) ] / 2
The corresponding eigenvector, e1, of Σ: solve
Σe1 = λ1e1
subject to the normalization e1'e1 = 1.
Similarly we can find e2, which corresponds to λ2.
The axes of the contours of constant density will have half-lengths c√λ1 and c√λ2.
If we let Σe_i = λ_ie_i for i = 1, 2, then λ1 ≥ λ2 > 0 are the eigenvalues of Σ and e1 and e2 are the corresponding eigenvectors.
The ratio of the lengths of the axes is √λ1 : √λ2.
The actual lengths depend on the contour being considered. For the (1-α)×100% contour, the half-lengths are given by √(χ²_p(α)) √λ_i.
Thus the solid ellipsoid of x values satisfying
(x-μ)'Σ⁻¹(x-μ) ≤ χ²_p(α)
has probability 1-α.
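A Monte Carlo sanity check of this probability statement (Python sketch with illustrative μ and Σ; χ²_p(α) denotes the upper-α chi-square quantile, as on the slide):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

# Draw many observations from N_2(mu, Sigma)
X = rng.multivariate_normal(mu, Sigma, size=200_000)

# Squared Mahalanobis distance of each draw from mu
D = X - mu
d2 = np.einsum('ij,ij->i', D, np.linalg.solve(Sigma, D.T).T)

# Fraction of draws inside the 95% ellipsoid should be close to 0.95
cutoff = stats.chi2.ppf(0.95, df=2)    # chi-square upper 0.05 quantile, p = 2
coverage = np.mean(d2 <= cutoff)
print(round(coverage, 3))              # close to 0.95
```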
Univariate case: the length of the interval containing the central 95% of the population is proportional to σ.
Bivariate case: the area of the region containing 95% of the population is proportional to |Σ|^{1/2}.
We can call this "smallest" region the central (1-α)×100% of the multivariate normal population.
The "area" of this smallest ellipse in the 2-D case is:
π χ²_2(α) |Σ|^{1/2}
This extends to higher dimensions (think volume). Consider X ~ N_p(μ, Σ).
The smallest region for which there is probability 1-α that a randomly selected observation falls in the region is a p-dimensional ellipsoid centered at μ with volume
[ 2π^{p/2} / (p Γ(p/2)) ] [χ²_p(α)]^{p/2} |Σ|^{1/2}
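As a quick check that the general p-dimensional volume formula specializes correctly, the sketch below (Python, illustrative Σ) evaluates it at p = 2 and compares against the 2-D area π χ²_2(α) |Σ|^{1/2}:

```python
import numpy as np
from scipy import stats, special

def ellipsoid_volume(Sigma, alpha):
    """Volume of the (1-alpha) probability ellipsoid for N_p(mu, Sigma)."""
    p = Sigma.shape[0]
    c2 = stats.chi2.ppf(1 - alpha, df=p)           # chi-square upper-alpha quantile
    return (2 * np.pi**(p/2) / (p * special.gamma(p/2))) \
           * c2**(p/2) * np.sqrt(np.linalg.det(Sigma))

Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
alpha = 0.05
v = ellipsoid_volume(Sigma, alpha)

# For p = 2 the general formula reduces to pi * chi2_2(alpha) * |Sigma|^(1/2)
area_2d = np.pi * stats.chi2.ppf(0.95, 2) * np.sqrt(np.linalg.det(Sigma))
print(np.isclose(v, area_2d))    # True
```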
Visual of the 3-dimensional case
Why Multivariate Normal
Recall, statisticians like the MVN distribution because…
It is mathematically simple
The multivariate central limit theorem applies
Natural phenomena are often well approximated by a MVN distribution
So what are some "fun" mathematical properties that make it so nice?
Properties of MVN
Result 4.2: If X ~ N_p(μ, Σ), then any linear combination a'X = a1X1 + a2X2 + ... + apXp has a univariate normal distribution with mean a'μ and variance a'Σa.
Example
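A worked numerical sketch of Result 4.2 (Python; the μ, Σ, and a below are hypothetical values chosen for illustration, not the example from the slides):

```python
import numpy as np

# Illustrative parameters for X ~ N_3(mu, Sigma) and coefficient vector a
mu = np.array([1.0, 0.0, 2.0])
Sigma = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
a = np.array([1.0, -1.0, 2.0])

# a'X ~ N(a'mu, a'Sigma a)
mean = a @ mu           # 1*1 + (-1)*0 + 2*2 = 5
var = a @ Sigma @ a     # quadratic form = 9 for these values
print(mean, var)        # 5.0 9.0
```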
Properties of MVN
Result 4.3: Any linear transformation of a multivariate normal random vector has a (multivariate) normal distribution.
So if X ~ N_p(μ, Σ) and B is a k × p matrix of constants, then
BX ~ N_k(Bμ, BΣB')
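A simulation sketch of Result 4.3 (Python; B, μ, and Σ are illustrative values, not from the slides): the sample mean and covariance of BX should match Bμ and BΣB':

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
B = np.array([[1.0, 1.0,  0.0],
              [0.0, 1.0, -1.0]])     # k = 2, p = 3

X = rng.multivariate_normal(mu, Sigma, size=300_000)
Y = X @ B.T                          # each row is B x_i

# Empirical moments of BX agree with the theoretical ones
print(np.allclose(Y.mean(axis=0), B @ mu, atol=0.02))          # True
print(np.allclose(np.cov(Y.T), B @ Sigma @ B.T, atol=0.05))    # True
```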
Result 4.3
In the development of univariate statistics, we often transform our observed r.v. to the canonical form:
Z = (X - μ)/σ ~ N(0, 1)
Result 4.3 is of particular interest because we can use it to derive the canonical form of the multivariate normal, Z = Σ^{-1/2}(X - μ) ~ N_p(0, I).
But first, we need to consider a specific matrix decomposition.
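A simulation sketch of the multivariate canonical form (Python; μ and Σ are illustrative): applying the symmetric inverse square root Σ^{-1/2} to centered draws should yield approximately identity covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([3.0, -1.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Symmetric inverse square root via the spectral decomposition of Sigma
lam, E = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = E @ np.diag(1.0/np.sqrt(lam)) @ E.T

X = rng.multivariate_normal(mu, Sigma, size=200_000)
Z = (X - mu) @ Sigma_inv_sqrt.T      # Sigma_inv_sqrt is symmetric

# Z should be approximately N_2(0, I)
print(np.allclose(Z.mean(axis=0), np.zeros(2), atol=0.02))   # True
print(np.allclose(np.cov(Z.T), np.eye(2), atol=0.02))        # True
```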
Spectral Decomposition
Given Σ is a non-negative definite, symmetric, real matrix, then Σ can be decomposed according to:
Σ = Σ_{i=1}^{p} λ_i e_i e_i'
where the eigenvalues are λ1 ≥ λ2 ≥ ... ≥ λp ≥ 0,
the eigenvectors of Σ are e1, e2, ..., ep,
and these satisfy the expression Σe_i = λ_i e_i, with e_i'e_i = 1 and e_i'e_j = 0 for i ≠ j.
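A numerical sketch of the spectral decomposition (Python; the Σ below is illustrative): summing λ_i e_i e_i' over the eigenpairs reproduces Σ exactly.

```python
import numpy as np

# Illustrative symmetric positive definite matrix
Sigma = np.array([[3.0, 1.0, 0.5],
                  [1.0, 2.0, 0.0],
                  [0.5, 0.0, 1.0]])

lam, E = np.linalg.eigh(Sigma)    # Sigma e_i = lambda_i e_i; columns of E are e_i

# Rebuild Sigma as sum_i lambda_i * e_i e_i'
recon = sum(lam[i] * np.outer(E[:, i], E[:, i]) for i in range(3))

print(np.allclose(recon, Sigma))        # True: decomposition reproduces Sigma
print(np.allclose(E.T @ E, np.eye(3)))  # True: eigenvectors are orthonormal
```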
Marginal Distributions
Result 4.4: Consider subsets of the Xi's in X. These subsets are also distributed (multivariate) normal.
If X = (X1', X2')' ~ N_p(μ, Σ), with μ = (μ1', μ2')' and Σ = [Σ11 Σ12; Σ21 Σ22],
then the marginal distributions of X1 and X2 are
X1 ~ N_q(μ1, Σ11) and X2 ~ N_{p-q}(μ2, Σ22)
Example
Consider X ~ N_3(μ, Σ); find the marginal distribution of the 1st and 3rd components.
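A numerical sketch of extracting such a marginal (Python; μ and Σ below are illustrative values, not the slides' example): by Result 4.4, the marginal of (X1, X3) just picks out the corresponding sub-vector of μ and sub-matrix of Σ.

```python
import numpy as np

# Illustrative parameters for X ~ N_3(mu, Sigma)
mu = np.array([1.0, 2.0, 0.0])
Sigma = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 0.0],
                  [2.0, 0.0, 5.0]])

idx = [0, 2]                         # 1st and 3rd components
mu_13 = mu[idx]
Sigma_13 = Sigma[np.ix_(idx, idx)]   # rows and columns 1 and 3

print(mu_13)        # [1. 0.]
print(Sigma_13)     # [[4. 2.]
                    #  [2. 5.]]
# So (X1, X3)' ~ N_2(mu_13, Sigma_13)
```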
Marginal Distributions cont'd
The converse of Result 4.4 is not always true; an additional assumption is needed.
Result 4.5(c): If X1 ~ N_q(μ1, Σ11) and X2 ~ N_{p-q}(μ2, Σ22), and X1 is independent of X2, then
(X1', X2')' ~ N_p( (μ1', μ2')', [Σ11 0; 0 Σ22] )
Result 4.5(a): If X1 (q×1) and X2 ((p-q)×1) are independent, then Cov(X1, X2) = 0.
(b) If (X1', X2')' ~ N( (μ1', μ2')', [Σ11 Σ12; Σ21 Σ22] ), then X1 (q×1) and X2 ((p-q)×1) are independent iff Σ12 = 0.
Example
Next Time More properties of the MVN and sampling from MVN…