The Multivariate Normal Distribution, Part I

BMTRY 726 5/17/2018

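Univariate Normal
Recall the density function of the univariate normal:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \quad -\infty < x < \infty$$
We can rewrite this as
$$f(x) = (2\pi)^{-1/2} (\sigma^2)^{-1/2} \exp\left\{-\tfrac{1}{2}(x-\mu)(\sigma^2)^{-1}(x-\mu)\right\}$$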
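As a quick numerical check of the rewritten univariate form, the sketch below compares it with R's built-in `dnorm`. The function name `dnorm_rewritten` and the test values are illustrative assumptions, not part of the original slides.

```r
# Univariate normal density in the "matrix-style" form:
# (2*pi)^(-1/2) * (sigma^2)^(-1/2) * exp(-(x - mu) * (sigma^2)^(-1) * (x - mu) / 2)
# dnorm_rewritten is a hypothetical helper name used only for this sketch.
dnorm_rewritten <- function(x, mu, sigma2) {
  (2 * pi)^(-1 / 2) * sigma2^(-1 / 2) *
    exp(-0.5 * (x - mu) * (1 / sigma2) * (x - mu))
}

x <- c(-1, 0, 2.5)                       # arbitrary evaluation points
rewritten <- dnorm_rewritten(x, mu = 1, sigma2 = 4)
builtin   <- dnorm(x, mean = 1, sd = 2)  # sd = sqrt(sigma2)
all.equal(rewritten, builtin)
```

Writing the exponent as a product with $(\sigma^2)^{-1}$ in the middle is what generalizes to a quadratic form when $x$ becomes a vector.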

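Multivariate Normal Distribution
We denote the MVN distribution as $X \sim N_p(\mu, \Sigma)$, where $X$ is a $p \times 1$ random vector with mean vector $\mu$ and covariance matrix $\Sigma$.
What is the density function of X?
$$f(x) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left\{-\tfrac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)\right\}$$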
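The standard $p$-variate normal density, $(2\pi)^{-p/2}|\Sigma|^{-1/2}\exp\{-\tfrac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)\}$, can be evaluated directly in base R. The following is a minimal sketch, assuming $\Sigma$ is positive definite; `dmvn_sketch` is a hypothetical name chosen for illustration and is not a function from any package.

```r
# Evaluate the p-variate normal density at a single point x.
# Assumes Sigma is symmetric positive definite (so det > 0 and solve() works).
dmvn_sketch <- function(x, mu, Sigma) {
  p <- length(mu)
  d <- x - mu
  q <- drop(t(d) %*% solve(Sigma, d))    # (x - mu)' Sigma^{-1} (x - mu)
  (2 * pi)^(-p / 2) * det(Sigma)^(-1 / 2) * exp(-q / 2)
}

Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)    # same covariance as the MASS example
at_mean <- dmvn_sketch(c(0, 0), mu = c(0, 0), Sigma = Sigma)
at_mean                                  # height of the density at its peak
```

Using `solve(Sigma, d)` rather than `solve(Sigma) %*% d` avoids forming the inverse explicitly, which is usually the more stable choice.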

Multivariate Normal
Note, the density does not exist if $\Sigma$ is not positive definite: in that case $|\Sigma| = 0$ and $\Sigma^{-1}$ does not exist.
We will assume that $\Sigma$ is positive definite for most of the MVN methods we discuss.

Multivariate Density Function
If we assume that $\Sigma$ is positive definite, $(x-\mu)'\Sigma^{-1}(x-\mu)$ is the square of the generalized distance from $x$ to $\mu$. Also called:
- Squared statistical distance of $x$ to $\mu$
- Squared Mahalanobis distance of $x$ to $\mu$
- Squared standardized distance of $x$ to $\mu$
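The squared generalized distance can be computed by hand or with base R's `mahalanobis()`, which returns the squared distance. The numbers below are illustrative assumptions, not values from the slides.

```r
mu    <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)
x     <- c(1, -1)

# Direct evaluation of (x - mu)' Sigma^{-1} (x - mu)
d2_formula <- drop(t(x - mu) %*% solve(Sigma) %*% (x - mu))

# Base R equivalent (stats::mahalanobis returns the SQUARED distance)
d2_builtin <- mahalanobis(x, center = mu, cov = Sigma)

# Squared Euclidean distance, for comparison
d2_euclid <- sum((x - mu)^2)

c(formula = d2_formula, builtin = d2_builtin, euclidean = d2_euclid)
```

Here the point lies against the direction of positive correlation, so its statistical distance is larger than its Euclidean distance.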

Why Multivariate Normal
The MVN distribution makes a good choice in statistics for several reasons:
- Mathematical simplicity
- Multivariate central limit theorem
- Many naturally occurring phenomena approximately exhibit this distribution

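Bivariate Normal Example
Consider samples from $X \sim N_2(\mu, \Sigma)$ with
$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}$$
Let's write out the joint distribution of $x_1$ and $x_2$.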

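Joint distribution of $x_1$ and $x_2$
With variances $\sigma_1^2$, $\sigma_2^2$ and correlation $\rho$, we have
$$|\Sigma| = \sigma_1^2\sigma_2^2(1-\rho^2), \qquad \Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}$$

This yields the joint distribution of $x_1$ and $x_2$ in the form
$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\}$$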

The density is a function of $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$, and $\rho$. The density is well defined if $-1 < \rho < 1$. If $\rho = 0$, then …
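The bivariate density in terms of the two means, two standard deviations, and the correlation can be coded directly, which makes it easy to experiment with the correlation. This is a sketch under the standard parameterization; `dbvn` and its argument names are hypothetical, chosen for illustration.

```r
# Bivariate normal density in terms of mu1, mu2, sigma1 (s1), sigma2 (s2), rho.
# dbvn is a hypothetical helper name for this sketch.
dbvn <- function(x1, x2, mu1, mu2, s1, s2, rho) {
  stopifnot(abs(rho) < 1, s1 > 0, s2 > 0)   # density is only defined for |rho| < 1
  z1 <- (x1 - mu1) / s1
  z2 <- (x2 - mu2) / s2
  exp(-(z1^2 - 2 * rho * z1 * z2 + z2^2) / (2 * (1 - rho^2))) /
    (2 * pi * s1 * s2 * sqrt(1 - rho^2))
}

# One thing that can be checked numerically: with rho = 0 the joint density
# equals the product of the two univariate normal densities.
joint_rho0 <- dbvn(0.3, -1, mu1 = 0, mu2 = 1, s1 = 1, s2 = 2, rho = 0)
product    <- dnorm(0.3, 0, 1) * dnorm(-1, 1, 2)
all.equal(joint_rho0, product)
```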

Examples of Bivariate Normal

R Code to Play Around with MVN
If you want to vary parameters and see what changes…

    library(MASS)
    S <- matrix(c(1, 0.5, 0.5, 1), 2)
    bivn <- mvrnorm(100000, mu = c(0, 0), Sigma = S)

    # now we do a kernel density estimate
    bivn.kde <- kde2d(bivn[,1], bivn[,2], n = 50)
    persp(bivn.kde, phi = 45, theta = 30, xlab = "X1", ylab = "X2", zlab = "phi")

    # alternatively, use a "fancy" perspective
    persp(bivn.kde, phi = 45, theta = 30, shade = .2, border = NA)

Contours of constant density
What if we take a slice of this bivariate distribution at a constant height? i.e. $f(x) = k$ for some constant $k > 0$.

The density is constant for all points for which $(x-\mu)'\Sigma^{-1}(x-\mu) = c^2$. This is an equation for an ellipse centered at $\mu$.
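To see that a constant-density contour really is the set of points at a fixed squared statistical distance from the mean, one can generate points on the ellipse and check the distance. The sketch below maps the unit circle through a Cholesky factor of the covariance matrix; the mean, covariance, and the constant `c_const` are illustrative assumptions.

```r
mu      <- c(1, 2)
Sigma   <- matrix(c(2, 0.8, 0.8, 1), 2)
c_const <- 1.5                               # contour: squared distance = c_const^2

theta  <- seq(0, 2 * pi, length.out = 200)
circle <- rbind(cos(theta), sin(theta))      # 2 x 200 points on the unit circle
L      <- t(chol(Sigma))                     # lower triangular, L %*% t(L) == Sigma
pts    <- mu + c_const * L %*% circle        # 2 x 200; mu is recycled down each column

# Every generated point should have squared statistical distance c_const^2
d2 <- mahalanobis(t(pts), center = mu, cov = Sigma)
range(d2)

# plot(t(pts), type = "l", asp = 1)          # uncomment to draw the contour
```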

What happens when

Let’s look at an example of the bivariate normal when we vary some of the parameters…

Examples
[Figure: four example panels, each with axes X1 and X2]

What about when

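How do we find the axes of the ellipse?
- Axes are in the direction of the eigenvectors of $\Sigma^{-1}$
- Axes lengths are proportional to the reciprocals of the square roots of the eigenvalues of $\Sigma^{-1}$
- We can get these from $\Sigma$ (avoid calculating $\Sigma^{-1}$): if $\Sigma e = \lambda e$ then $\Sigma^{-1} e = (1/\lambda) e$, so the two matrices share eigenvectors and the axis lengths are proportional to $\sqrt{\lambda}$
Let's look at this for the bivariate case... We must find the eigenvalues and eigenvectors for $\Sigma$
- Eigenvalues: solve $|\Sigma - \lambda I| = 0$
- Eigenvectors: solve $\Sigma e = \lambda e$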

Eigenvalues of $\Sigma$:

The corresponding eigenvector, $e_1$, of $\Sigma$:

Similarly we can find $e_2$, which corresponds to $\lambda_2$:
The axes of the contours of constant density will have half-lengths $c\sqrt{\lambda_i}$ in the directions $e_i$, $i = 1, 2$.
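In practice the eigenvalues and eigenvectors are usually obtained numerically. The following sketch uses base R's `eigen()` on an illustrative covariance matrix (the numbers are assumptions, not from the slides) and confirms that working with the covariance matrix gives the same axes as working with its inverse.

```r
Sigma <- matrix(c(2, 0.8, 0.8, 1), 2)

eig     <- eigen(Sigma, symmetric = TRUE)         # values returned in decreasing order
eig_inv <- eigen(solve(Sigma), symmetric = TRUE)

# Eigenvalues of the inverse are the reciprocals of the eigenvalues of Sigma
recip_ok <- isTRUE(all.equal(sort(1 / eig$values), sort(eig_inv$values)))

# Check the defining relation Sigma e1 = lambda1 e1 for the first pair
e1      <- eig$vectors[, 1]
lhs     <- drop(Sigma %*% e1)
rhs     <- eig$values[1] * e1

# Half-lengths of the contour at squared distance c^2 (here c = 1)
c_const      <- 1
half_lengths <- c_const * sqrt(eig$values)
half_lengths
```

The major axis lies along the first column of `eig$vectors`, since `eigen()` orders the eigenvalues from largest to smallest.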

If we let […] then […] are the eigenvalues of $\Sigma$ and $e_1$ and $e_2$ are the corresponding eigenvectors

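The ratio of the lengths of the axes is $\sqrt{\lambda_1} / \sqrt{\lambda_2}$.
The actual lengths depend on the contour being considered. For the $(1-\alpha) \times 100\%$ contour, the half-lengths are given by $\sqrt{\chi^2_p(\alpha)}\sqrt{\lambda_i}$, where $\chi^2_p(\alpha)$ is the upper $(100\alpha)$th percentile of the chi-square distribution with $p$ degrees of freedom. Thus the solid ellipsoid of $x$ values satisfying $(x-\mu)'\Sigma^{-1}(x-\mu) \le \chi^2_p(\alpha)$ has probability $1-\alpha$.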
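The probability content of the ellipsoid can be checked by simulation. The sketch below draws from a bivariate normal using a Cholesky factor (an assumption made to keep the example in base R; `MASS::mvrnorm` would serve equally well) and counts how often the squared statistical distance falls below the chi-square cutoff. Note that the upper $100\alpha$th percentile is `qchisq(1 - alpha, df = p)` in R.

```r
set.seed(726)
p     <- 2
alpha <- 0.05
n     <- 100000
mu    <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)

# Draw n observations: rows of Z are N(0, I); Z %*% chol(Sigma) has covariance Sigma
Z <- matrix(rnorm(n * p), n, p)
X <- sweep(Z %*% chol(Sigma), 2, mu, "+")

cutoff   <- qchisq(1 - alpha, df = p)             # upper 100*alpha-th percentile
d2       <- mahalanobis(X, center = mu, cov = Sigma)
coverage <- mean(d2 <= cutoff)                    # should be close to 1 - alpha

# Half-lengths of the (1 - alpha) contour along each eigenvector
half_lengths <- sqrt(cutoff * eigen(Sigma, symmetric = TRUE)$values)
coverage
half_lengths
```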

Univariate case: the length of the interval containing the central 95% of the population is proportional to $\sigma$.
Bivariate case: the area of the region containing 95% of the population is proportional to $|\Sigma|^{1/2}$.

We can call this "smallest" region the central $(1-\alpha) \times 100\%$ of the multivariate normal population. The "area" of this smallest ellipse in the 2-D case is:
$$\pi \, \chi^2_2(\alpha) \, |\Sigma|^{1/2}$$
This extends to higher dimensions (think volume). Consider $X \sim N_p(\mu, \Sigma)$. The smallest region for which there is probability $1-\alpha$ that a randomly selected observation falls in the region is a $p$-dimensional ellipsoid centered at $\mu$ with volume
$$\frac{2\pi^{p/2}}{p\,\Gamma(p/2)} \left[\chi^2_p(\alpha)\right]^{p/2} |\Sigma|^{1/2}$$
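The volume of the central ellipsoid is easy to evaluate for any dimension using the standard formula $\frac{2\pi^{p/2}}{p\,\Gamma(p/2)}\,c^p\,|\Sigma|^{1/2}$ with $c^2$ the chi-square cutoff. The sketch below is one way to code it; `ellipsoid_volume` is a hypothetical name, and the check values are illustrative.

```r
# Volume of the ellipsoid {x : (x - mu)' Sigma^{-1} (x - mu) <= qchisq(1 - alpha, p)}.
# ellipsoid_volume is a hypothetical helper name for this sketch.
ellipsoid_volume <- function(Sigma, alpha) {
  p  <- nrow(Sigma)
  c2 <- qchisq(1 - alpha, df = p)                  # squared "radius" of the region
  2 * pi^(p / 2) / (p * gamma(p / 2)) * c2^(p / 2) * sqrt(det(Sigma))
}

Sigma2 <- matrix(c(1, 0.5, 0.5, 1), 2)
area_2d <- ellipsoid_volume(Sigma2, alpha = 0.05)  # 2-D case: pi * cutoff * sqrt(det)

# 1-D sanity check: the "volume" is the length of mu +/- 1.96 * sigma
len_1d <- ellipsoid_volume(matrix(4), alpha = 0.05)
c(area_2d = area_2d, len_1d = len_1d)
```

In one dimension the formula collapses to the familiar interval length, which ties it back to the univariate case.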

Visual of the 3-dimensional case

Why Multivariate Normal
Recall, statisticians like the MVN distribution because…
- Mathematically simple
- Multivariate central limit theorem applies
- Natural phenomena are often well approximated by a MVN distribution
So what are some "fun" mathematical properties that make it so nice?

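Properties of MVN
Result 4.2: If $X \sim N_p(\mu, \Sigma)$ then any linear combination $a'X = a_1X_1 + a_2X_2 + \dots + a_pX_p$
has a univariate normal distribution with mean $a'\mu$ and variance $a'\Sigma a$.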
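A small numerical illustration of the linear-combination result: for a coefficient vector $a$, the combination $a'X$ should have mean $a'\mu$ and variance $a'\Sigma a$. The mean vector, covariance matrix, and coefficients below are made-up values for illustration, and the draws use a Cholesky factor to stay within base R.

```r
set.seed(726)
mu    <- c(1, -2, 0.5)
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)
a     <- c(2, -1, 3)

# Theoretical mean and variance of a'X
mean_y <- sum(a * mu)                        # a' mu
var_y  <- drop(t(a) %*% Sigma %*% a)         # a' Sigma a

# Simulated check
n <- 100000
Z <- matrix(rnorm(n * 3), n, 3)
X <- sweep(Z %*% chol(Sigma), 2, mu, "+")    # rows are draws from N_3(mu, Sigma)
y <- drop(X %*% a)

c(theory = mean_y, simulated = mean(y))
c(theory = var_y,  simulated = var(y))
```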

Example

Properties of MVN
Result 4.3: Any linear transformation of a multivariate normal random vector has a normal distribution. So if $X \sim N_p(\mu, \Sigma)$, $Y = BX$, and $B$ is a $k \times p$ matrix of constants, then $Y \sim N_k(B\mu, B\Sigma B')$.
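For a linear transformation by a constant matrix $B$, the transformed mean is $B\mu$ and the transformed covariance is $B\Sigma B'$, and both are plain matrix products. The sketch below works one small case; the numbers and the choice of `B` (differences of adjacent components) are illustrative assumptions.

```r
mu    <- c(1, -2, 0.5)
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)

# B is a 2 x 3 matrix of constants: Y1 = X1 - X2, Y2 = X2 - X3
B <- rbind(c(1, -1,  0),
           c(0,  1, -1))

mu_Y    <- drop(B %*% mu)            # mean vector of Y = BX
Sigma_Y <- B %*% Sigma %*% t(B)      # covariance matrix of Y = BX

mu_Y
Sigma_Y
```

Setting `B` to a single row recovers the linear-combination case, so the univariate result is a special case of this one.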

Result 4.3
In the development of univariate statistics, we often transform our observed r.v. to the canonical form: $Z = (X - \mu)/\sigma \sim N(0, 1)$. Result 4.3 is of particular interest because we can use it to derive the canonical form of the multivariate normal… But first, we need to consider a specific matrix decomposition.

Spectral Decomposition
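Given $\Sigma$ is a non-negative definite, symmetric, real matrix, then $\Sigma$ can be decomposed according to:
$$\Sigma = \sum_{i=1}^{p} \lambda_i e_i e_i' = P \Lambda P'$$
where the eigenvalues are $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$ (the diagonal of $\Lambda$), the eigenvectors of $\Sigma$ are $e_1, e_2, \dots, e_p$ (the columns of $P$), and these satisfy the expression $\Sigma e_i = \lambda_i e_i$.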
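The spectral decomposition can be verified numerically with base R's `eigen()`. In this sketch `P` holds the eigenvectors as columns and `Lambda` is the diagonal matrix of eigenvalues; the covariance matrix is an illustrative assumption.

```r
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)

eig    <- eigen(Sigma, symmetric = TRUE)
P      <- eig$vectors                    # columns are e_1, ..., e_p
Lambda <- diag(eig$values)               # lambda_1 >= ... >= lambda_p on the diagonal

# Matrix form: Sigma = P Lambda P'
Sigma_rebuilt <- P %*% Lambda %*% t(P)

# Sum form: Sigma = sum_i lambda_i e_i e_i'
Sigma_sum <- Reduce(`+`, lapply(seq_along(eig$values), function(i) {
  eig$values[i] * tcrossprod(P[, i, drop = FALSE])   # lambda_i * e_i e_i'
}))

# Eigenvectors are orthonormal: P'P = I
orthonormal <- isTRUE(all.equal(crossprod(P), diag(3)))

all.equal(Sigma_rebuilt, Sigma)
all.equal(Sigma_sum, Sigma)
```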

Marginal Distributions
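Result 4.4: Consider subsets of $X_i$'s in $X$. These subsets are also distributed (multivariate) normal. If
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_p\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$$
with $X_1$ of dimension $q \times 1$ and $X_2$ of dimension $(p-q) \times 1$, then the marginal distributions of $X_1$ and $X_2$ are:
$$X_1 \sim N_q(\mu_1, \Sigma_{11}), \qquad X_2 \sim N_{p-q}(\mu_2, \Sigma_{22})$$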

Example
Consider $X \sim N_p(\mu, \Sigma)$; find the marginal distribution of the 1st and 3rd components.
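Finding the marginal distribution of a subset of components amounts to picking out the matching entries of the mean vector and the matching rows and columns of the covariance matrix. The sketch below does this for the 1st and 3rd components of a hypothetical 3-variate normal (the numbers are assumptions, not the values from the original example) and confirms the same answer via a selection matrix.

```r
mu    <- c(1, -2, 0.5)
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)

idx <- c(1, 3)                       # components of interest

# Marginal parameters by direct subsetting
mu_marg    <- mu[idx]
Sigma_marg <- Sigma[idx, idx]

# Same result via a linear transformation with a selection matrix B
B        <- diag(3)[idx, ]           # 2 x 3 matrix that keeps rows 1 and 3
mu_B     <- drop(B %*% mu)
Sigma_B  <- B %*% Sigma %*% t(B)

mu_marg
Sigma_marg
```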

Marginal Distributions cont’d
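The converse of Result 4.4 is not always true; an additional assumption is needed.
Result 4.5(c): If $X_1 \sim N_q(\mu_1, \Sigma_{11})$ and $X_2 \sim N_{p-q}(\mu_2, \Sigma_{22})$ and $X_1$ is independent of $X_2$, then
$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_p\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{bmatrix}\right)$$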

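Result 4.5(a): If $X_1$ ($q \times 1$) and $X_2$ ($(p-q) \times 1$) are independent,
then $\mathrm{Cov}(X_1, X_2) = 0$.
(b) If
$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_p\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$$
then $X_1$ ($q \times 1$) and $X_2$ ($(p-q) \times 1$) are independent iff $\Sigma_{12} = 0$.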
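One way to see the link between a zero off-diagonal covariance block and independence is to check that the joint density factors into the product of the two marginal densities. The sketch below does this at a single point; `dmvn_sketch` is a hypothetical helper, and all numbers are illustrative assumptions.

```r
# Hypothetical helper: p-variate normal density at a point x (Sigma positive definite)
dmvn_sketch <- function(x, mu, Sigma) {
  p <- length(mu)
  d <- x - mu
  q <- drop(t(d) %*% solve(Sigma, d))
  (2 * pi)^(-p / 2) * det(Sigma)^(-1 / 2) * exp(-q / 2)
}

mu1 <- c(0, 1); Sigma11 <- matrix(c(2, 0.6, 0.6, 1), 2)   # X1 is 2 x 1
mu2 <- -1;      Sigma22 <- matrix(3)                      # X2 is 1 x 1

# Joint covariance with Sigma12 = 0 (block diagonal)
Sigma <- rbind(cbind(Sigma11, matrix(0, 2, 1)),
               cbind(matrix(0, 1, 2), Sigma22))

x1 <- c(0.5, 0.2)
x2 <- 0.7

joint   <- dmvn_sketch(c(x1, x2), c(mu1, mu2), Sigma)
product <- dmvn_sketch(x1, mu1, Sigma11) * dmvn_sketch(x2, mu2, Sigma22)
all.equal(joint, product)
```

The factorization holds because a block-diagonal covariance matrix has a block-diagonal inverse and a determinant equal to the product of the block determinants.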

Example

Next Time More properties of the MVN and sampling from MVN…
