The Multivariate Normal Distribution, Part I

1 The Multivariate Normal Distribution, Part I
BMTRY 726 5/17/2018

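2 Univariate Normal
Recall the density function of the univariate normal: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$, $-\infty < x < \infty$
We can rewrite this as $f(x) = (2\pi)^{-1/2}(\sigma^2)^{-1/2} \exp\left\{-\tfrac{1}{2}(x-\mu)(\sigma^2)^{-1}(x-\mu)\right\}$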
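
A quick numerical check of the rewritten form: the sketch below codes the density with the variance pulled out as (σ²)^(-1/2) and (σ²)^(-1), the layout that carries over to the multivariate case, and compares it with R's built-in `dnorm()`. The function name `dnorm_quadform` and the parameter values are illustrative assumptions, not part of the slides.

```r
# Univariate normal density written in the "quadratic form" layout:
# (2*pi)^(-1/2) * (sigma2)^(-1/2) * exp(-1/2 * (x - mu) * sigma2^(-1) * (x - mu))
dnorm_quadform <- function(x, mu, sigma2) {
  (2 * pi)^(-1 / 2) * sigma2^(-1 / 2) *
    exp(-0.5 * (x - mu) * (1 / sigma2) * (x - mu))
}

x <- seq(-3, 5, by = 0.5)

# Largest absolute difference from dnorm(); note dnorm() takes sd, not variance
max_abs_diff <- max(abs(dnorm_quadform(x, mu = 1, sigma2 = 4) -
                        dnorm(x, mean = 1, sd = 2)))
print(max_abs_diff)
```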

3 Multivariate Normal Distribution
We denote the MVN distribution as $X \sim N_p(\mu, \Sigma)$. What is the density function of X?

4 Multivariate Normal Distribution
What is the density function of X? $f(x) = (2\pi)^{-p/2}\,|\Sigma|^{-1/2} \exp\left\{-\tfrac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)\right\}$
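
A minimal sketch of the p-variate normal density, (2π)^(-p/2) |Σ|^(-1/2) exp{-½ (x-μ)'Σ^(-1)(x-μ)}, evaluated at a single point. The function name `dmvn_sketch` and the example μ and Σ are assumptions chosen for illustration; the code assumes Σ is symmetric positive definite.

```r
# Multivariate normal density at one point x (numeric vector of length p).
# Sigma must be a p x p symmetric positive definite matrix.
dmvn_sketch <- function(x, mu, Sigma) {
  p <- length(mu)
  d <- x - mu
  # Quadratic form (x - mu)' Sigma^{-1} (x - mu), via solve() rather than an explicit inverse
  quad <- drop(t(d) %*% solve(Sigma, d))
  (2 * pi)^(-p / 2) * det(Sigma)^(-1 / 2) * exp(-0.5 * quad)
}

mu <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)

# At x = mu the exponent vanishes, leaving 1 / (2 * pi * sqrt(det(Sigma)))
dens_at_mean <- dmvn_sketch(c(0, 0), mu, Sigma)
print(dens_at_mean)
```

With p = 1 and a 1 x 1 matrix for Σ, the same function reduces to the univariate density.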

5 Multivariate Normal
Note, the density does not exist if Σ is not positive definite, since then $|\Sigma| = 0$ and $\Sigma^{-1}$ does not exist. We will assume that Σ is positive definite for most of the MVN methods we discuss.

6 Multivariate Density Function
If we assume that Σ is positive definite, $(x-\mu)'\Sigma^{-1}(x-\mu)$ is the square of the generalized distance from x to μ. It is also called the squared statistical distance of x to μ, the squared Mahalanobis distance of x to μ, or the squared standardized distance of x to μ.
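
The squared generalized distance can be computed directly from its definition or with `mahalanobis()` from base R's stats package, which returns the squared distance. The sketch below does both; μ, Σ, and the point x are assumed values for illustration.

```r
mu <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)
x <- c(1, -1)

# Squared generalized distance from the definition: (x - mu)' Sigma^{-1} (x - mu)
d2_manual <- drop(t(x - mu) %*% solve(Sigma) %*% (x - mu))

# Same quantity from stats::mahalanobis(), which already returns the square
d2_builtin <- mahalanobis(x, center = mu, cov = Sigma)

print(c(d2_manual, d2_builtin))
```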

7 Why Multivariate Normal
The MVN distribution makes a good choice in statistics for several reasons: mathematical simplicity; the multivariate central limit theorem; and many naturally occurring phenomena approximately exhibit this distribution.

8 Bivariate Normal Example
Consider samples from $X = (x_1, x_2)' \sim N_2(\mu, \Sigma)$. Let’s write out the joint distribution of x1 and x2.

9 Bivariate Normal Example
Joint distribution of x1 and x2

10 Bivariate Normal Example
Joint distribution of x1 and x2
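
A sketch of the algebra for this step, assuming the usual parameterization $\sigma_{11} = \sigma_1^2$, $\sigma_{22} = \sigma_2^2$, $\sigma_{12} = \rho\sigma_1\sigma_2$. The layout is a reconstruction from the standard derivation, not the slide's own.

```latex
\Sigma =
\begin{bmatrix}
  \sigma_1^2 & \rho\sigma_1\sigma_2 \\
  \rho\sigma_1\sigma_2 & \sigma_2^2
\end{bmatrix},
\qquad
|\Sigma| = \sigma_1^2\sigma_2^2 - \rho^2\sigma_1^2\sigma_2^2
         = \sigma_1^2\sigma_2^2\,(1-\rho^2)

\Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)}
\begin{bmatrix}
  \sigma_2^2 & -\rho\sigma_1\sigma_2 \\
  -\rho\sigma_1\sigma_2 & \sigma_1^2
\end{bmatrix}

(x-\mu)'\Sigma^{-1}(x-\mu)
  = \frac{1}{1-\rho^2}
    \left[
      \left(\frac{x_1-\mu_1}{\sigma_1}\right)^2
      + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2
      - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)
    \right]
```

Substituting $|\Sigma|$ and this quadratic form into the general MVN density with p = 2 gives the bivariate density.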

11 Bivariate Normal Example
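This yields the joint distribution of x1 and x2 in the form $f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)\right]\right\}$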

12 Bivariate Normal Example
The density is a function of μ1, μ2, σ1, σ2, and ρ. The density is well defined if -1 < ρ < 1. If ρ = 0, then …
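
One way to explore how the five parameters enter is to code the standard bivariate normal density directly. The sketch below does that and then checks one known consequence of setting the correlation to zero: the joint density equals the product of the two univariate normal densities. The function name `dbivn_sketch` and all numeric values are illustrative assumptions.

```r
# Bivariate normal density in terms of mu1, mu2, s1, s2 (standard deviations)
# and rho (correlation); requires -1 < rho < 1.
dbivn_sketch <- function(x1, x2, mu1, mu2, s1, s2, rho) {
  z1 <- (x1 - mu1) / s1
  z2 <- (x2 - mu2) / s2
  q <- (z1^2 + z2^2 - 2 * rho * z1 * z2) / (1 - rho^2)
  exp(-q / 2) / (2 * pi * s1 * s2 * sqrt(1 - rho^2))
}

# With rho = 0 the joint density factors into the two marginal densities
joint <- dbivn_sketch(0.3, -1.2, mu1 = 0, mu2 = 1, s1 = 2, s2 = 0.5, rho = 0)
prod_marg <- dnorm(0.3, mean = 0, sd = 2) * dnorm(-1.2, mean = 1, sd = 0.5)
print(c(joint, prod_marg))
```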

13 Examples of Bivariate Normal

14 R Code to Play Around with MVN
If you want to vary parameters and see what changes…
    library(MASS)   # provides mvrnorm() and kde2d()
    S <- matrix(c(1, 0.5, 0.5, 1), 2)
    bivn <- mvrnorm(100000, mu = c(0, 0), Sigma = S)

    # now we do a kernel density estimate
    bivn.kde <- kde2d(bivn[, 1], bivn[, 2], n = 50)
    persp(bivn.kde, phi = 45, theta = 30, xlab = "X1", ylab = "X2", zlab = "phi")
    # alternatively, use a "fancy" perspective
    persp(bivn.kde, phi = 45, theta = 30, shade = .2, border = NA)

15 Contours of constant density
What if we take a slice of this bivariate distribution at a constant height? i.e., the set of points where $f(x_1, x_2) = k$ for some constant k

16 Contours of constant density
The density is constant for all points for which $(x-\mu)'\Sigma^{-1}(x-\mu) = c^2$. This is an equation for an ellipse centered at μ.

17 Contours of constant density
What happens when

18 Bivariate Normal Example
Let’s look at an example of the bivariate normal when we vary some of the parameters…

19 Examples
[Figure: four bivariate normal plots, each with axes X1 and X2]

20 Contours of constant density
What about when

21 Contours of constant density
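How do we find the axes of the ellipse? Axes are in the direction of the eigenvectors of $\Sigma^{-1}$. Axes lengths are proportional to the reciprocals of the square roots of the eigenvalues of $\Sigma^{-1}$. We can get these from Σ (avoid calculating $\Sigma^{-1}$): if $\Sigma e = \lambda e$ then $\Sigma^{-1} e = (1/\lambda) e$, so the two matrices share eigenvectors and their eigenvalues are reciprocals. Let’s look at this for the bivariate case... We must find the eigenvalues and eigenvectors for Σ. Eigenvalues: solve $|\Sigma - \lambda I| = 0$. Eigenvectors: solve $\Sigma e_i = \lambda_i e_i$.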
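
The relationship between the eigen-structure of Σ and of its inverse can be checked numerically with `eigen()`. The sketch below uses an assumed 2 x 2 covariance matrix. It verifies that the eigenvalues of the inverse are reciprocals of those of Σ, and that the eigenvectors coincide up to sign and ordering.

```r
Sigma <- matrix(c(2, 0.8, 0.8, 1), 2)   # illustrative covariance matrix

# eigen() returns eigenvalues in decreasing order
eig_S    <- eigen(Sigma, symmetric = TRUE)
eig_Sinv <- eigen(solve(Sigma), symmetric = TRUE)

# Eigenvalues of Sigma^{-1} are the reciprocals of the eigenvalues of Sigma
recip_check <- max(abs(sort(1 / eig_S$values) - sort(eig_Sinv$values)))

# The largest-eigenvalue direction of Sigma is the smallest-eigenvalue
# direction of Sigma^{-1}; the absolute inner product of the two unit vectors
# should be 1 (the sign of an eigenvector is arbitrary)
align_check <- abs(sum(eig_S$vectors[, 1] * eig_Sinv$vectors[, 2]))

print(c(recip_check, align_check))
```

Working from Σ directly avoids the matrix inversion, which is the point made on the slide.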

22 Eigenvalues of Σ:

23 Eigenvalues of Σ:

24 The corresponding eigenvector, e1, of Σ:

25 The corresponding eigenvector, e1, of Σ:

26 Similarly we can find e2, which corresponds to λ2:
The axes of the contours of constant density will have length

27 If we let then are the eigenvalues of Σ and e1 and e2 are the corresponding eigenvectors

28 The ratio of the lengths of the axes
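The actual lengths depend on the contour being considered. For the (1-α) × 100% contour, the ½ lengths are given by $\sqrt{\chi^2_p(\alpha)}\,\sqrt{\lambda_i}$, where $\chi^2_p(\alpha)$ is the upper (100α)th percentile of the chi-square distribution with p degrees of freedom. Thus the solid ellipsoid of x values satisfying $(x-\mu)'\Sigma^{-1}(x-\mu) \le \chi^2_p(\alpha)$ has probability 1-α.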
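
A simulation sketch of the probability content of such an ellipse. It takes the half-lengths as the square root of the chi-square quantile times each eigenvalue of Σ, then estimates by Monte Carlo the share of draws whose squared generalized distance falls at or below that quantile. The seed, sample size, μ, and Σ are assumptions for illustration.

```r
library(MASS)   # for mvrnorm()

set.seed(726)   # arbitrary seed, only for reproducibility
mu <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)
alpha <- 0.05
p <- length(mu)

# Chi-square value with upper-tail probability alpha
c2 <- qchisq(1 - alpha, df = p)

# Half-lengths of the ellipse axes: sqrt(c2) * sqrt(lambda_i)
lambda <- eigen(Sigma, symmetric = TRUE)$values
half_lengths <- sqrt(c2 * lambda)

# Monte Carlo estimate of the probability inside the ellipse
x <- mvrnorm(100000, mu = mu, Sigma = Sigma)
coverage <- mean(mahalanobis(x, center = mu, cov = Sigma) <= c2)

print(half_lengths)
print(coverage)   # should be close to 1 - alpha
```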

29
Univariate case: the length of the interval containing the central 95% of the population is proportional to σ.
Bivariate case: the area of the region containing 95% of the population is proportional to $|\Sigma|^{1/2}$.

30 The “area” of this smallest ellipse in the 2-D case is:
We can call this “smallest” region the central (1-α) × 100% of the multivariate normal population. The “area” of this smallest ellipse in the 2-D case is: $\pi\,\chi^2_2(\alpha)\,|\Sigma|^{1/2}$. This extends to higher dimensions (think volume). Consider $X \sim N_p(\mu, \Sigma)$. The smallest region for which there is probability 1-α that a randomly selected observation falls in the region is a p-dimensional ellipsoid centered at μ with volume $\frac{2\pi^{p/2}}{p\,\Gamma(p/2)}\left[\chi^2_p(\alpha)\right]^{p/2}|\Sigma|^{1/2}$.
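
A small sketch of the volume computation, using the standard formula for the volume of a p-dimensional ellipsoid {x : (x-μ)'Σ^(-1)(x-μ) ≤ c²}, namely 2π^(p/2) / (p Γ(p/2)) · (c²)^(p/2) · |Σ|^(1/2). The function name `ellipsoid_volume` and the example Σ are assumptions for illustration.

```r
# Volume of {x : (x - mu)' Sigma^{-1} (x - mu) <= c2} in p = nrow(Sigma) dimensions
ellipsoid_volume <- function(Sigma, c2) {
  p <- nrow(Sigma)
  2 * pi^(p / 2) / (p * gamma(p / 2)) * c2^(p / 2) * sqrt(det(Sigma))
}

Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)
c2 <- qchisq(0.95, df = 2)   # chi-square value for the central 95% region

# In two dimensions the formula reduces to pi * c2 * sqrt(det(Sigma))
area_95 <- ellipsoid_volume(Sigma, c2)
print(area_95)
```

For p = 3, Σ = I, and c² = 1 the same function returns the volume of the unit ball, 4π/3.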

31 Visual of the 3-dimensional case

32 Why Multivariate Normal
Recall, statisticians like the MVN distribution because: it is mathematically simple; the multivariate central limit theorem applies; and natural phenomena are often well approximated by an MVN distribution. So what are some “fun” mathematical properties that make it so nice?

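33 Properties of MVN
Result 4.2: If $X \sim N_p(\mu, \Sigma)$, then any linear combination $a'X = a_1X_1 + a_2X_2 + \dots + a_pX_p$
has a univariate normal distribution with mean $a'\mu$ and variance $a'\Sigma a$.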
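
A simulation sketch of this result: for a linear combination a'X of an MVN vector, the theoretical mean a'μ and variance a'Σa are compared with sample moments of simulated draws. The values of μ, Σ, a, the seed, and the sample size are all assumptions chosen for illustration.

```r
library(MASS)   # for mvrnorm()

set.seed(726)   # arbitrary seed, only for reproducibility
mu <- c(1, 2, 3)
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)
a <- c(1, -1, 2)

theory_mean <- sum(a * mu)                   # a' mu
theory_var  <- drop(t(a) %*% Sigma %*% a)    # a' Sigma a

# Simulate X, form a'X for each draw, and compare sample moments
x <- mvrnorm(200000, mu = mu, Sigma = Sigma)
y <- drop(x %*% a)

print(c(theory_mean, mean(y)))
print(c(theory_var, var(y)))
```

A normal Q-Q plot of `y` (for example `qqnorm(y)`) is one way to eyeball the normality claim itself.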

34 Example

35 Properties of MVN
Result 4.3: Any linear transformation of a multivariate normal random vector has a normal distribution. So if $X \sim N_p(\mu, \Sigma)$ and $Z = BX$, and B is a k x p matrix of constants, then $Z \sim N_k(B\mu, B\Sigma B')$.
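
For a linear transformation BX of an MVN vector, the parameters of the resulting normal distribution are Bμ and BΣB'. The sketch below computes both for an assumed 2 x 3 matrix B and assumed μ and Σ; none of these values come from the slides.

```r
mu <- c(1, 2, 3)
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)

# B is k x p with k = 2, p = 3: rows form X1 - X2 and X2 - X3
B <- matrix(c(1, -1,  0,
              0,  1, -1), 2, byrow = TRUE)

mean_BX <- drop(B %*% mu)           # B mu
cov_BX  <- B %*% Sigma %*% t(B)     # B Sigma B'

print(mean_BX)
print(cov_BX)
```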

36 Result 4.3
In the development of univariate statistics, we often transform our observed r.v. to the canonical form: $Z = (X - \mu)/\sigma \sim N(0, 1)$. Result 4.3 is of particular interest because we can use it to derive the canonical form of the multivariate normal… But first, we need to consider a specific matrix decomposition.

37 Spectral Decomposition
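Given Σ is a non-negative definite, symmetric, real matrix, then Σ can be decomposed according to: $\Sigma = \sum_{i=1}^{p} \lambda_i e_i e_i' = P \Lambda P'$, with $P = [e_1, e_2, \dots, e_p]$ and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p)$. Where the eigenvalues are $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$. The eigenvectors of Σ are e1, e2,...,ep. And these satisfy the expression $\Sigma e_i = \lambda_i e_i$, i = 1, ..., p.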
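
The decomposition can be verified numerically with `eigen()`. The sketch below rebuilds an assumed covariance matrix from its eigenvalues and eigenvectors and checks the eigen-equation. As a further assumption about where the canonical form is headed, it also builds the inverse square root PΛ^(-1/2)P' and checks that it standardizes Σ to the identity; that step requires strictly positive eigenvalues.

```r
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)

eig <- eigen(Sigma, symmetric = TRUE)
P <- eig$vectors        # columns are e_1, ..., e_p
lambda <- eig$values    # lambda_1 >= ... >= lambda_p

# Rebuild Sigma = P Lambda P' = sum_i lambda_i e_i e_i'
recon_error <- max(abs(Sigma - P %*% diag(lambda) %*% t(P)))

# Check Sigma e_i = lambda_i e_i for all i at once
eig_error <- max(abs(Sigma %*% P - P %*% diag(lambda)))

# Inverse square root (assumes all lambda_i > 0)
Sigma_inv_sqrt <- P %*% diag(1 / sqrt(lambda)) %*% t(P)
whiten_error <- max(abs(Sigma_inv_sqrt %*% Sigma %*% Sigma_inv_sqrt - diag(3)))

print(c(recon_error, eig_error, whiten_error))
```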

41 Marginal Distributions
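Result 4.4: Consider subsets of Xi’s in X. These subsets are also distributed (multivariate) normal. If $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_p\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$, where X1 is q x 1 and X2 is (p-q) x 1, then the marginal distributions of X1 and X2 are $X_1 \sim N_q(\mu_1, \Sigma_{11})$ and $X_2 \sim N_{p-q}(\mu_2, \Sigma_{22})$.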

42 Example Consider , find the marginal distribution of the 1st and 3rd components

43 Example Consider , find the marginal distribution of the 1st and 3rd components
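
The distribution used in this example is not in the transcript, so the sketch below uses a hypothetical 3-variate μ and Σ to show the mechanics. The marginal of the 1st and 3rd components is read off by selecting the matching entries of μ and the matching rows and columns of Σ. The same answer follows from applying a selection matrix B as a linear transformation.

```r
# Hypothetical parameters (NOT the values from the slide)
mu <- c(1, 2, 3)
Sigma <- matrix(c(4, 1,   0,
                  1, 2,   0.5,
                  0, 0.5, 1), 3, byrow = TRUE)

idx <- c(1, 3)                 # components to keep

mu_marg    <- mu[idx]          # marginal mean vector
Sigma_marg <- Sigma[idx, idx]  # marginal covariance matrix

# Equivalent route: B picks out rows 1 and 3, so BX = (X1, X3)'
B <- diag(3)[idx, ]
mu_via_B    <- drop(B %*% mu)
Sigma_via_B <- B %*% Sigma %*% t(B)

print(mu_marg)
print(Sigma_marg)
```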

44 Marginal Distributions cont’d
The converse of Result 4.4 is not always true; an additional assumption is needed. Result 4.5(c): If… and X1 is independent of X2 then

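45 Result 4.5
(a) If X1 (q x 1) and X2 ((p-q) x 1) are independent, then Cov(X1, X2) = 0.
(b) If $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_p\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$, then X1 (q x 1) and X2 ((p-q) x 1) are independent iff $\Sigma_{12} = 0$.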

46 Example

47 Next Time More properties of the MVN and sampling from MVN…

