Marginal and Conditional Distributions


Marginal and Conditional distributions

Theorem (Marginal distributions for the multivariate normal distribution): Suppose $\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}$ has a $p$-variate normal distribution with mean vector $\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}$ and covariance matrix $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$. Then the marginal distribution of $\mathbf{x}_i$ is a $q_i$-variate normal distribution ($q_1 = q$, $q_2 = p - q$) with mean vector $\boldsymbol{\mu}_i$ and covariance matrix $\Sigma_{ii}$.
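As a quick numerical sketch of the theorem (the mean vector and covariance matrix below are made-up illustration values, not from the slides), the marginal parameters are just the corresponding block of $\boldsymbol{\mu}$ and $\Sigma$, which a Monte Carlo sample confirms:

```python
import numpy as np

# Hypothetical 3-variate normal, partitioned with q = 2 (illustration values).
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 1.5],
                  [0.5, 1.5, 2.0]])
q = 2

# By the theorem, the marginal of (x1, x2) is N(mu1, Sigma11):
mu1, Sigma11 = mu[:q], Sigma[:q, :q]

# Monte Carlo check: sample x, keep only the first q coordinates.
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mu, Sigma, size=200_000)
sample_mean = x[:, :q].mean(axis=0)
sample_cov = np.cov(x[:, :q], rowvar=False)
```

The sample mean and covariance of the first two coordinates land close to `mu1` and `Sigma11`, as the theorem predicts.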

Theorem (Conditional distributions for the multivariate normal distribution): Suppose $\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}$ has a $p$-variate normal distribution with mean vector $\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}$ and covariance matrix $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$. Then the conditional distribution of $\mathbf{x}_2$ given $\mathbf{x}_1$ is a $q_2$-variate normal distribution with mean vector $\boldsymbol{\mu}_{2 \cdot 1} = \boldsymbol{\mu}_2 + \Sigma_{21}\Sigma_{11}^{-1}(\mathbf{x}_1 - \boldsymbol{\mu}_1)$ and covariance matrix $\Sigma_{22 \cdot 1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.
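The conditional mean and covariance formulas translate directly into a short helper (a sketch with made-up values; the function name is hypothetical). For the bivariate case with correlation $\rho$ and unit variances, the conditional variance must come out to $1 - \rho^2$:

```python
import numpy as np

def conditional_normal(mu, Sigma, q, x1):
    """Parameters of x2 | x1 for x ~ N(mu, Sigma), partitioned at index q.

    Implements mean mu2 + S21 S11^{-1} (x1 - mu1) and
    covariance S22 - S21 S11^{-1} S12.
    """
    mu1, mu2 = mu[:q], mu[q:]
    S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
    S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
    W = S21 @ np.linalg.inv(S11)   # matrix of regression coefficients
    cond_mean = mu2 + W @ (x1 - mu1)
    cond_cov = S22 - W @ S12       # partial covariance matrix
    return cond_mean, cond_cov

# Bivariate sanity check with correlation rho: Var(x2 | x1) = 1 - rho^2.
rho = 0.6
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, rho], [rho, 1.0]])
m, C = conditional_normal(mu, Sigma, q=1, x1=np.array([2.0]))
# m = rho * 2.0 and C = 1 - rho^2
```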

$\Sigma_{22 \cdot 1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ is called the matrix of partial variances and covariances. Its $(i,j)$ element $\sigma_{ij \cdot 1,\ldots,q}$ is called the partial covariance (variance if $i = j$) between $x_i$ and $x_j$ given $x_1, \ldots, x_q$. The quantity $\rho_{ij \cdot 1,\ldots,q} = \dfrac{\sigma_{ij \cdot 1,\ldots,q}}{\sqrt{\sigma_{ii \cdot 1,\ldots,q}\,\sigma_{jj \cdot 1,\ldots,q}}}$ is called the partial correlation between $x_i$ and $x_j$ given $x_1, \ldots, x_q$.
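The partial correlations can be computed directly from the partial covariance matrix. A small sketch (the equicorrelated $\Sigma$ is an illustration value, and the function name is hypothetical): with all correlations equal to $0.5$ and one conditioning variable, the partial covariance block is $\begin{pmatrix} .75 & .25 \\ .25 & .75 \end{pmatrix}$, so the partial correlation is $1/3$.

```python
import numpy as np

def partial_corr_given_first_q(Sigma, q):
    """Partial correlation matrix of the remaining variables given the
    first q, from the partial covariance matrix S22 - S21 S11^{-1} S12."""
    S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
    S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
    P = S22 - S21 @ np.linalg.inv(S11) @ S12   # partial (co)variances
    d = np.sqrt(np.diag(P))
    return P / np.outer(d, d)

# Equicorrelated example: three variables, all correlations 0.5.
Sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
R = partial_corr_given_first_q(Sigma, q=1)   # off-diagonal entry is 1/3
```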

$\Sigma_{21}\Sigma_{11}^{-1}$ is called the matrix of regression coefficients for predicting $x_{q+1}, x_{q+2}, \ldots, x_p$ from $x_1, \ldots, x_q$. The mean vector of $x_{q+1}, x_{q+2}, \ldots, x_p$ given $x_1, \ldots, x_q$ is: $\boldsymbol{\mu}_2 + \Sigma_{21}\Sigma_{11}^{-1}(\mathbf{x}_1 - \boldsymbol{\mu}_1)$.

Example: Suppose that $(x_1, x_2, x_3, x_4)'$ is 4-variate normal with a given mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$ (the numeric values were shown as images on the original slide and are not reproduced here).

The marginal distribution of a pair of the components is bivariate normal, and the marginal distribution of a triple of the components is trivariate normal; in each case the mean vector and covariance matrix are the corresponding subvector of $\boldsymbol{\mu}$ and submatrix of $\Sigma$ (numeric values shown on the original slide).

Find the conditional distribution of $(x_3, x_4)'$ given $(x_1, x_2)'$. Here $\Sigma_{11}^{-1}$ and $\Sigma_{21}\Sigma_{11}^{-1}$ are computed from the partitioned $\Sigma$ (numeric values shown on the original slide).

The matrix of regression coefficients for predicting $x_3, x_4$ from $x_1, x_2$ is $\Sigma_{21}\Sigma_{11}^{-1}$ (numeric values shown on the original slide).

Thus the conditional distribution of $(x_3, x_4)'$ given $(x_1, x_2)'$ is bivariate normal with mean vector $\boldsymbol{\mu}_2 + \Sigma_{21}\Sigma_{11}^{-1}(\mathbf{x}_1 - \boldsymbol{\mu}_1)$ and partial covariance matrix $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.

Using SPSS. Note: the use of another statistical package, such as Minitab, is similar to using SPSS.

The first step is to input the data. The data are usually contained in some type of file: 1. Text files 2. Excel files 3. Other types of files
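For readers working outside SPSS, a rough Python analogue reads the same kinds of files with pandas (the file name and its contents below are hypothetical illustration data):

```python
import pandas as pd

# Create a tiny comma-delimited text file, then read it back, standing in
# for the "text file" case (file name and values are made up).
with open("example_data.csv", "w") as f:
    f.write("AGE,CHL\n45,210\n52,195\n38,225\n")

df = pd.read_csv("example_data.csv")        # text files
# df = pd.read_excel("example_data.xlsx")   # Excel files (needs an engine such as openpyxl)
```

After this, `df` holds the data with variable names taken from the file's header row.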

After starting the SPSS program the following dialogue box appears:

If you select Opening an existing file and press OK the following dialogue box appears

Once you selected the file and its type

The following dialogue box appears:

If the variable names are in the file, ask the program to read them. If you do not specify the Range, the program will identify it. Once you click OK, two windows will appear:

A window containing the output

The other containing the data:

To perform any statistical analysis, select the Analyze menu:

To compute correlations, select Correlate then Bivariate. To compute partial correlations, select Correlate then Partial.
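The same two computations can be sketched in Python (the data here are random illustration values, not the CHL/ALB/CA/UA data from the slides). The partial correlation uses the standard inverse-correlation-matrix identity $r_{xy \cdot z} = -P_{xy}/\sqrt{P_{xx}P_{yy}}$ with $P = R^{-1}$:

```python
import numpy as np
import pandas as pd

# Small made-up dataset (illustration only).
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["X", "Y", "Z"])

# Bivariate (Pearson) correlations: the analogue of Correlate > Bivariate.
R = df.corr()

# Partial correlation of X and Y controlling for Z, via the inverse of the
# correlation matrix: the analogue of Correlate > Partial.
P = np.linalg.inv(R.to_numpy())
r_xy_given_z = -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])
```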

For bivariate correlation the following dialogue box appears:

The output for bivariate correlation:

For partial correlation the following dialogue box appears:

The output for partial correlation (controlling for AGE, HT, WT; D.F. = 178 for each pair):

- - - PARTIAL CORRELATION COEFFICIENTS - - -

Pair      2-tailed P
CHL-ALB   .082
CHL-CA    .000
CHL-UA    .002
ALB-CA    .000
ALB-UA    .101
CA-UA     .020

(Coefficient / (D.F.) / 2-tailed Significance; "." is printed if a coefficient cannot be computed. The coefficient values themselves were rendered as an image and are omitted.)

Compare these with the bivariate correlation:

Partial correlations vs. bivariate correlations among CHL, ALB, CA, UA (two 4-by-4 correlation matrices; numeric values shown on the original slide).

In the last example the bivariate and partial correlations were roughly in agreement. This is not necessarily the case in all situations. An example: the following data were collected on three variables: 1. Age 2. Calcium intake in diet (CAI) 3. Bone mass density (BMI)
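A hedged simulation (made-up coefficients and noise levels, not the study's data) shows how a common driver like Age can produce a large bivariate correlation between two variables whose partial correlation given Age is near zero:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
age = rng.normal(50, 10, n)
cai = 100 - 1.0 * age + rng.normal(0, 5, n)   # intake declining with age (made up)
bmi = 40 - 0.5 * age + rng.normal(0, 3, n)    # density declining with age (made up)

# Bivariate correlation: both variables track age, so it is strongly positive.
r_bivariate = np.corrcoef(cai, bmi)[0, 1]

# Partial correlation given age: correlate the residuals after regressing
# each variable on age; the confounding is removed and it is near zero.
def residuals(y, x):
    b = np.polyfit(x, y, 1)
    return y - np.polyval(b, x)

r_partial = np.corrcoef(residuals(cai, age), residuals(bmi, age))[0, 1]
```

Here `r_bivariate` is large even though, once Age is controlled for, the two variables are unrelated by construction.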

The data

Bivariate correlations

Partial correlations

Scatter plot CAI vs BMI (r = )

3D Plot Age, CAI and BMI

Transformations. Theorem: Let $x_1, x_2, \ldots, x_n$ denote random variables with joint probability density function $f(x_1, x_2, \ldots, x_n)$. Let $u_1 = h_1(x_1, x_2, \ldots, x_n)$, $u_2 = h_2(x_1, x_2, \ldots, x_n)$, $\ldots$, $u_n = h_n(x_1, x_2, \ldots, x_n)$ define an invertible transformation from the $x$'s to the $u$'s.

Then the joint probability density function of $u_1, u_2, \ldots, u_n$ is given by: $g(u_1, \ldots, u_n) = f\big(x_1(u_1,\ldots,u_n), \ldots, x_n(u_1,\ldots,u_n)\big)\,|J|$, where $J = \det\left[\dfrac{\partial(x_1, \ldots, x_n)}{\partial(u_1, \ldots, u_n)}\right]$ is the Jacobian of the transformation.
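The Jacobian determinant in the theorem can always be checked numerically. As an illustration not taken from the slides, the polar-coordinate map $x_1 = r\cos\theta$, $x_2 = r\sin\theta$ has $\left|\det\dfrac{\partial(x_1,x_2)}{\partial(r,\theta)}\right| = r$, and a finite-difference estimate recovers that value (the helper names are hypothetical):

```python
import numpy as np

# Inverse map: from (r, theta) back to (x1, x2).
def inverse_map(u):
    r, theta = u
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def jacobian_det(fn, u, h=1e-6):
    """Central finite-difference estimate of det[d fn / d u]."""
    n = len(u)
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (fn(u + e) - fn(u - e)) / (2 * h)
    return np.linalg.det(J)

u = np.array([2.0, 0.7])           # (r, theta); analytic |J| is r = 2.0
est = jacobian_det(inverse_map, u)
```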

Example: Suppose that $x_1, x_2$ are independent with density functions $f_1(x_1)$ and $f_2(x_2)$. Find the distribution of $u_1 = x_1 + x_2$, $u_2 = x_1 - x_2$. Solving for $x_1$ and $x_2$ we get the inverse transformation: $x_1 = \dfrac{u_1 + u_2}{2}$, $x_2 = \dfrac{u_1 - u_2}{2}$.

The Jacobian of the transformation: $J = \det\begin{bmatrix} \partial x_1/\partial u_1 & \partial x_1/\partial u_2 \\ \partial x_2/\partial u_1 & \partial x_2/\partial u_2 \end{bmatrix} = \det\begin{bmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 \end{bmatrix} = -\tfrac12$.

The joint density of $x_1, x_2$ is $f(x_1, x_2) = f_1(x_1)\, f_2(x_2)$. Hence the joint density of $u_1$ and $u_2$ is: $g(u_1, u_2) = f_1\!\left(\dfrac{u_1 + u_2}{2}\right) f_2\!\left(\dfrac{u_1 - u_2}{2}\right) \cdot \dfrac12$.
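As a concrete check (choosing $f_1 = f_2 = \phi$, the standard normal density, as an assumption for illustration), the formula above should agree with the known fact that for i.i.d. standard normals, $u_1 = x_1 + x_2$ and $u_2 = x_1 - x_2$ are independent $N(0, 2)$ variables:

```python
import math

def phi(x, var=1.0):
    """Normal density with mean 0 and the given variance."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def g(u1, u2):
    """Joint density of (u1, u2) from the change-of-variables formula."""
    return 0.5 * phi((u1 + u2) / 2) * phi((u1 - u2) / 2)

u1, u2 = 0.8, -1.3
lhs = g(u1, u2)
rhs = phi(u1, var=2.0) * phi(u2, var=2.0)   # product of two N(0, 2) densities
```

The two expressions match at every point, confirming the Jacobian factor of $\tfrac12$.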

Theorem: Let $x_1, x_2, \ldots, x_n$ denote random variables with joint probability density function $f(x_1, x_2, \ldots, x_n)$. Let $u_1 = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + c_1$, $u_2 = a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + c_2$, $\ldots$, $u_n = a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n + c_n$ (that is, $\mathbf{u} = A\mathbf{x} + \mathbf{c}$ with $A = (a_{ij})$ invertible) define an invertible linear transformation from the $x$'s to the $u$'s.

Then the joint probability density function of $u_1, u_2, \ldots, u_n$ is given by: $g(u_1, \ldots, u_n) = \dfrac{1}{|\det A|}\, f\big(A^{-1}(\mathbf{u} - \mathbf{c})\big)$.

Theorem: Suppose that the random vector $\mathbf{x} = (x_1, x_2, \ldots, x_p)'$ has a $p$-variate normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$. If $A$ is an invertible $p \times p$ matrix and $\mathbf{c}$ is a $p \times 1$ vector, then $\mathbf{u} = A\mathbf{x} + \mathbf{c}$ has a $p$-variate normal distribution with mean vector $A\boldsymbol{\mu} + \mathbf{c}$ and covariance matrix $A\Sigma A'$.
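The theorem's parameters $A\boldsymbol{\mu} + \mathbf{c}$ and $A\Sigma A'$ can be verified by simulation (all numeric values below are made-up illustrations):

```python
import numpy as np

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])       # invertible
c = np.array([3.0, -2.0])

# Parameters predicted by the theorem.
mean_u = A @ mu + c
cov_u = A @ Sigma @ A.T

# Monte Carlo check: transform samples of x and compare moments.
rng = np.random.default_rng(3)
x = rng.multivariate_normal(mu, Sigma, size=200_000)
u = x @ A.T + c
```

The sample mean and covariance of `u` land close to `mean_u` and `cov_u`.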

Proof: The density of $\mathbf{x}$ is $f(\mathbf{x}) = \dfrac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\!\left[-\tfrac12(\mathbf{x} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})\right]$. With $\mathbf{u} = A\mathbf{x} + \mathbf{c}$ the inverse is $\mathbf{x} = A^{-1}(\mathbf{u} - \mathbf{c})$, so $g(\mathbf{u}) = \dfrac{1}{|\det A|}\, f\big(A^{-1}(\mathbf{u} - \mathbf{c})\big)$.

Since $\mathbf{x} - \boldsymbol{\mu} = A^{-1}(\mathbf{u} - \mathbf{c} - A\boldsymbol{\mu})$, we have $(\mathbf{x} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu}) = (\mathbf{u} - A\boldsymbol{\mu} - \mathbf{c})'(A\Sigma A')^{-1}(\mathbf{u} - A\boldsymbol{\mu} - \mathbf{c})$. Also $|A\Sigma A'|^{1/2} = |\det A|\,|\Sigma|^{1/2}$, and hence $g(\mathbf{u}) = \dfrac{1}{(2\pi)^{p/2}|A\Sigma A'|^{1/2}} \exp\!\left[-\tfrac12(\mathbf{u} - A\boldsymbol{\mu} - \mathbf{c})'(A\Sigma A')^{-1}(\mathbf{u} - A\boldsymbol{\mu} - \mathbf{c})\right]$, the $p$-variate normal density with mean vector $A\boldsymbol{\mu} + \mathbf{c}$ and covariance matrix $A\Sigma A'$. QED

Theorem: Suppose that the random vector $\mathbf{x}$ has a $p$-variate normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$. Let $A$ be a $q \times p$ matrix of rank $q \le p$. Then $\mathbf{u} = A\mathbf{x}$ has a $q$-variate normal distribution with mean vector $A\boldsymbol{\mu}$ and covariance matrix $A\Sigma A'$.

Proof: Let $B$ be a $(p - q) \times p$ matrix so that $C = \begin{pmatrix} A \\ B \end{pmatrix}$ is invertible. Then $C\mathbf{x} = \begin{pmatrix} A\mathbf{x} \\ B\mathbf{x} \end{pmatrix}$ is $p$-variate normal with mean vector $C\boldsymbol{\mu} = \begin{pmatrix} A\boldsymbol{\mu} \\ B\boldsymbol{\mu} \end{pmatrix}$ and covariance matrix $C\Sigma C' = \begin{pmatrix} A\Sigma A' & A\Sigma B' \\ B\Sigma A' & B\Sigma B' \end{pmatrix}$.

Thus the marginal distribution of $A\mathbf{x}$ is $q$-variate normal with mean vector $A\boldsymbol{\mu}$ and covariance matrix $A\Sigma A'$.
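This last theorem contains the marginal-distribution theorem as the special case $A = \begin{pmatrix} I_q & 0 \end{pmatrix}$, which simply selects the first $q$ components. A short sketch with made-up values:

```python
import numpy as np

# Illustration values for a 3-variate normal.
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 1.5],
                  [0.5, 1.5, 2.0]])
q = 2

# Selection matrix A = [I_q  0] picks out (x1, x2).
A = np.hstack([np.eye(q), np.zeros((q, 3 - q))])

marg_mean = A @ mu          # equals mu[:q]
marg_cov = A @ Sigma @ A.T  # equals Sigma[:q, :q]
```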