Canonical Correlation
Xuhua Xia Slide 1

Correlation
–Simple correlation: between two variables
–Multiple and partial correlations: between one variable and a set of other variables
–Canonical correlation: between two sets of variables, each containing more than one variable

Simple and multiple correlations are special cases of canonical correlation.
–Multiple: x1 on x2 and x3
–Partial: between X and Y, with Z being controlled for
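The partial correlation between X and Y controlling for Z can be computed from the three pairwise Pearson correlations as r_xy.z = (r_xy - r_xz*r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The slides use R; as a minimal illustration of the same formula, here is a self-contained Python sketch with simulated (not the slides') data:

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient between two lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y with z held constant:
    r_xy.z = (r_xy - r_xz*r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2))."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Simulated data: x and y both depend on z, so their raw correlation
# is inflated; partialling out z shrinks it toward zero.
random.seed(1)
z = [random.gauss(0, 1) for _ in range(200)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]
print(pearson(x, y))          # substantial raw correlation
print(partial_corr(x, y, z))  # much smaller once z is controlled for
```

This is the hand calculation that the ggm parcor() call on the next slide is meant to verify.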

Xuhua Xia Slide 2

Review of correlation

(Data table with columns X, Z and Y; the numeric values are not reproduced in this transcript.)

–Compute Pearson correlation coefficients between X and Z, between X and Y, and between Z and Y.
–Compute the partial correlation coefficient between X and Y, controlling for Z (i.e., the correlation between X and Y when Z is held constant), using the equation in the previous slide.
–Run R to verify your calculation:

install.packages("ggm")
library(ggm)
md <- read.table("XYZ.txt", header=T)
cor(md)       # pairwise Pearson correlations
s <- var(md)  # covariance matrix
parcor(s)     # partial correlations, each pair given the rest
install.packages("psych")
library(psych)
smc(s)        # squared multiple correlations

Xuhua Xia Slide 3

Data for canonical correlation

# First three variables: physical (weight, waist, pulse)
# Last three variables: exercise (chins, situps, jumps)
# Subjects: middle-aged men
(The data table itself is not reproduced in this transcript.)

Xuhua Xia Slide 4

Many Possible Correlations

With multiple DVs (say A, B, C) and IVs (say a, b, c, d, e), there could be many correlation patterns:
–Variable A in the DV set could be correlated to variables a, b, c in the IV set
–Variable B in the DV set could be correlated to variables c, d in the IV set
–Variable C in the DV set could be correlated to variables a, c, e in the IV set
With this plethora of possible correlations, what is the best way of summarizing them?

Xuhua Xia Slide 5

Dealing with Two Sets of Variables

The simple correlation approach:
–For N DVs and M IVs, calculate the simple correlation coefficient between each of the N DVs and each of the M IVs, yielding a total of N*M correlation coefficients.
The multiple correlation approach:
–For N DVs and M IVs, calculate a multiple or partial correlation coefficient between each of the N DVs and the set of M IVs, yielding a total of N correlation coefficients.
The canonical correlation approach:
–Summarize the linear relationship between the two sets with a few pairs of canonical variates, ordered by the strength of their correlations.
Note: all of these deal with linear correlations.
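The canonical correlation approach is what the remaining slides carry out with the CCA package in R. The underlying computation is standard: whiten each set and take the singular values of the cross-correlation matrix. A Python sketch of that textbook construction (not the CCA package's implementation; simulated data), assuming numpy is available:

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations as the singular values of
    Sxx^{-1/2} Sxy Syy^{-1/2} (the standard textbook construction)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    # Singular values, in decreasing order, are the canonical correlations
    return np.linalg.svd(M, compute_uv=False)

# Simulated example: the first Y variable tracks the first X variable,
# so one strong canonical correlation is expected.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Y = np.column_stack([X[:, 0] + rng.normal(size=100),
                     rng.normal(size=(100, 2))])
print(canonical_correlations(X, Y))
```

The first value should be large (the X0/Y0 link), with the remaining two near chance level.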

Xuhua Xia Slide 6

Correlation matrix

md <- read.table("Cancor.txt", header=T)
attach(md)
R <- cor(md)
R
# 6 x 6 correlation matrix over weight, waist, pulse, chins, situps, jumps
# (the numeric entries are not reproduced in this transcript)

Slide 7

Multiple correlations

fit <- lm(weight~chins+situps+jumps); summary(fit)
# Coefficient table and R-squared values not reproduced in this transcript;
# intercept (***) and situps (**) were flagged as significant.

fit <- lm(waist~chins+situps+jumps); summary(fit)
# Intercept (***) and situps (***) flagged as significant.

fit <- lm(pulse~chins+situps+jumps); summary(fit)
# Intercept (***) flagged as significant; no exercise variable significant.

Slide 8

Multiple correlation

fit <- lm(chins~weight+waist+pulse); summary(fit)
# Coefficient table and R-squared values not reproduced in this transcript;
# intercept (*) and waist (*) flagged as significant.

fit <- lm(situps~weight+waist+pulse); summary(fit)
# Intercept (***) and waist (**) flagged as significant.

fit <- lm(jumps~weight+waist+pulse); summary(fit)
# No term flagged as significant.
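Each summary(fit) above reports a Multiple R-squared; the multiple correlation between the response and the predictor set is its square root, and it equals the Pearson correlation between the response and the fitted values. A Python sketch of that relationship (simulated data, not the slides' dataset), assuming numpy is available:

```python
import numpy as np

def multiple_correlation(y, X):
    """R = sqrt(R^2) from regressing y on the columns of X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])     # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    yhat = A @ beta
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return np.sqrt(1 - ss_res / ss_tot)

# Simulated example: y depends on the first two of three predictors.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=50)
print(multiple_correlation(y, X))
```

This is the quantity the "multiple correlation approach" on Slide 5 computes once per DV.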

Canonical correlation (cc)

install.packages("ggplot2")
install.packages("GGally")
install.packages("CCA")
install.packages("CCP")
require(ggplot2)
require(GGally)
require(CCA)
require(CCP)
phys <- md[,1:3]  # weight, waist, pulse
exer <- md[,4:6]  # chins, situps, jumps
matcor(phys, exer)
cc1 <- cc(phys, exer)
cc1

cc output

$cor              # canonical correlations
$xcoef            # raw canonical coefficient matrix U (rows: weight, waist, pulse)
$ycoef            # raw canonical coefficient matrix V (rows: chins, situps, jumps)
# phys*U: raw canonical variates for phys
# exer*V: raw canonical variates for exer
$scores$xscores   # standardized canonical variates
(Numeric values are not reproduced in this transcript.)

Standardized canonical variates

$scores$xscores  # 20 x 3 matrix: standardized canonical variates for the phys set
$scores$yscores  # 20 x 3 matrix: standardized canonical variates for the exer set
(Numeric values are not reproduced in this transcript.)

Canonical structure: Correlations

$scores$corr.X.xscores  # correlations between phys variables and CVs_U
$scores$corr.Y.xscores  # correlations between exer variables and CVs_U
$scores$corr.X.yscores  # correlations between phys variables and CVs_V
$scores$corr.Y.yscores  # correlations between exer variables and CVs_V
(Each is a 3 x 3 matrix; numeric values are not reproduced in this transcript.)
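Each of the four blocks above is nothing more exotic than a matrix of plain Pearson correlations between original variables and canonical variates. A Python sketch of that computation (hypothetical stand-in arrays, not the slides' data; numpy assumed available):

```python
import numpy as np

def structure_correlations(X, scores):
    """Correlate each original variable (column of X) with each
    canonical variate (column of scores): the 'canonical structure'."""
    return np.array([[np.corrcoef(X[:, i], scores[:, j])[0, 1]
                      for j in range(scores.shape[1])]
                     for i in range(X.shape[1])])

# Hypothetical stand-ins for the phys data and its coefficient matrix U:
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
U = rng.normal(size=(3, 3))        # stand-in raw coefficient matrix
scores = (X - X.mean(axis=0)) @ U  # variates = centered data times coefficients
print(structure_correlations(X, scores))
```

With the real cc1 object, the same arithmetic applied to phys and cc1$scores$xscores reproduces corr.X.xscores.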

Significance: p.asym in CCP

vCancor <- cc1$cor
# p.asym(rho, N, p, q, tstat = "Wilks|Hotelling|Pillai|Roy")
res <- p.asym(vCancor, length(md$weight), 3, 3, tstat = "Wilks")
# Output: Wilks' Lambda, using F-approximation (Rao's F), with columns
# stat, approx, df1, df2, p.value (numeric values not reproduced) and rows:
#   1 to 3: at least one cancor significant?
#   2 to 3: significant relationship after excluding cancor 1?
#   3 to 3: significant relationship after excluding cancors 1 and 2?
plt.asym(res, rhostart=1)
plt.asym(res, rhostart=2)
plt.asym(res, rhostart=3)
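The sequential logic behind these tests: the k-th Wilks' Lambda is the product of (1 - rho_i^2) over the canonical correlations from k onward, so each row tests what remains after dropping the larger correlations. p.asym uses Rao's F-approximation; the sketch below uses Bartlett's simpler chi-square approximation of the same statistic, in Python, with hypothetical canonical correlations (the transcript does not preserve the real ones):

```python
import math

def wilks_tests(rho, n, p, q):
    """Sequential Wilks' Lambda for canonical correlations rho, with
    Bartlett's chi-square approximation:
    Lambda_k = prod_{i>=k} (1 - rho_i^2),
    chi2 = -(n - 1 - (p + q + 1)/2) * ln(Lambda_k),
    df = (p - k)(q - k) for 0-based k."""
    results = []
    for k in range(len(rho)):
        lam = 1.0
        for r in rho[k:]:
            lam *= (1 - r * r)
        chi2 = -(n - 1 - (p + q + 1) / 2) * math.log(lam)
        df = (p - k) * (q - k)
        results.append((lam, chi2, df))
    return results

# Hypothetical canonical correlations for 20 subjects, p = q = 3:
for lam, chi2, df in wilks_tests([0.80, 0.26, 0.07], n=20, p=3, q=3):
    print(f"Lambda={lam:.4f}  chi2={chi2:.2f}  df={df}")
```

Lambda grows toward 1 as the stronger correlations are excluded, which is why the "2 to 3" and "3 to 3" rows are usually much less significant than "1 to 3".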

Slide 14

Ecology data: Assignment

24 sites; for each site, record the coverage of four species and the concentration of four chemicals.