Xuhua Xia Correlation and Regression
- Introduction to linear correlation and regression
- Numerical illustrations
- SAS and linear correlation/regression
  - CORR
  - REG
  - GLM
- Assumptions of linear correlation/regression
- Model II regression

Xuhua Xia Introduction
- Correlation
  - Bivariate correlation
  - Multiple correlation
  - Partial correlation
  - Canonical correlation
- Regression
  - Simple regression
  - Multiple regression
  - Nonlinear regression

Xuhua Xia Regression Coefficient
[Worked example table with columns X and Y and a Sum row (15 and 10); the individual data values did not survive the transcript.]
Change Y to 3, 4, 5, 6, 7 for students to recompute a and b.
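As a worked illustration of the suggested exercise (the X values are assumed here to be 1, 2, 3, 4, 5, consistent with the Sum row showing 15, since the table itself was lost), the replacement values Y = 3, 4, 5, 6, 7 give:
\[
\bar{x}=3,\quad \bar{y}=5,\qquad
b=\frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}=\frac{10}{10}=1,\qquad
a=\bar{y}-b\,\bar{x}=5-1\cdot 3=2 .
\]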

Xuhua Xia Least-squares method
Least-squares estimate of the sample mean
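The formula on this slide did not survive the transcript; the standard result the title refers to is that the sample mean is the value minimizing the sum of squared deviations:
\[
Q(a)=\sum_{i=1}^{n}(y_i-a)^2,\qquad
\frac{dQ}{da}=-2\sum_{i=1}^{n}(y_i-a)=0
\;\Rightarrow\; \hat{a}=\frac{1}{n}\sum_{i=1}^{n} y_i=\bar{y}.
\]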

Least-Square Estimation of Regression Coefficient
With fitted values \(\hat{y}_i = a + b x_i\), the least-squares criterion is
\[
Q=\sum_i (y_i-\hat{y}_i)^2=\sum_i (y_i-a-b x_i)^2 .
\]
Setting the partial derivatives to zero gives the normal equations:
\[
\frac{\partial Q}{\partial a}=-2\sum_i (y_i-a-b x_i)=0,
\qquad
\frac{\partial Q}{\partial b}=-2\sum_i (y_i-a-b x_i)\,x_i=0 .
\]
From the first equation, \(\sum_i y_i = na + b\sum_i x_i\), so \(a=\bar{y}-b\bar{x}\). A trick to simplify the estimation: substituting this back, each residual becomes \(y_i-\hat{y}_i=(y_i-\bar{y})-b(x_i-\bar{x})\), and the second equation then reduces to
\[
b=\frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}.
\]

Xuhua Xia Maximum Likelihood Method (R. A. Fisher)
Estimating the proportion of males (p) of a fish species in a pond: two samples are taken, one with 10 fish of which 5 are males, and another with 12 fish of which only 3 are males.
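A sketch of the likelihood calculation for this example, treating every fish as an independent Bernoulli observation (the slide's own derivation did not survive the transcript):
\[
L(p)\propto p^{5}(1-p)^{5}\cdot p^{3}(1-p)^{9}=p^{8}(1-p)^{14},\qquad
\frac{d\ln L}{dp}=\frac{8}{p}-\frac{14}{1-p}=0
\;\Rightarrow\;\hat{p}=\frac{8}{22}\approx 0.364 .
\]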

Xuhua Xia Correlation & Regression Coefficients Sum X Y

Xuhua Xia Regression Coefficient
[Worked table with X and Y columns and a Sum row; the numerical values did not survive the transcript.]

Xuhua Xia The Beetle Experiment

Xuhua Xia Regression Coefficient

Xuhua Xia Partition of variance
[Figure: scatter of Y against X with the fitted line, showing the total deviation of each point from the mean split into an explained deviation and an unexplained deviation.]
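Written out, the partition the figure illustrates is:
\[
y_i-\bar{y}=(\hat{y}_i-\bar{y})+(y_i-\hat{y}_i),
\qquad
\sum_i (y_i-\bar{y})^2=\sum_i(\hat{y}_i-\bar{y})^2+\sum_i(y_i-\hat{y}_i)^2,
\]
i.e., SS_total = SS_model + SS_error (the cross term vanishes for the least-squares fit).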

Xuhua Xia ANOVA test in regression
The partition of SS in regression allows an ANOVA significance test of the fitted model.
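For simple linear regression with n observations, the test the slide refers to is:
\[
F=\frac{SS_{\text{model}}/1}{SS_{\text{error}}/(n-2)}=\frac{MS_{\text{model}}}{MS_{\text{error}}},
\qquad
r^2=\frac{SS_{\text{model}}}{SS_{\text{total}}},
\]
with F compared to the F distribution on 1 and n-2 degrees of freedom (equivalently, F = t² for the t test of the slope).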

Xuhua Xia SAS Program Listing
/* Weight loss (in mg) of 9 batches of 25 Tribolium beetles after
   six days of starvation at nine different humidities */
data beetle;
  input Humidity WtLoss;
  /* the nine Humidity/WtLoss data lines did not survive the transcript */
  cards;
;
proc reg;
  Title 'Simple linear regression of WtLoss on Humidity';
  model WtLoss=Humidity / R CLM alpha=0.01 CLI;
  plot WtLoss*Humidity / conf;
  plot WtLoss*Humidity / pred;
  plot residual.*Humidity;
run;
proc glm;
  model WtLoss=Humidity;
  Title 'Simple linear regression of WtLoss on Humidity';
run;

Xuhua Xia SAS Output
[The numerical values of the output did not survive the transcript; the layout was:]

Dependent Variable: WTLOSS
                          Sum of      Mean
Source         DF         Squares     Square      F Value    Prob>F
Model
Error
C Total

Root MSE                R-square
Dep Mean                Adj R-sq
C.V. (= 100*Root MSE / Mean)

Parameter Estimates
                  Parameter    Standard    T for H0:
Variable    DF    Estimate     Error       Parameter=0    Prob > |T|
INTERCEP
HUMIDITY

Xuhua Xia Confidence Limits for β
[Figure: scatter of Y against X with the fitted line; the confidence limits for the slope β are:]
\[
b \pm t_{\alpha/2,\,n-2}\,\sqrt{\frac{MSE}{SS_X}},
\qquad SS_X=\sum_i (X_i-\bar{X})^2 .
\]

Xuhua Xia Confidence Limits for Y
[Figure: scatter of Y against X with the fitted line and its confidence band; the confidence limits for the mean of Y at a given X_i are:]
\[
\hat{y}_i \pm t_{\alpha/2,\,n-2}\,
\sqrt{MSE\left(\frac{1}{n}+\frac{(X_i-\bar{X})^2}{SS_X}\right)}.
\]

[Figure: fitted regression of WtLoss on Humidity with 99% confidence limits for the predicted means; the coefficients of the fitted equation "WtLoss = ... + ... Humidity" did not survive the transcript.]
/* 99% CL of predicted means, equivalent to Predicted ± t(alpha, df) * SE (see the equation on the earlier slide) */
plot WtLoss*Humidity / conf;

[Figure: fitted regression of WtLoss on Humidity with 99% prediction intervals for individual observations.]
/* 99% CL of prediction intervals, equivalent to Predicted ± t(alpha, df) * STD (with n = 1 in the equation) */
plot WtLoss*Humidity / pred;
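The prediction interval referred to in the comment, for a single new observation at a given humidity X_0, adds an extra MSE term (the "1") inside the square root relative to the mean-response limits shown earlier:
\[
\hat{y}_0 \pm t_{\alpha/2,\,n-2}\,
\sqrt{MSE\left(1+\frac{1}{n}+\frac{(X_0-\bar{X})^2}{SS_X}\right)}.
\]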

Xuhua Xia Regression summary

Xuhua Xia Assumptions
The regression model: Y_i = α + β X_i + ε_i
Assumptions:
- The error term ε has mean 0, is independent and normally distributed at each value of X, and has the same variance at each value of X (homoscedasticity).
- Y is linearly related to X.
- There is negligible error (e.g., measurement error) in X; otherwise a Model II regression is needed.

Xuhua Xia More plot functions
data WtLoss;
  input Humidity WtLoss;
  /* the Humidity/WtLoss data lines did not survive the transcript */
  cards;
;
proc reg;
  model WtLoss=Humidity / alpha=0.01;
  plot WtLoss*Humidity / pred;
  plot residual.*predicted. / symbol='.';
  Title 'Simple linear regression of WtLoss on Humidity';
run;

Xuhua Xia 3D Scatter plot
data My3D;
  input X Y Z;
  /* the X/Y/Z data lines did not survive the transcript */
  datalines;
;
proc g3d;
  scatter X*Y=Z;
run;

Xuhua Xia Spurious Correlation
[Figure: liquor consumption, number of churches, and city size; the two variables appear correlated only because both increase with city size.]

Xuhua Xia Spurious Correlation
data Liquor;
  input Liquor Church PopSize;
  /* the data lines did not survive the transcript */
  datalines;
;
proc reg;
  model Liquor = PopSize;
run;
proc reg;
  model Liquor = PopSize / NoInt;
run;
Forcing the regression through the origin (NoInt) changes how SS_model and SS_total are computed: they become uncorrected sums of squares (SUMSQ) rather than sums of squared deviations from the mean (DEVSQ), i.e., SS_total = Σ y_i² instead of Σ (y_i - ȳ)², so the R² values of the two models are not directly comparable. One can use the adjusted R² to choose the model.