1
Correlation and Regression
2
Correlation Analysis
Pearson's correlation coefficient (r for a sample, ρ for the population) measures the degree to which there is a linear association between two metric variables.
It measures the strength of the relationship between the two variables.
The correlation coefficient lies between −1 and +1.
The correlation coefficient is NOT an indicator of a causal relationship between variables.
3
Positive Linear Correlation
General trend in the plotted points is from bottom left to top right.
Negative Linear Correlation
General trend in the plotted points is from top left to bottom right.
No Linear Correlation
No general trend in the plotted points, or a non-linear trend.
The strength of the linear correlation can be judged by looking at how closely the points approximate a straight line.
4
Scatter diagram
7
Curious?
8
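Variable      x       y     xy         x²           y²
              3.545   30    106.35     12.567025    900
              2.6     32    83.2       6.76         1024
              3.245   30    97.35      10.530025    900
              3.93    24    94.32      15.4449      576
              3.995   26    103.87     15.960025    676
              3.115   30    93.45      9.703225     900
              3.235   33    106.755    10.465225    1089
              3.225   27    87.075     10.400625    729
              2.44    37    90.28      5.9536       1369
              3.24    32    103.68     10.4976      1024
              2.29    37    84.73      5.2441       1369
              2.5     34    85         6.25         1156
              4.02    26    104.52     16.1604      676
Sums          41.38   398   1240.58    135.93675    12388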
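The column sums in a table like this plug straight into the computational formula for Pearson's r. The following is a minimal Python sketch, assuming the 13 (x, y) pairs are those of the slide's table; the variable names are illustrative and not part of the original presentation.

```python
import math

# (x, y) pairs assumed from the slide's data table.
x = [3.545, 2.6, 3.245, 3.93, 3.995, 3.115, 3.235,
     3.225, 2.44, 3.24, 2.29, 2.5, 4.02]
y = [30, 32, 30, 24, 26, 30, 33, 27, 37, 32, 37, 34, 26]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Computational formula:
# r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"sum x = {sum_x:.2f}, sum y = {sum_y}, sum y^2 = {sum_y2}")
print(f"r = {r:.3f}")
```

With these values r comes out close to −0.90, which points to a strong negative linear association between x and y.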
9
Testing the Significance of the Correlation Coefficient (why?)
Null hypothesis: H0: ρ = 0
Alternative hypothesis: Ha: ρ ≠ 0
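One reason to test at all: a sample r can differ from zero by chance even when the population ρ is zero. A common way to test H0: ρ = 0 is the statistic t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below assumes the 13 (x, y) pairs from the slide's table and a 5% two-tailed test; the helper name `pearson_r` and the hard-coded critical value (taken from a standard t table for 11 degrees of freedom) are assumptions of this example, not part of the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation via centered sums of squares and cross-products."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# (x, y) pairs assumed from the slide's data table.
x = [3.545, 2.6, 3.245, 3.93, 3.995, 3.115, 3.235,
     3.225, 2.44, 3.24, 2.29, 2.5, 4.02]
y = [30, 32, 30, 24, 26, 30, 33, 27, 37, 32, 37, 34, 26]

n = len(x)
r = pearson_r(x, y)
df = n - 2

# Test statistic for H0: rho = 0
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumed: two-tailed critical t for alpha = 0.05 and df = 11 (standard t table).
T_CRIT = 2.201
reject_h0 = abs(t_stat) > T_CRIT

print(f"r = {r:.3f}, t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

Here |t| is well above the critical value, so H0 would be rejected at the 5% level under these assumptions.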
10
Output 1
11
Regression Analysis
Used to understand the nature of the relationship between two or more variables.
A dependent or response variable (Y) is related to one or more independent or predictor variables (Xs).
Object is to build a regression model relating the dependent variable to one or more independent variables (how is y changing with x?).
The model can be used to describe, predict, and control the variable of interest on the basis of the independent variables.
12
Simple Linear Regression
Yi = β0 + β1 Xi + εi
Where
Y: Dependent variable
X: Independent variable
β0: Intercept. Mean value of the dependent variable (Y) when the independent variable (X) is zero
13
Simple Linear Regression (Contd.)
β1: Model parameter. Slope that measures the change in the mean value of the dependent variable associated with a one-unit increase in the independent variable
εi: Error term that describes the effects on Yi of all factors other than the value of Xi
14
Regression – Illustrative Example
Let us check whether x is related to y. Calculate the point estimates b0 and b1 of the unknown parameters β0 and β1.
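The least-squares point estimates can be computed by hand as b1 = Sxy / Sxx and b0 = ȳ − b1·x̄, where Sxy and Sxx are the centered sums of cross-products and squares. The following Python sketch assumes the same 13 (x, y) pairs as the slide's data table; the variable names are illustrative.

```python
# (x, y) pairs assumed from the slide's data table.
x = [3.545, 2.6, 3.245, 3.93, 3.995, 3.115, 3.235,
     3.225, 2.44, 3.24, 2.29, 2.5, 4.02]
y = [30, 32, 30, 24, 26, 30, 33, 27, 37, 32, 37, 34, 26]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Centered sums: Sxy = sum((x - x_bar)(y - y_bar)), Sxx = sum((x - x_bar)^2)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)

b1 = s_xy / s_xx         # slope estimate
b0 = y_bar - b1 * x_bar  # intercept estimate

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Under these assumptions the fitted line is roughly ŷ = 50.4 − 6.2x: each one-unit increase in x is associated with a drop of about 6.2 in the mean of y.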
15
Output linear regression
How do we write the regression equation?
16
Testing the Significance of the Independent Variables
Null Hypothesis (H0: β1 = 0): There is no linear relationship between the independent and dependent variables.
Alternative Hypothesis (Ha: β1 ≠ 0): There is a linear relationship between the independent and dependent variables.
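For simple linear regression, these hypotheses are usually tested with t = b1 / SE(b1), where SE(b1) = sqrt(MSE / Sxx) and MSE = SSE / (n − 2). The sketch below assumes the 13 (x, y) pairs from the slide's data table and a 5% two-tailed test; the critical value is taken from a standard t table for 11 degrees of freedom and is an assumption of this example.

```python
import math

# (x, y) pairs assumed from the slide's data table.
x = [3.545, 2.6, 3.245, 3.93, 3.995, 3.115, 3.235,
     3.225, 2.44, 3.24, 2.29, 2.5, 4.02]
y = [30, 32, 30, 24, 26, 30, 33, 27, 37, 32, 37, 34, 26]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((a - x_bar) ** 2 for a in x)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))

# Least-squares estimates
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual sum of squares and mean squared error (n - 2 degrees of freedom)
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
mse = sse / (n - 2)

# Standard error of the slope and the t statistic for H0: beta1 = 0
se_b1 = math.sqrt(mse / s_xx)
t_stat = b1 / se_b1

# Assumed: two-tailed critical t for alpha = 0.05 and df = 11 (standard t table).
T_CRIT = 2.201
reject_h0 = abs(t_stat) > T_CRIT

print(f"b1 = {b1:.3f}, SE(b1) = {se_b1:.3f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

With a single predictor, this t statistic is numerically the same as the one for testing ρ = 0, so the two tests lead to the same conclusion.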
17
Output linear regression
How do we interpret the results?
18
Coefficient of Determination (R²)
Measure of the regression model's ability to predict
R² = (SST − SSE) / SST = SSM / SST = Explained Variation / Total Variation
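To make the decomposition concrete, the sketch below computes SST, SSE, and SSM for a simple linear fit and forms R² from them. It assumes the 13 (x, y) pairs from the slide's data table; the variable names are illustrative.

```python
# (x, y) pairs assumed from the slide's data table.
x = [3.545, 2.6, 3.245, 3.93, 3.995, 3.115, 3.235,
     3.225, 2.44, 3.24, 2.29, 2.5, 4.02]
y = [30, 32, 30, 24, 26, 30, 33, 27, 37, 32, 37, 34, 26]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares fit y_hat = b0 + b1 * x
b1 = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
      / sum((a - x_bar) ** 2 for a in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * a for a in x]

sst = sum((b - y_bar) ** 2 for b in y)               # total variation
sse = sum((b - f) ** 2 for b, f in zip(y, y_hat))    # unexplained variation
ssm = sst - sse                                      # explained variation

r_squared = ssm / sst

print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SSM = {ssm:.2f}, R^2 = {r_squared:.3f}")
```

Under these assumptions R² is about 0.81, meaning roughly 81% of the variation in y is explained by the linear relationship with x. For simple linear regression this equals the square of the correlation coefficient.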
19
Output linear regression
20
Multiple Linear Regression
A linear combination of predictor variables is used to predict the outcome (response) variable.
Involves computation of a multiple linear regression equation.
More than one independent variable is included in a single linear regression model.
21
Evaluating the Importance of Independent Variables
Which of the independent variables has the greatest influence on the dependent variable?
Consider the standardized estimates.
CAVEAT: Use standardized estimates to compare independent variables within a sample ONLY. Do not use them for across-sample comparisons.
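A standardized estimate rescales each raw coefficient by the ratio of standard deviations, beta_j = b_j · s(x_j) / s(y), so that predictors measured in different units can be compared. The sketch below illustrates this for two predictors. The data are made up for illustration and do NOT come from the slides: y is constructed as exactly 2 + 3·x1 + 0.5·x2 so the fitted coefficients are easy to check. The helper `centered_cross` is a hypothetical name.

```python
import statistics

# Made-up illustrative data (not from the slides): y = 2 + 3*x1 + 0.5*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [10, 30, 20, 60, 40, 50]
y = [10, 23, 21, 44, 37, 45]

def centered_cross(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

s11 = centered_cross(x1, x1)
s22 = centered_cross(x2, x2)
s12 = centered_cross(x1, x2)
s1y = centered_cross(x1, y)
s2y = centered_cross(x2, y)

# Solve the 2x2 normal equations for the slopes (Cramer's rule)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = statistics.mean(y) - b1 * statistics.mean(x1) - b2 * statistics.mean(x2)

# Standardized estimates: raw slope times sd(predictor) / sd(response)
sd_y = statistics.stdev(y)
beta1 = b1 * statistics.stdev(x1) / sd_y
beta2 = b2 * statistics.stdev(x2) / sd_y

print(f"raw:          b1 = {b1:.2f}, b2 = {b2:.2f}")
print(f"standardized: beta1 = {beta1:.3f}, beta2 = {beta2:.3f}")
```

In this toy example the raw slope for x1 is larger than the one for x2, yet the standardized estimate for x2 is larger, because x2 varies over a much wider range. That is why raw coefficients alone are a poor guide to relative influence.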