Correlation and Regression

SCATTER DIAGRAM
The simplest way to assess the relationship between two quantitative variables is to draw a scatter diagram. From the diagram of age against blood pressure (BP), we notice that as age increases there is a general tendency for BP to increase. The diagram does not, however, give a quantitative estimate of the degree of the relationship.

CORRELATION COEFFICIENT
The correlation coefficient is an index of the degree of association between two variables. It can also be used to compare the degree of association in different groups. For example, we may be interested in knowing whether the degree of association between age and systolic BP is the same (or different) in males and females.
The correlation coefficient is denoted by the symbol 'r'. 'r' ranges from -1 to +1.

When high values of one variable tend to occur with high values of the other (and low with low), we say that there is a positive correlation.
When high values of one variable tend to occur with low values of the other (and vice versa), we say that there is a negative correlation.

A NOTE OF CAUTION
The correlation coefficient is purely a measure of the degree of association and does not provide any evidence of a cause-effect relationship.
It is valid only in the range of values studied. Extrapolation of the association may not always be valid (e.g., age and grip strength).

r measures the degree of linear relationship. r = 0 does not necessarily mean that there is no relationship between the two characteristics under study; the relationship could be curvilinear.
Spurious correlation: the production of steel in the UK and the population of India over the last 25 years may be highly correlated without any real connection between them.
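The point that r = 0 does not rule out a relationship can be checked with a small sketch. The data below are hypothetical (not from the slides): Y is fully determined by X through a curve, yet the linear correlation is zero. The helper name `pearson_r` is an illustrative choice.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y depends exactly on x, but along a curve (y = x squared).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # essentially zero despite the exact curvilinear relation
```

A scatter diagram of these points would show the U-shaped pattern immediately, which is one reason to plot the data before relying on r.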

r does not give the rate of change in one variable for changes in the other variable. E.g., age and systolic BP: males, r = 0.7; females, r = 0.5. From this one should not conclude that systolic BP increases with age at a higher rate among males than among females.
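A minimal sketch of why r says nothing about the rate of change: in the hypothetical data below, group A has the higher correlation but the smaller slope. All numbers and names are made up for illustration.

```python
import math

def slope_and_r(xs, ys):
    """Return (least squares slope of y on x, Pearson r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]

# Group A (hypothetical): points lie exactly on a shallow line, so r is 1.
ys_a = [10.5, 11.0, 11.5, 12.0, 12.5]
# Group B (hypothetical): a steeper trend with scatter around it, so r is below 1.
ys_b = [1, 6, 5, 10, 8]

slope_a, r_a = slope_and_r(xs, ys_a)
slope_b, r_b = slope_and_r(xs, ys_b)
print(f"Group A: slope = {slope_a:.2f}, r = {r_a:.2f}")
print(f"Group B: slope = {slope_b:.2f}, r = {r_b:.2f}")
```

The slope depends on the spread of both variables as well as on r, so comparing rates of change between groups calls for comparing regression coefficients, not correlation coefficients.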

PROPERTY OF THE CORRELATION COEFFICIENT
The correlation coefficient is unaffected by adding or subtracting a constant, or by multiplying or dividing by a positive constant, applied to all the values of X and Y.
Corr. coeff. between X and Y = 0.7
Corr. coeff. between X + 10 and Y - 6 = 0.7
Corr. coeff. between 5X and 2Y = 0.7
If the correlation coefficient between height in inches and weight in pounds is, say, 0.6, the correlation coefficient between height in cm and weight in kg will also be 0.6.
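This property can be verified numerically. The sketch below uses hypothetical heights and weights (not from the slides) and checks that shifting, rescaling, and unit conversion leave r unchanged. It also shows one caveat: multiplying a single variable by a negative constant flips the sign of r.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (inches) and weights (pounds).
height_in = [60, 62, 65, 67, 70, 72, 74]
weight_lb = [115, 130, 128, 150, 160, 158, 185]

r_original = pearson_r(height_in, weight_lb)
r_shifted = pearson_r([h + 10 for h in height_in], [w - 6 for w in weight_lb])
r_scaled = pearson_r([5 * h for h in height_in], [2 * w for w in weight_lb])
# Unit conversion: inches to cm and pounds to kg are positive rescalings.
r_metric = pearson_r([h * 2.54 for h in height_in], [w * 0.4536 for w in weight_lb])
# A negative multiplier on one variable reverses the direction of association.
r_negated = pearson_r([-h for h in height_in], weight_lb)

for label, value in [("original", r_original), ("shifted", r_shifted),
                     ("scaled", r_scaled), ("metric", r_metric),
                     ("negated X", r_negated)]:
    print(f"{label:>10}: r = {value:.4f}")
```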

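COMPUTATION OF THE CORRELATION COEFFICIENT
[Worked table for n = 7 observations, with a Covariance (XY) column and a Sum row; the values are not preserved in this transcript.]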
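Since the worked values are unavailable, the following sketch shows one standard way to compute r, as the covariance of X and Y divided by the product of their standard deviations. The seven age and BP pairs are hypothetical stand-ins, not the data from the slide.

```python
import math

# Hypothetical data for n = 7 patients (not the values from the slide).
ages = [25, 30, 35, 40, 45, 50, 55]
bps = [120, 125, 135, 140, 150, 152, 160]

n = len(ages)
mean_age = sum(ages) / n
mean_bp = sum(bps) / n

# Sample covariance: average product of deviations, with divisor n - 1.
covariance = sum((x - mean_age) * (y - mean_bp)
                 for x, y in zip(ages, bps)) / (n - 1)

# Sample standard deviations, using the same divisor.
sd_age = math.sqrt(sum((x - mean_age) ** 2 for x in ages) / (n - 1))
sd_bp = math.sqrt(sum((y - mean_bp) ** 2 for y in bps) / (n - 1))

r = covariance / (sd_age * sd_bp)
print(f"covariance = {covariance:.2f}, r = {r:.3f}")
```

The divisor (n - 1 or n) cancels in the ratio, so either convention gives the same r as long as it is used consistently.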

UNIVARIATE REGRESSION
Regression: a method of describing the relationship between two variables.
Use: to predict the value of one variable given the other.

SAMPLE DATA SET
Columns: Patient No., Age (X), Sys BP (Y). [Data rows are not preserved in this transcript.]
BP = response (dependent) variable; Age = predictor (independent) variable.

REGRESSION MODEL
We can perform a "regression of BP on age" to derive a straight line that gives an estimated value of BP for any given age. The general equation of a linear regression line is
Y = a + bX + e
where a = intercept, b = regression coefficient, and e = statistical error.

CALCULATIONS
a and b are estimated from the observed values of Age (X) and BP (Y) by the least squares method.
b gives the change in Y for a unit change in X.
a is the value of Y when X = 0, which may not always be meaningful.
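As a minimal sketch of the least squares calculation, the closed-form estimates are b = Sxy / Sxx and a = mean(Y) - b * mean(X). The data and the function name `least_squares_line` are hypothetical, chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical ages (years) and systolic BP (mm Hg); not the slide's data.
ages = [25, 30, 35, 40, 45, 50, 55]
bps = [120, 125, 135, 140, 150, 152, 160]

a, b = least_squares_line(ages, bps)
print(f"Estimated line: BP = {a:.2f} + {b:.2f} x Age")
```

For these made-up values the slope is 1.35 mm Hg per year. The intercept is the fitted BP at age 0, far outside the observed ages, which illustrates why a may not be meaningful.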

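TEST OF SIGNIFICANCE FOR b
Null hypothesis: the population regression coefficient is zero (no linear relationship between X and Y).
Test statistic: t = b / SE(b)   (1)
where SE(b) is the standard error of b. Under the null hypothesis, the value given by (1) follows a t-distribution with (n - 2) df.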
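The test can be sketched as follows, assuming the usual formula SE(b) = sqrt(MSE / Sxx), where MSE is the residual sum of squares divided by (n - 2). The data are hypothetical, and the comparison uses the tabulated two-sided 5% critical value of t for 5 df (2.571) rather than an exact P-value.

```python
import math

# Hypothetical ages (years) and systolic BP (mm Hg); not the slide's data.
ages = [25, 30, 35, 40, 45, 50, 55]
bps = [120, 125, 135, 140, 150, 152, 160]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(bps) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, bps))
sxx = sum((x - mean_x) ** 2 for x in ages)

b = sxy / sxx
a = mean_y - b * mean_x

# Residual sum of squares around the fitted line, and its mean square.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, bps))
df = n - 2
mse = sse / df

se_b = math.sqrt(mse / sxx)
t_stat = b / se_b

T_CRITICAL_5DF = 2.571  # two-sided 5% point of t with 5 df, from standard tables
print(f"b = {b:.3f}, SE(b) = {se_b:.4f}, t = {t_stat:.2f}, df = {df}")
print("Reject H0 at the 5% level" if abs(t_stat) > T_CRITICAL_5DF
      else "Do not reject H0 at the 5% level")
```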

ASSUMPTIONS
1. The relation between the two variables should be linear.
2. The residuals should follow a Normal distribution with zero mean and constant variance.

PRECAUTIONS
1. Adequate sample size should be ensured.
2. Predictions should be made within the range of the observed values. No extrapolation should be attempted.
3. The equation Y = a + bX should not be used to predict X for a given Y.
4. Model adequacy should be verified.
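Precaution 3 can be illustrated with a short sketch. Regressing Y on X and regressing X on Y give two different lines unless the points are perfectly correlated, so inverting Y = a + bX is not the same as fitting X on Y. The data below are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data with visible scatter around the trend.
xs = [1, 2, 3, 4, 5]
ys = [1, 6, 5, 10, 8]

a_yx, b_yx = fit_line(xs, ys)   # regression of Y on X
a_xy, b_xy = fit_line(ys, xs)   # regression of X on Y

y_given = 10
x_by_inverting = (y_given - a_yx) / b_yx   # misuse of the Y-on-X equation
x_by_regression = a_xy + b_xy * y_given    # proper X-on-Y prediction

print(f"Inverting Y = a + bX gives X = {x_by_inverting:.2f}")
print(f"Regressing X on Y gives   X = {x_by_regression:.2f}")
# The product of the two slopes equals r squared, which is 1 only for a perfect fit.
print(f"b_yx * b_xy = {b_yx * b_xy:.3f}")
```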

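RESULTS OF REGRESSION ANALYSIS
Table columns: Ind. variable, Reg. coeff., SE, t, P-value; rows: Age, Constant. [Most numeric entries and the fitted equation for Systolic BP in terms of Age are not preserved in this transcript.]
R² = 93.99% ≈ 94%
95% CI for b = b ± 1.96 SE(b) = 1.08 ± 1.96 × 0.08 = (0.92, 1.24)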
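The confidence interval arithmetic can be reproduced directly from the two values that survive in the slide, b = 1.08 and SE(b) = 0.08. The multiplier 1.96 is the Normal approximation used in the slide; with a small sample, a t-distribution critical value with (n - 2) df would give a somewhat wider interval.

```python
b = 1.08      # regression coefficient for Age, as given in the slide
se_b = 0.08   # standard error of b, as given in the slide
z = 1.96      # Normal multiplier for a 95% interval, as used in the slide

margin = z * se_b
lower = b - margin
upper = b + margin
print(f"95% CI for b: ({lower:.2f}, {upper:.2f})")
```

Because the interval excludes zero, it agrees with a significant test of b at the 5% level.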

INTERPRETATIONS
1. An increase of one year in age is associated with an average increase of 1.08 mm Hg in systolic BP.
2. When age = 0, the predicted BP equals the constant term, which is absurd.
3. The predicted BP of a 50-year-old individual is a + 1.08 × 50 ≈ 154 mm Hg.
4. 94% of the variation in BP is explained by age alone.
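The "variation explained" figure is the coefficient of determination, R² = 1 - SSE / SST. The sketch below computes it for hypothetical data (not the slide's) and checks that, with a single predictor, it equals the square of the correlation coefficient.

```python
import math

# Hypothetical ages (years) and systolic BP (mm Hg); not the slide's data.
ages = [25, 30, 35, 40, 45, 50, 55]
bps = [120, 125, 135, 140, 150, 152, 160]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(bps) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, bps))
sxx = sum((x - mean_x) ** 2 for x in ages)

b = sxy / sxx
a = mean_y - b * mean_x

# Total variation in Y, and the part left unexplained by the fitted line.
sst = sum((y - mean_y) ** 2 for y in bps)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, bps))

r_squared = 1 - sse / sst
r = sxy / math.sqrt(sxx * sst)
print(f"R squared = {r_squared:.4f}, r squared = {r ** 2:.4f}")
```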

MULTIPLE LINEAR REGRESSION
The response variable is expressed as a linear combination of several predictor variables, e.g.
Y = a + b1 (ht.) + b2 (wt.)
where b1 and b2 are the regression coefficients for ht. and wt. They indicate the increase in Y for an increase of 1 cm in ht. and 1 kg in wt., respectively, with the other predictor held constant.
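A minimal sketch of fitting two predictors by least squares, solving the normal equations (X'X) beta = X'Y with plain Gaussian elimination. The heights, weights, and response values are hypothetical; the response is generated from known coefficients so that the fit can be checked. In practice a statistics package would normally be used instead of hand-written algebra.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution

def fit_multiple_regression(predictor_rows, responses):
    """Return [a, b1, b2, ...] minimising the residual sum of squares."""
    design = [[1.0] + list(row) for row in predictor_rows]  # intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, responses)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical heights (cm) and weights (kg).
heights = [150, 160, 165, 170, 175, 180]
weights = [50, 65, 58, 72, 68, 85]
# Hypothetical response built from known coefficients: a = 10, b1 = 0.5, b2 = 0.3.
responses = [10 + 0.5 * h + 0.3 * w for h, w in zip(heights, weights)]

a, b1, b2 = fit_multiple_regression(list(zip(heights, weights)), responses)
print(f"a = {a:.3f}, b1 (per cm) = {b1:.3f}, b2 (per kg) = {b2:.3f}")
```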

LOGISTIC REGRESSION
Response variable: presence or absence of some condition.
We predict a transformation of the response variable instead of the actual value of the variable.
Data: Hypertension, Smoking (X1), Obesity (X2) and Snoring (X3). Which of the factors are predictors of hypertension?
Logit(p) = b0 + b1 X1 + b2 X2 + b3 X3 [the fitted coefficient values are not preserved in this transcript]
The probability can be estimated for any combination of the three variables. We can also compare the predicted probability for different groups, e.g., smokers and non-smokers.
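The step from a logit to a probability can be sketched as follows. Logit(p) is ln(p / (1 - p)), so p = 1 / (1 + exp(-logit)). The coefficients below are hypothetical placeholders chosen for illustration; they are not the fitted values from the slide.

```python
import math

# Hypothetical coefficients (NOT the fitted values from the slide).
B0 = -2.0          # constant
B_SMOKING = 0.4    # X1: 1 = smoker, 0 = non-smoker
B_OBESITY = 0.7    # X2: 1 = obese, 0 = not obese
B_SNORING = 0.9    # X3: 1 = snorer, 0 = non-snorer

def hypertension_probability(smoking, obesity, snoring):
    """Convert the linear predictor (the logit) into a probability."""
    logit = B0 + B_SMOKING * smoking + B_OBESITY * obesity + B_SNORING * snoring
    return 1.0 / (1.0 + math.exp(-logit))

# Compare two groups that differ only in smoking status.
p_smoker = hypertension_probability(smoking=1, obesity=1, snoring=1)
p_non_smoker = hypertension_probability(smoking=0, obesity=1, snoring=1)
print(f"Obese snorer, smoker:     p = {p_smoker:.3f}")
print(f"Obese snorer, non-smoker: p = {p_non_smoker:.3f}")
```

With 0/1 predictors, exp(coefficient) is the odds ratio for that factor, which is a common way to report which factors predict the condition.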