Go to Table of Content Correlation
Mr. V.K. Malhotra, the marketing manager of SP Pickles Pvt. Ltd., was wondering why sales of the company's pickles had declined over the last two years. He called a meeting of his team to discuss the possible reasons. The members suggested that it might be worthwhile to list the variables that influence pickle sales. They listed the average price of the company's pickles, the competitors' average price, consumers' income, and the amount spent on advertising. Having done so, they wondered what to do next. How can they determine which variables have an important influence on sales? What is the relative contribution of each variable in explaining sales, and how can they manipulate these variables to achieve the desired level of sales?
Purpose of Correlation: Correlation determines whether the values of one variable are related to the values of another.
Examples of Approximate r Values: [Figure: five scatter plots of y against x, illustrating r = -1, r = -.6, r = 0, r = +.3, and r = +1.]
Scatter Plot Examples: [Figure: scatter plots of y against x, contrasting linear relationships with curvilinear relationships.]
Scatter Plot Examples (continued): [Figure: scatter plots of y against x, contrasting strong relationships with weak relationships.]
Correlation Coefficient: The correlation coefficient computed from the sample data measures the strength and direction of the linear relationship between two variables. It is denoted by r and ranges from -1 to +1.
Positive and Negative Correlations: A positive relationship exists when both variables increase or decrease together (for example, weight and height). A negative relationship exists when one variable increases as the other decreases (for example, strength and age).
Range of correlation coefficient: For an exact positive linear relationship, the value of r is +1. For a strong positive linear relationship, the value of r is close to +1.
Range of correlation coefficient: For an exact negative linear relationship, the value of r is -1. For a strong negative linear relationship, the value of r is close to -1.
Range of correlation coefficient: For a weak relationship, the value of r is close to 0.
Range of correlation coefficient: For a nonlinear relationship, the value of r can be close to 0 even when the two variables are strongly related, because r measures only the linear part of the relationship.
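The limiting cases described on these slides can be checked numerically. The following is a minimal Python sketch; the helper `pearson_r` and the small data sets are illustrative assumptions, not data from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired samples (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, chosen only to illustrate the limiting cases.
x = [-2, -1, 0, 1, 2]
exact_positive = pearson_r(x, [2 * v + 1 for v in x])   # exact increasing line
exact_negative = pearson_r(x, [-3 * v + 4 for v in x])  # exact decreasing line
curved = pearson_r(x, [v ** 2 for v in x])              # exact but nonlinear (parabola)

print(exact_positive, exact_negative, curved)
```

The parabola case shows that a value of r near 0 rules out a linear relationship only; y is completely determined by x here, yet r is 0.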
Formula for correlation coefficient: The formula to compute a correlation coefficient is
\[ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \]
where n is the number of data pairs, x is the independent variable, and y is the dependent variable.
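As a rough illustration, the computational formula can be written as a short Python function. The function name `correlation_from_pairs` and the four data pairs are hypothetical and serve only to show the calculation.

```python
import math

def correlation_from_pairs(xs, ys):
    """Correlation coefficient r from the sums n, Σx, Σy, Σxy, Σx² and Σy²."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Hypothetical data pairs, not taken from the slides.
example_r = correlation_from_pairs([1, 2, 3, 4], [2, 4, 5, 9])
print(round(example_r, 3))
```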
Example for correlation coefficient: Let's do an example. Using the data on age and blood pressure, let's calculate Σx, Σy, Σxy, Σx² and Σy².
Student | Age (x) | Blood Pressure (y) | Age × BP (xy) | Age² (x²) | BP² (y²)
A | | | | |
B | | | | |
C | | | | |
D | | | | |
E | | | | |
F | | | | |
Sum | 345 | 819 | 47634 | 20399 | 112443
Example for correlation coefficient: Substitute in the formula and solve for r:
\[ r = \frac{(6)(47634) - (345)(819)}{\sqrt{\left[(6)(20399) - (345)^2\right]\left[(6)(112443) - (819)^2\right]}} = \frac{3249}{\sqrt{(3369)(3897)}} \approx 0.897 \]
The correlation coefficient suggests a strong positive relationship between age and blood pressure.
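The substitution can be reproduced in a few lines of Python. This sketch uses only the sums quoted on the slide (n = 6, Σx = 345, Σy = 819, Σxy = 47634, Σx² = 20399, Σy² = 112443); the variable names are illustrative.

```python
import math

# Sums taken from the age and blood pressure example.
n = 6
sum_x, sum_y = 345, 819
sum_xy, sum_x2, sum_y2 = 47634, 20399, 112443

numerator = n * sum_xy - sum_x * sum_y        # 285804 - 282555 = 3249
term_x = n * sum_x2 - sum_x ** 2              # 122394 - 119025 = 3369
term_y = n * sum_y2 - sum_y ** 2              # 674658 - 670761 = 3897
r = numerator / math.sqrt(term_x * term_y)

print(round(r, 3))  # prints 0.897
```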
Interpretation: The correlation is approximately 0.9. There is a strong positive relationship between age and blood pressure.
Test of Correlation:
Null hypothesis: the correlation is zero.
Test statistic: \[ t = r\sqrt{\frac{n-2}{1-r^2}} \]
Under the null hypothesis, the statistic follows a Student t distribution with n - 2 degrees of freedom.
Excel does not calculate this statistic, but you can calculate it manually.
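Since the statistic has to be calculated by hand, a small helper may be convenient. The sketch below is one possible way to write it in Python; the function name `correlation_t_statistic` and the example values r = 0.5 and n = 27 are hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: correlation is zero, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("at least three data pairs are needed")
    if abs(r) >= 1:
        raise ValueError("the statistic is undefined when |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical values: r = 0.5 from n = 27 pairs, so 25 degrees of freedom.
t_example = correlation_t_statistic(0.5, 27)
print(round(t_example, 3))
```

The result is then compared with the critical value from a Student t table for n - 2 degrees of freedom.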
Calculation Example:
Tree Height (y) | Trunk Diameter (x) | xy | y² | x²
Σ = 321 | Σ = 73 | Σ = 3142 | Σ = 14111 | Σ = 713
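The sums for this example are enough to compute r. The sketch below assumes n = 8 data pairs, which is consistent with the degrees of freedom (8 - 2 = 6) stated for this example; the variable names are illustrative.

```python
import math

# Sums from the tree height (y) and trunk diameter (x) example.
n = 8  # assumed from the stated degrees of freedom, 8 - 2 = 6
sum_y, sum_x = 321, 73
sum_xy, sum_y2, sum_x2 = 3142, 14111, 713

numerator = n * sum_xy - sum_x * sum_y        # 25136 - 23433 = 1703
term_x = n * sum_x2 - sum_x ** 2              # 5704 - 5329 = 375
term_y = n * sum_y2 - sum_y ** 2              # 112888 - 103041 = 9847
r_tree = numerator / math.sqrt(term_x * term_y)

print(round(r_tree, 3))  # prints 0.886
```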
Excel Output: Excel correlation output, obtained through Tools / Data Analysis / Correlation… [Table: correlation between Tree Height and Trunk Diameter.]
Example: Produce Stores
Is there evidence of a linear relationship between tree height and trunk diameter at the .05 level of significance?
H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (correlation exists)
α = .05, df = 8 - 2 = 6
Example: Test Solution
[Figure: Student t distribution with d.f. = 8 - 2 = 6, showing "Reject H0" regions of area α/2 = .025 in each tail, beyond -t(α/2) and +t(α/2), and a "Do not reject H0" region around 0.]
Decision: Reject H0.
Conclusion: There is evidence of a linear relationship at the 5% level of significance.
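The decision can be checked with a short calculation. This is a sketch under stated assumptions: r is recomputed from the sums of the tree example with n = 8, and the two-tailed critical value 2.4469 for 6 degrees of freedom at α = .05 is taken from a standard Student t table rather than from the slides.

```python
import math

# Sums from the tree height (y) and trunk diameter (x) example, n = 8 pairs.
n = 8
sum_y, sum_x = 321, 73
sum_xy, sum_y2, sum_x2 = 3142, 14111, 713

r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

# Assumption: critical value from a standard t table (df = 6, alpha/2 = .025).
t_critical = 2.4469

reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_h0)
```

Because the computed statistic lies in the upper rejection region, the calculation agrees with the decision to reject H0.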
Introduction to Regression Analysis
Regression analysis is used to:
– Predict the value of a dependent variable based on the value of at least one independent variable
– Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to explain
Independent variable: the variable used to explain the dependent variable
Simple Linear Regression Model
– Only one independent variable, x
– The relationship between x and y is described by a linear function
– Changes in y are assumed to be caused by changes in x
Linear Regression: [Figure: the line Y = a + bx plotted with y against x, with intercept a and slope b, and a point x_i marked on the x axis.]
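The slide gives the line Y = a + bx but not how a and b are obtained. The sketch below assumes ordinary least squares, which is the usual choice for simple linear regression; the function names `fit_line` and `predict` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x (assumed method)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

def predict(a, b, x):
    """Predicted value of the dependent variable for a given x."""
    return a + b * x

# Made-up data lying exactly on y = 1 + 2x, used only as an illustration.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b, predict(a, b, 5))
```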