Video Conference 1, AS 2013/2012: Chapter 10 – Correlation and Regression. 15 December 2013, 10 am – 11 am. Puan Hasmawati Binti Hassan, 04-6532285.

1 Video Conference 1, AS 2013/2012: Chapter 10 – Correlation and Regression. 15 December 2013, 10 am – 11 am. Puan Hasmawati Binti Hassan, hasma@usm.my, 04-6532285

2 Chapter 10 Overview: Introduction; 10-1 Scatter Plots and Correlation; 10-2 Regression; 10-3 Coefficient of Determination and Standard Error of the Estimate; 10-4 Multiple Regression (Optional)

3 Objectives (JIM 212): After going through this lesson, you should be able to: • Draw a scatter plot for a set of ordered pairs • Compute the correlation coefficient, r • Test the hypothesis H0: ρ = 0 (test the significance of the correlation coefficient)

4 Objectives: 1. Draw a scatter plot for a set of ordered pairs. 2. Compute the correlation coefficient. 3. Test the hypothesis H0: ρ = 0. 4. Compute the equation of the regression line. 5. Compute the standard error of the estimate. 6. Find a prediction interval. 7. Be familiar with the concept of multiple regression - determining whether a relationship between two or more numerical or quantitative variables exists.

5 Terminology: 1. Correlation 2. Independent variable 3. Dependent variable 4. Relationship 5. Simple relationship 6. Multiple relationship 7. Positive relationship 8. Negative relationship 9. Linear relationship 10. Correlation coefficient 11. Prediction

6 Introduction: In addition to hypothesis testing and confidence intervals, inferential statistics involves determining whether a relationship between two or more numerical or quantitative variables exists.

7 Introduction (cont…): Correlation is a statistical method used to determine whether a linear relationship between variables exists.

8 Introduction (cont…): The purpose of this chapter is to answer these questions statistically: 1. Are two or more variables related? 2. If so, what is the strength of the relationship? 3. What type of relationship exists? 4. What kind of predictions can be made from the relationship?

9 Introduction (cont…): 1. Are two or more variables related? 2. If so, what is the strength of the relationship? To answer these two questions, statisticians use the correlation coefficient, a numerical measure used to determine whether two or more variables are related and to determine the strength of the relationship between or among the variables.

10 Introduction (cont…): 3. What type of relationship exists? There are two types of relationships: simple and multiple. In a simple relationship, there are two variables: an independent variable (predictor variable) and a dependent variable (response variable). In a multiple relationship, two or more independent variables are used to predict one dependent variable.

11 Introduction (cont…): 4. What kind of predictions can be made from the relationship? Predictions are made daily and in all areas. Examples include weather forecasting, stock market analyses, sales predictions, crop predictions, gasoline price predictions, and sports predictions. Some predictions are more accurate than others, depending on the strength of the relationship: the stronger the relationship between the variables, the more accurate the prediction.

12 Correlation & Regression: Both are statistical methods. Correlation is used to determine whether a relationship between variables exists. Regression is used to describe the nature of the relationship between variables (+ or -, linear or nonlinear).

13 The purpose of this chapter is to answer these questions statistically: 1. Are two or more variables related? 2. If so, what is the strength of the relationship? (Questions 1 and 2: correlation coefficient.) 3. What type of relationship exists? (Simple and multiple.) 4. What kind of predictions can be made from the relationship? (All areas and daily.)

14 Scatter Plots: A graph of ordered pairs (x, y) of numbers consisting of the independent variable x and the dependent variable y. Independent variable? Dependent variable?

15 Q1(i) Forest Fires and Acres Burned a) Page 549 Ex. 10 – 1 No. 14. Number of fires vs. number of acres burned

16 Correlation: Correlation is a statistical method used to determine whether a linear relationship between variables exists.

17 Correlation (cont…): The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables. There are several types of correlation coefficients. The one explained in this section is called the Pearson product moment correlation coefficient (PPMC). The symbol for the sample correlation coefficient is r. The symbol for the population correlation coefficient is ρ.

18 Correlation (cont…): The range of the correlation coefficient is from −1 to +1. If there is a strong positive linear relationship between the variables, the value of r will be close to +1. If there is a strong negative linear relationship between the variables, the value of r will be close to −1.

19 Correlation (cont…)

20 Correlation Coefficient: A numerical measure used to determine whether two or more variables are linearly related, and to determine the strength of the relationship between or among the variables.

21 Correlation Coefficient (cont…): • Measures the strength (strong, weak) and direction (+, −) of a linear relationship between two variables • r: sample correlation coefficient • ρ: population correlation coefficient • Range: −1 ≤ ρ ≤ 1 **Look at page 540 Figure 10-6

22 Formula for Correlation Coefficient: One of the formulas for r is $$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$ where n is the number of data pairs.
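
A sums-based computational form of r can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked solution: the function name `pearson_r` is our own, and the data are made-up values rather than the exercise data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r from paired data.

    Uses r = [n*Sxy - Sx*Sy] / sqrt([n*Sxx - Sx^2] * [n*Syy - Sy^2]),
    where n is the number of data pairs.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```

A value of about 0.77 would suggest a fairly strong positive linear relationship in this small made-up sample; whether it is significant is a separate question.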

23 1(i) b) Page 549 Ex. 10 – 1 No. 14

24 The Significance of the Correlation Coefficient: Use a hypothesis-testing procedure to make the decision. There are 3 ways: 1. Traditional method 2. P-value method 3. Using Table I in Appendix C

25 Hypothesis Testing: In hypothesis testing, one of the following is true: H0: ρ = 0. This null hypothesis means that there is no correlation between the x and y variables in the population. H1: ρ ≠ 0. This alternative hypothesis means that there is a significant correlation between the variables in the population.
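
For the traditional method, the test value is commonly computed as t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The sketch below assumes that form; the function name `t_test_value` and the sample values r = 0.7746, n = 5 are hypothetical, and 3.182 is the two-tailed t-table critical value for α = 0.05 with 3 degrees of freedom.

```python
from math import sqrt

def t_test_value(r, n):
    """Test value for H0: rho = 0, using n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt((n - 2) / (1 - r ** 2))

r, n = 0.7746, 5      # hypothetical sample correlation and sample size
critical = 3.182      # two-tailed t-table value, alpha = 0.05, d.f. = 3
t = t_test_value(r, n)

# Reject H0 only if the test value falls in the critical region.
decision = "reject H0" if abs(t) > critical else "do not reject H0"
print(round(t, 3), decision)  # 2.121 do not reject H0
```

With so few data pairs, even a fairly large r can fail to be significant, which is why the test is done before any regression line is fitted.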

26 1(i) (c, d, e) Page 549 Ex. 10 – 1 No. 14 cont... Decision: Reject the null hypothesis, since the test value falls in the critical region. There is a significant linear relationship between the number of forest fires and the number of acres burned.

27 Now try using the other two procedures.

28 10-2 Regression: If the value of the correlation coefficient is significant, the next step is to determine the equation of the regression line, which is the data's line of best fit.

29 Regression: Best fit means that the sum of the squares of the vertical distances from each point to the line is at a minimum.
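
The least-squares idea can be sketched as follows, assuming the line is written y' = a + bx and using the usual closed-form sums for a and b. The function names and the data are hypothetical, chosen only to illustrate the "minimum sum of squares" property.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y' = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom   # intercept
    b = (n * sum_xy - sum_x * sum_y) / denom        # slope
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances from each point to y' = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = regression_line(x, y)
print(a, b)  # 2.2 0.6

# Nudging the fitted line in any direction should not reduce the sum of squares.
best = sum_squared_residuals(x, y, a, b)
print(best <= sum_squared_residuals(x, y, a + 0.1, b))  # True
print(best <= sum_squared_residuals(x, y, a, b - 0.1))  # True
```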

30 Regression Line

31 Q1(ii) Forest Fires and Acres Burned Page 559 Ex. 10 – 2 No. 14

32 (Q.1(ii)) Page 559 Ex. 10 – 2 No. 14 cont...

33 (Q.1(ii)) Page 559 Ex. 10 – 2 No. 14 cont... Number of fires vs. number of acres burned


35 Regression line: Q1(iii) Page 574 Ex. 10 – 3 No. 16 (Forest Fires and Acres Burned)

36 Q1(iv) Page 574 Ex. 10 – 3 No. 20 (Forest Fires and Acres Burned)

37 (Q1(iv)) Page 574 Ex. 10 – 3 No. 20 cont... (Forest Fires and Acres Burned)

38 Q2(i) State Debt and Per Capita Tax a) Page 549 Ex. 10 – 1 No. 16

39 2(i) b) Page 549 Ex. 10 – 1 No. 16

40 2(i) (c, d, e) Page 549 Ex. 10 – 1 No. 16 cont... Decision: Do not reject the null hypothesis. There is no significant linear relationship between per capita debt and tax.

41 Q2(ii) State Debt and Per Capita Tax Page 549 Ex. 10 – 2 No. 16: From the hypothesis test, the null hypothesis is not rejected (r is not significant). Therefore, there is no significant linear relationship between state debt and per capita tax, and no regression should be done.

42 Q2(ii) State Debt and Per Capita Tax Page 549 Ex. 10 – 2 No. 16 (cont...): No regression line, no prediction? When r is not significant,......?........ is the best predictor of y.

43 Standard Error of the Estimate: The standard error of the estimate, denoted by s_est, is the standard deviation of the observed y values about the predicted y' values. The formula for the standard error of the estimate is: $$s_{est} = \sqrt{\frac{\sum (y - y')^2}{n - 2}}$$
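
As a rough illustration, the standard error of the estimate can be computed as the square root of the sum of squared differences between observed y and predicted y', divided by n − 2. The sketch assumes a regression line y' = a + bx has already been fitted; the coefficients, data, and function name below are hypothetical.

```python
from math import sqrt

def standard_error_of_estimate(xs, ys, a, b):
    """Standard deviation of observed y values about predicted y' = a + b*x."""
    n = len(xs)
    if len(ys) != n or n < 3:
        raise ValueError("need at least 3 data pairs of equal length")
    sum_sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sum_sq / (n - 2))

# Hypothetical data and a line assumed to have been fitted to it
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6
print(round(standard_error_of_estimate(x, y, a, b), 4))  # 0.8944
```

The divisor is n − 2 rather than n − 1 because two quantities (a and b) are estimated from the data.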

44 Q2(iii) Page 574 Ex. 10 – 3 No. 18 (State Debt and Per Capita Tax): Since r is not significant, the standard error should not be calculated.

45 Prediction Interval
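
A minimal sketch of a prediction interval for y at a given x, assuming the common textbook form y' ± t·s_est·sqrt(1 + 1/n + n(x − x̄)²/(nΣx² − (Σx)²)) with n − 2 degrees of freedom. The slide itself does not show the formula, so treat this form, the function name, and the data as assumptions; the t value must be looked up in a t table.

```python
from math import sqrt

def prediction_interval(xs, ys, x0, t_crit):
    """Return (low, high) for y at x = x0, given a t-table critical value."""
    n = len(xs)
    if len(ys) != n or n < 3:
        raise ValueError("need at least 3 data pairs of equal length")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical")
    # Least-squares line y' = a + b*x
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    b = (n * sum_xy - sum_x * sum_y) / denom
    # Standard error of the estimate, with n - 2 in the divisor
    s_est = sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    x_bar = sum_x / n
    y_pred = a + b * x0
    margin = t_crit * s_est * sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)
    return y_pred - margin, y_pred + margin

# Hypothetical data; 3.182 is the two-tailed t value for alpha = 0.05, d.f. = 3
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
low, high = prediction_interval(x, y, x0=3, t_crit=3.182)
print(round(low, 2), round(high, 2))  # 0.88 7.12
```

The interval is narrowest at x = x̄ and widens as x moves away from the mean of the observed x values.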

46 Q2(iv) Page 574 Ex. 10 – 3 No. 22 (State Debt and Per Capita Tax): Since r is not significant, the prediction interval should not be calculated.

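47 Multiple Regression: In multiple regression, there are several independent variables and one dependent variable, and the equation is $$y' = a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$ where x_1, x_2, ..., x_k are the independent variables.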
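
One way to see how such an equation is obtained is to solve the least-squares normal equations directly. The sketch below is a plain-Python illustration under that assumption (statistical software would normally be used instead); the function name and the data are hypothetical, with y constructed as exactly 1 + 2·x1 + 3·x2 so the recovered coefficients are easy to check.

```python
def fit_multiple_regression(rows, ys):
    """Least-squares fit of y' = a + b1*x1 + ... + bk*xk.

    rows: list of [x1, ..., xk] observations. Returns [a, b1, ..., bk].
    """
    # A leading 1 in each row accounts for the intercept a.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    m = len(design[0])
    # Build the normal equations (X^T X) coef = X^T y as an augmented matrix.
    aug = []
    for i in range(m):
        row = [sum(d[i] * d[j] for d in design) for j in range(m)]
        row.append(sum(d[i] * y for d, y in zip(design, ys)))
        aug.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("independent variables are collinear; no unique fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

# Hypothetical data: y is exactly 1 + 2*x1 + 3*x2
rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
ys = [9, 8, 19, 18, 29]
coef = fit_multiple_regression(rows, ys)
print([round(c, 6) for c in coef])  # [1.0, 2.0, 3.0]
```

The collinearity check reflects why the independent variables should not be strongly correlated with each other: if they are, the normal equations have no unique solution.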

48 Assumptions for Multiple Regression: 1. Normality assumption - for any specific value of the independent variable, the values of the y variable are normally distributed. 2. Equal-variance assumption - the variances (or standard deviations) for the y variables are the same for each value of the independent variable. 3. Linearity assumption - there is a linear relationship between the dependent variable and the independent variables. 4. Nonmulticollinearity assumption - the independent variables are not correlated. 5. Independence assumption - the values for the y variables are independent.

49 Q3. Special Occasion Cakes Page 581 Ex. 10 – 4 No. 8


51 Thank You

