Linear Correlation and Regression


1 Linear Correlation and Regression
7-8 Statistics Linear Correlation and Regression

2 WHAT YOU WILL LEARN
• To determine whether a linear correlation exists
• To determine the line of best fit (linear regression)

3 Linear Correlation Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.

4 Linear Correlation The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables. If r is positive, then as one variable increases, the other tends to increase. If r is negative, then as one variable increases, the other tends to decrease. The value of r always lies between –1 and 1, inclusive.

5 Scatter Diagrams A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data). The independent variable, x, is generally a quantity that can be controlled. The dependent variable, y, is the other variable. The value of r is a measure of how far a set of points varies from a straight line. The greater the spread, the weaker the correlation and the closer the r value is to 0. The smaller the spread, the stronger the correlation and the closer the r value is to 1 or –1.

6 Correlation

7 Correlation

8 Linear Correlation Coefficient
The formula to calculate the correlation coefficient (r) is as follows:
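The formula itself is not preserved in this transcript. A standard computational form of the linear correlation coefficient, written with the same sums that the worked example tabulates (Σx, Σy, Σx², Σy², Σxy, with n the number of data pairs), is given here as a likely reconstruction rather than a copy of the slide:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
```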

9 Example: Words Per Minute versus Mistakes
Five applicants are applying for a job as a medical transcriptionist. The following table shows each applicant's results when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

Applicant   Words per Minute   Mistakes
Ellen       24                 8
George      67                 11
Phillip     53                 12
Kendra      41                 10
Nancy       34                 9

10 Solution
We will call the words typed per minute x, and the mistakes y. List the values of x and y and calculate the necessary sums.

WPM (x)    Mistakes (y)   x²             y²          xy
24         8              576            64          192
67         11             4489           121         737
53         12             2809           144         636
41         10             1681           100         410
34         9              1156           81          306
Σx = 219   Σy = 50        Σx² = 10,711   Σy² = 510   Σxy = 2,281

11 Solution (continued) The n in the formula represents the number of pieces of data. Here n = 5.

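12 Solution (continued)
Substituting the sums and n = 5 into the formula gives
r = (5(2,281) – (219)(50)) / (√(5(10,711) – 219²) · √(5(510) – 50²)) = 455 / (√5,594 · √50) ≈ 0.86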

13 Solution (continued) Since 0.86 is fairly close to 1, there is a fairly strong positive correlation. This result suggests that, among these applicants, those who typed more words per minute tended to make more mistakes.
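The arithmetic in this solution can be checked with a few lines of Python. This is a minimal sketch, assuming the standard sum-based formula for r; the function name `correlation_coefficient` and the variable names are illustrative and do not come from the slides.

```python
from math import sqrt

# Data from the example: words per minute (x) and mistakes (y).
wpm = [24, 67, 53, 41, 34]
mistakes = [8, 11, 12, 10, 9]

def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r, built from the five sums in the solution table."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

r = correlation_coefficient(wpm, mistakes)
print(round(r, 2))  # 0.86
```

The intermediate sums inside the function match the table (219, 50, 10,711, 510 and 2,281), so the result agrees with the hand calculation.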

14 Linear Regression Linear regression is the process of determining the linear relationship between two variables. The line of best fit (regression line or the least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points (on a scatter diagram) is a minimum.

15 The Line of Best Fit Equation:
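The equation on this slide is not preserved in the transcript. The usual least-squares form, consistent with the slope of about 0.081 and the y-intercept b used in the solution that follows, is sketched here as a reconstruction. The symbol m for the slope is an assumption; the slides only name the y-intercept b.

```latex
y = mx + b, \qquad
m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}, \qquad
b = \frac{\sum y - m\sum x}{n}
```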

16 Example Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart. Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.

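17 Solution
From the previous results, we know that n = 5, Σx = 219, Σy = 50, Σx² = 10,711, and Σxy = 2,281, so the slope is
m = (5(2,281) – (219)(50)) / (5(10,711) – 219²) = 455 / 5,594 ≈ 0.081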

18 Solution
Now we find the y-intercept, b. With slope m ≈ 0.081,
b = (Σy – m(Σx)) / n = (50 – 0.081(219)) / 5 ≈ 6.452
Therefore the line of best fit is y = 0.081x + 6.452.

19 Solution (continued)
To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

x    y
10   7.262
20   8.072
30   8.882
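The slope, intercept, and plotted points can be reproduced from the sums in the earlier solution table. The sketch below assumes the usual least-squares formulas and, as the slide values imply, rounds the slope to three places before computing the intercept; the names `m`, `b`, and `predict` are illustrative.

```python
# Sums from the earlier solution table (n = 5 applicants).
n = 5
sum_x, sum_y = 219, 50
sum_x2, sum_xy = 10_711, 2_281

# Slope of the least-squares line, rounded to three places as on the slides.
m = round((n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2), 3)

# y-intercept computed from the rounded slope, which reproduces the slide values.
b = round((sum_y - m * sum_x) / n, 3)

def predict(x):
    """Value of y on the line y = m*x + b, rounded to three places."""
    return round(m * x + b, 3)

points = [(x, predict(x)) for x in (10, 20, 30)]
print(m, b)    # 0.081 6.452
print(points)  # [(10, 7.262), (20, 8.072), (30, 8.882)]
```

If the slope is not rounded before the intercept is computed, the intercept comes out near 6.437 instead of 6.452, so small differences from software output are expected.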

20 Solution (continued)

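21 Or...
• Press STAT, then ENTER.
• Enter the x-values in L1.
• Enter the y-values in L2.
• Select CALC.
• Select 8:LinReg(a+bx) and press ENTER.
• Select L1,L2.
Note that in LinReg(a+bx) the y-intercept is a and the slope is b, whereas the solution above calls the y-intercept b.

Troubleshooting Tips
• L1 or L2 missing: press STAT, select 5:SetUpEditor, and press ENTER.
• r or r² missing: press 2nd 0 (CATALOG), select DiagnosticOn, and press ENTER twice.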

22 The following chart shows the pounds of coffee brewed per day in coffee shops of different sizes.

Size (in square yards)   Pounds of Coffee Brewed
30                       5
44                       9
57                       18
66                       23
106                      31

23 Here’s the scatter diagram for that data. Determine whether you believe that a correlation exists between the size of a coffee shop and the pounds of coffee brewed daily.
[Scatter diagram: pounds of coffee brewed versus size (in square yards)]
a. Yes
b. No
c. Can’t determine

25 Here’s the data again. Determine the correlation coefficient between the size of a coffee shop and the pounds of coffee brewed daily.

Size     30   44   57   66   106
Pounds   5    9    18   23   31

a. ≈ 0.037
b. ≈ –0.963
c. ≈ 0.963
d. ≈ 0.927

27 Here’s the data again. Determine the equation of the line of best fit between the size of a coffee shop and the pounds of coffee brewed daily.

Size     30   44   57   66   106
Pounds   5    9    18   23   31

a. b. c. d.

29 Use the equation y = 0.35x – 4.01 to predict the pounds of coffee brewed daily in a coffee shop that is 95 square yards. a b c d
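The coffee-shop questions can be checked numerically. The sketch below assumes the same sum-based formulas used for the typing example and rounds the slope to two places before computing the intercept, which is what reproduces the equation y = 0.35x – 4.01 quoted on this slide; all variable names are illustrative.

```python
from math import sqrt

size = [30, 44, 57, 66, 106]   # shop size in square yards (x)
pounds = [5, 9, 18, 23, 31]    # pounds of coffee brewed per day (y)

n = len(size)
sum_x, sum_y = sum(size), sum(pounds)
sum_x2 = sum(x * x for x in size)
sum_y2 = sum(y * y for y in pounds)
sum_xy = sum(x * y for x, y in zip(size, pounds))

# Shared building blocks of the correlation and regression formulas.
sxy = n * sum_xy - sum_x * sum_y
sxx = n * sum_x2 - sum_x ** 2
syy = n * sum_y2 - sum_y ** 2

r = sxy / (sqrt(sxx) * sqrt(syy))        # correlation coefficient
m = round(sxy / sxx, 2)                  # slope, rounded as in the slide's equation
b = round((sum_y - m * sum_x) / n, 2)    # intercept from the rounded slope

# Predicted pounds brewed daily for a 95-square-yard shop.
prediction = round(m * 95 + b, 2)
print(round(r, 3), m, b, prediction)  # 0.963 0.35 -4.01 29.24
```

Under these assumptions the correlation is about 0.963 and the predicted value for a 95-square-yard shop is about 29.24 pounds per day.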
