DEPARTMENT OF STATISTICS  What are they?  When should I use them?  How do Excel and GCs handle them?  Why should I be careful with the Nulake text?


1 DEPARTMENT OF STATISTICS r and R²
• What are they?
• When should I use them?
• How do Excel and GCs handle them?
• Why should I be careful with the Nulake text?
Matt Regan (et al), Department of Statistics, The University of Auckland

2 DEPARTMENT OF STATISTICS r – little r – what is it?
• r is the correlation coefficient between y and x
• r measures the strength of a linear relationship
• r is a multiple of the slope
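The claim that r is a multiple of the slope can be checked numerically. The sketch below is an illustration only: the data are made up, and the function name `slope_and_r` is hypothetical. It assumes the slope is that of the least-squares line of y on x, in which case the multiplier is s_x / s_y, the ratio of the sample standard deviations.

```python
from statistics import mean, stdev

def slope_and_r(x, y):
    """Return (slope, r) for the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    r = sxy / (sxx * syy) ** 0.5
    return slope, r

# Made-up illustrative data, not taken from the presentation.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

slope, r = slope_and_r(x, y)
# Rescaling the slope by s_x / s_y should reproduce r.
rescaled_slope = slope * stdev(x) / stdev(y)
print(f"slope = {slope:.4f}, r = {r:.4f}, slope * s_x / s_y = {rescaled_slope:.4f}")
```

Because s_x and s_y are both positive, r always has the same sign as the slope.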

3 DEPARTMENT OF STATISTICS r – when can it be used?
• Only use r if the scatter plot is linear
• Don’t use r if the scatter plot is non-linear!
[Scatter plot of y against x, r = 0.99]

4 DEPARTMENT OF STATISTICS r – what does it tell you?
• How close the points in the scatter plot come to lying on the line
[Two scatter plots of y against x: r = 0.99 and r = 0.57]

5 DEPARTMENT OF STATISTICS R² – big R² – what is it?
• R² is the coefficient of determination
• Measures how close the points in the scatter plot come to lying on the fitted line or curve
[Two scatter plots of y against x]

6 DEPARTMENT OF STATISTICS R² – big R² – when can it be used?
• When the scatter plot of y versus x is linear or non-linear
[Two scatter plots of y against x]

7 DEPARTMENT OF STATISTICS R² – what does it tell you?
[Figure: plot of y against x with two dotplots. Dotplot of the y’s: shows the variation in the y’s. Dotplot of the ŷ’s: shows the variation in the ŷ’s.]

8 DEPARTMENT OF STATISTICS R² – what does it tell you?
Variation in the ŷ’s: this amount of variation can be explained by the model.
We see some additional variation in the y’s. The excess is not explained by the model.
$$R^2 = \frac{\text{Variation in fitted values}}{\text{Variation in } y \text{ values}} = \frac{\text{Variation in } \hat{y}\text{'s}}{\text{Variation in } y\text{'s}}$$

9 DEPARTMENT OF STATISTICS R² – what does it tell you?
• When expressed as a percentage, R² is the percentage of the variation in Y that our regression model can explain
• R² near 100% → model fits well
• R² near 0% → model doesn’t fit well
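As a rough illustration of R² as "variation in fitted values over variation in y values", the sketch below fits a least-squares line to made-up data and computes that ratio. It assumes "variation" means the sum of squared deviations from the mean, which the slides do not state explicitly. The helper names `fit_line` and `variation` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

def variation(values):
    """Sum of squared deviations from the mean (assumed meaning of 'variation')."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# Made-up illustrative data, not taken from the presentation.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.1, 3.9, 6.2, 5.8, 8.4, 8.1, 10.9, 10.2]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]

# R^2 = variation in fitted values / variation in y values.
r_squared = variation(fitted) / variation(y)

# The variation the model does not explain is the sum of squared residuals.
unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

print(f"R^2 = {100 * r_squared:.1f}% of the variation in y is explained")
print(f"explained + unexplained = {variation(fitted) + unexplained:.4f}")
print(f"total variation in y    = {variation(y):.4f}")
```

For a least-squares fit with an intercept, the explained and unexplained parts add up to the total variation in y, which is why the ratio lies between 0% and 100%.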

10 DEPARTMENT OF STATISTICS R² – what does it tell you?
• 90% of the variation in Y is explained by our regression model.
[Scatter plot of y against x, R² = 90%]

11 DEPARTMENT OF STATISTICS R² – pearls of wisdom!
• R² and r² have the same value ONLY when using a linear model
• DON’T use R² to pick your model
• Use your eyes!
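The first pearl can be illustrated with a small sketch. It uses made-up data that follow a curve, and fits two models: a straight line in x, and a curve of the assumed form y = a + b·x² (fitted as a line in x²). All names here are hypothetical, and "variation" is again taken to mean the sum of squared deviations from the mean.

```python
from statistics import mean

def variation(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def covariation(u, v):
    """Sum of products of deviations from the two means."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))

def correlation(x, y):
    """Correlation coefficient r between x and y."""
    return covariation(x, y) / (variation(x) * variation(y)) ** 0.5

def r_squared_of_fit(u, y):
    """R^2 for the least-squares fit y = a + b*u."""
    b = covariation(u, y) / variation(u)
    a = mean(y) - b * mean(u)
    fitted = [a + b * ui for ui in u]
    return variation(fitted) / variation(y)

# Made-up data lying close to the curve y = x^2, not from the presentation.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [9.2, 3.9, 1.1, 0.1, 0.9, 4.2, 8.8]

r = correlation(x, y)
r2_linear = r_squared_of_fit(x, y)                       # straight-line model
r2_curve = r_squared_of_fit([xi ** 2 for xi in x], y)    # model y = a + b*x^2

print(f"r^2 between y and x       = {r ** 2:.4f}")
print(f"R^2 of straight-line model = {r2_linear:.4f}")
print(f"R^2 of curved model        = {r2_curve:.4f}")
```

For the straight-line model R² matches r², while for the curved model it does not. This sketch only illustrates the first bullet; it says nothing about which model to choose, which the slide leaves to looking at the plot.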

12 DEPARTMENT OF STATISTICS R² and Excel & Graphics Calculators

13 DEPARTMENT OF STATISTICS Year 13 Statistics and Modelling Workbook – Nulake

14 DEPARTMENT OF STATISTICS 5.0 Statistical Investigation: page 224 – 261

15 DEPARTMENT OF STATISTICS 5.2 The Regression Line – p227 This straight line is called a regression line and we are required to calculate its equation. This is best done with a graphics calculator or spreadsheet programme on a computer. Regression Lines by Inspection You should only attempt to estimate a regression line if you do not have the technology available.

16 DEPARTMENT OF STATISTICS 5.2 The Regression Line – p229 Non-linear Modelling Equations It is possible to use your calculator to find other regression equations other than a straight line. These non-linear models are also built into your calculator. If they result in a coefficient closer to 1 then they are more appropriate than the straight line regression equation.

17 DEPARTMENT OF STATISTICS 5.4 Non-linear Regression Models – p241 Non-linear Regression Models It is possible that the best model for some data is a non-linear regression line (e.g. a curve) such as an exponential or power function. All the technology aids (calculator and spreadsheets programmes) are able to model data with different models and by inspecting the correlation coefficient, the researcher should be able to determine the best model.

18 DEPARTMENT OF STATISTICS 5.4 Non-linear Regression Models – p241

19 DEPARTMENT OF STATISTICS 5.4 Non-linear Regression Models – p241

20 DEPARTMENT OF STATISTICS 5.4 Non-linear Regression Models – p242

21 DEPARTMENT OF STATISTICS Modelling: Drawing Graphs and adding a Trendline using a Spreadsheet – page 337

22 DEPARTMENT OF STATISTICS Modelling: Drawing Graphs and adding a Trendline using a Spreadsheet – page 337

23 DEPARTMENT OF STATISTICS Correlation – What can go wrong?

x1    y1       x2    y2       x3    y3       x4    y4
10    8.04     10    9.14     10    7.46      8    6.58
 8    6.95      8    8.14      8    6.77      8    5.76
13    7.58     13    8.74     13   12.74      8    7.71
 9    8.81      9    8.77      9    7.11      8    8.84
11    8.33     11    9.26     11    7.81      8    8.47
14    9.96     14    8.1      14    8.84      8    7.04
 6    7.24      6    6.13      6    6.08      8    5.25
 4    4.26      4    3.1       4    5.39     19   12.5
12   10.84     12    9.13     12    8.15      8    5.56
 7    4.82      7    7.26      7    6.42      8    7.91
 5    5.68      5    4.74      5    5.73      8    6.89

The same summary values apply to each of the four data sets:
N: 11
Mean of x’s: 9.0
Mean of y’s: 7.5
Equation of regression line: y = 3 + 0.5x
Correlation coefficient (r): 0.82
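The summary values on this slide can be recomputed from the four data sets. The sketch below does so with the standard library only; the function name `summarise` and the dictionary layout are hypothetical choices for illustration, and the data are typed in from the slide.

```python
from statistics import mean

# The four (x, y) data sets as listed on the slide.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
datasets = {
    "set 1": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "set 2": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "set 3": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "set 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
              [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarise(x, y):
    """Return n, means, least-squares line and correlation for one data set."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    return {
        "n": len(x),
        "mean_x": x_bar,
        "mean_y": y_bar,
        "intercept": y_bar - slope * x_bar,
        "slope": slope,
        "r": sxy / (sxx * syy) ** 0.5,
    }

summaries = {name: summarise(x, y) for name, (x, y) in datasets.items()}

for name, s in summaries.items():
    print(f"{name}: n={s['n']}, mean x={s['mean_x']:.1f}, mean y={s['mean_y']:.1f}, "
          f"line: y = {s['intercept']:.1f} + {s['slope']:.1f}x, r = {s['r']:.2f}")
```

All four sets give the same rounded summaries even though the individual values differ, which is why the numbers alone cannot show what the scatter plots look like.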

24 DEPARTMENT OF STATISTICS Correlation – What can go wrong?
[Four scatter plots]

