Linear Regression and Correlation: Topic 18

1 Linear Regression and Correlation Topic 18

2 Linear Regression
• Linear regression models the link between two factors, where one value depends on the other.
• E.g. driver's age and risk of accident.
• Gender and time spent shopping.
• Car price depends on the age of the car.
• Sales depend on marketing.

3 Crickets and Temperature
• Crickets make their chirping sounds by rapidly sliding one wing over the other.
• The faster they move their wings, the higher the chirping sound that is produced.

4 Crickets and Temperature

5 Analysing the data
• First graph the data using the XY (Scatter) option.

6 Analysing the data
• Then right-click one of the data points and select Add Trendline.

7 Analysing the data
• Select the Linear regression type.

8 Analysing the data
• Now right-click the trendline and select Format Trendline, then select Options, and finally select Display equation on chart.

9 Analysing the data
• Using the displayed equation, we can now predict the temperature.
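
Excel's linear trendline is an ordinary least-squares fit, so the same line can be computed by hand. The following is a minimal sketch of that calculation in Python; the function name `fit_line` and the data points are made up for illustration and are not the cricket measurements from the slides.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points only (hypothetical, NOT the data from the slides).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # prediction for a new x value of 5.0
print(f"y = {slope:.4f}x + {intercept:.4f}; prediction at x=5: {predicted:.2f}")
```

The slope and intercept play the same role as the two numbers in the equation Excel displays on the chart.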

10 Line of Best Fit
• You can see differences between the measured values and the calculated values. Why?

11 Mean Squared Error (MSE)
• The mean squared error, or MSE, of an estimator is the expected value of the square of the "error."
• The error is the amount by which the estimator differs from the quantity to be estimated.
• The difference occurs because of randomness, or because the estimator doesn't account for information that could produce a more accurate estimate.

12 Root Mean Square Error
• The root mean square error (RMSE) is a frequently used measure of the difference between values predicted by a model and the values actually observed from the thing being modelled or estimated.
• The lower the value of the RMSE, the better the fit of observed to calculated data.
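
For a set of observed and predicted values, the MSE is usually estimated as the average of the squared differences, and the RMSE is its square root. A minimal sketch, assuming that sample-average form; the function names and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def mean_squared_error(observed, predicted):
    """Average of the squared differences between predicted and observed values."""
    squared_errors = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


def root_mean_squared_error(observed, predicted):
    """Square root of the MSE, in the same units as the observations."""
    return math.sqrt(mean_squared_error(observed, predicted))


# Hypothetical values for illustration only.
observed = [20.0, 22.0, 25.0, 27.0]
predicted = [21.0, 21.0, 26.0, 25.0]

mse = mean_squared_error(observed, predicted)        # (1 + 1 + 1 + 4) / 4
rmse = root_mean_squared_error(observed, predicted)
print(f"MSE = {mse:.2f}, RMSE = {rmse:.4f}")
```

Because the RMSE is in the same units as the measured quantity, it is the figure normally quoted as the "plus or minus" error of a prediction.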

13 RMSE

14 Stating the Error
• For our crickets we could then say:
• Temperature Y = 1.8635X - 3.7532
• where X is the recorded beats per second of the cricket's wings.
• Accurate to ±2.07 °C
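
As a small worked example, the equation and error figure from this slide can be wrapped in a function. This is only a sketch: the function name, the constant names, and the reading of 16 beats per second are assumptions made for illustration, and treating the ±2.07 °C figure as a simple band around the estimate follows the slide's wording.

```python
SLOPE = 1.8635        # from the slide: Y = 1.8635X - 3.7532
INTERCEPT = -3.7532
ERROR_MARGIN = 2.07   # from the slide: accurate to +/- 2.07 degrees C


def predict_temperature(beats_per_second):
    """Estimate the temperature in degrees C from the wing beats per second."""
    return SLOPE * beats_per_second + INTERCEPT


beats = 16.0  # hypothetical reading, not taken from the slides
estimate = predict_temperature(beats)
low, high = estimate - ERROR_MARGIN, estimate + ERROR_MARGIN
print(f"Estimated temperature: {estimate:.2f} C (between {low:.2f} and {high:.2f} C)")
```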

15 Correlation Coefficient
• The correlation coefficient is a measure of how well trends in the predicted values follow trends in the actual values.
• It is a measure of how well the predicted values from a forecast model "fit" with the real-life data.

16 Correlation Coefficient
• The correlation coefficient is a number between -1 and +1.
• If there is no relationship between the predicted values and the actual values, the correlation coefficient is 0 or close to 0 (the predicted values are no better than random numbers).
• As the strength of the relationship between the predicted values and actual values increases, so does the magnitude of the correlation coefficient.
• A perfect fit gives a coefficient of +1.0 or -1.0. Thus the closer the correlation coefficient is to +1 or -1, the better.

17 A demonstration
• Correlation

18 Correlation
• Two main methods of calculating correlations are:
• Spearman's Rank Correlation Coefficient, and
• Pearson's or the Product-Moment Correlation Coefficient.

19 Spearman's Rank Correlation Coefficient
• In calculating this coefficient, we use the Greek letter 'rho' ($\rho$).
• The formula used to calculate this coefficient is: $\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$
• Here $d$ is the difference between the two ranks of each pair of values and $n$ is the number of pairs.
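
The rank-difference formula can be sketched in a few lines of Python. This version assumes there are no tied values (ties need averaged ranks, which this sketch does not handle); the helper names `ranks` and `spearman_rho` and the sample data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    # d is the difference between the x-rank and the y-rank of each pair.
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical data for illustration only.
xs = [10, 20, 30, 40, 50]
ys = [1, 3, 2, 5, 4]

rho = spearman_rho(xs, ys)  # sum of d^2 is 4, so rho = 1 - 24 / 120
print(f"rho = {rho:.2f}")
```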

20 Pearson's or Product-Moment Correlation Coefficient
• The Pearson Correlation Coefficient is denoted by the symbol r.
• Its formula is based on the standard deviations of the x-values and the y-values.
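
One standard way to write the Pearson coefficient is the covariance of x and y divided by the product of their standard deviations, which simplifies to $r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$. The sketch below computes that form; it is not necessarily the exact layout shown on the original slide, and the function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations (proportional to the covariance).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)  # 8 / sqrt(10 * 10)
print(f"r = {r:.2f}")
```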

21 Coefficient of Determination R Squared
• Shows the proportion of the variation in y that is explained by x.
• The version most common in statistics texts is based on an analysis of variance decomposition, $SST = SSR + SSE$, which gives $R^2 = SSR / SST = 1 - SSE / SST$.
• SST is the total sum of squares, SSR is the explained sum of squares, and SSE is the residual sum of squares.
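
The sums of squares named on this slide can be computed directly. The sketch below uses the form R squared = 1 - SSE / SST; the function name and the values are hypothetical, and the identity SST = SSR + SSE is only guaranteed when the predictions come from a least-squares line with an intercept.

```python
def r_squared(observed, predicted):
    """Coefficient of determination computed as 1 - SSE / SST."""
    mean_observed = sum(observed) / len(observed)
    # SST: total variation of the observations around their mean.
    sst = sum((y - mean_observed) ** 2 for y in observed)
    # SSE: variation left unexplained by the predictions (the residuals).
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - sse / sst


# Hypothetical values for illustration only.
observed = [20.0, 22.0, 25.0, 27.0]
predicted = [21.0, 21.0, 26.0, 25.0]

r2 = r_squared(observed, predicted)  # SST = 29, SSE = 7
print(f"R squared = {r2:.4f}")
```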

22 Coefficient of Determination R Squared
• Thankfully, Excel calculates this for you.

