“Teach A Level Maths” Vol. 2: A2 Core Modules

1 “Teach A Level Maths” Vol. 2: A2 Core Modules
Another Regression Line © Christine Crisp

2 Statistics 1 AQA EDEXCEL OCR
"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages"

3 So far, we have calculated the regression line for y on x.
The reason we had to keep using the phrase “y on x” is that there are two regression lines. If we want to estimate x for a given y, we use the x on y regression line. It may seem strange to have 2 regression lines depending on which quantity we want to estimate. Previously, in Pure Maths, if we wanted to find x for a given y we just turned the equation around, e.g. if we had an equation giving y in terms of x, we rearranged it to give x in terms of y. However, in Statistics we have data spread around a line and we want to estimate with as little uncertainty as possible.

4 For the height and foot length data that we used before, the x on y regression line is given by
The line is chosen so that the sum of the squares of the distances from the data points to the line, measured in the x direction, is as small as possible.
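As a sketch of what is being minimised, write the x on y line as $x = a' + b'y$ and the y on x line as $y = a + bx$ (the symbols $a, b, a', b'$ are notation assumed here for illustration). For data points $(x_i, y_i)$, the two lines minimise different sums of squares:

\[
\text{y on x: } \sum_i \bigl(y_i - (a + b x_i)\bigr)^2
\qquad\qquad
\text{x on y: } \sum_i \bigl(x_i - (a' + b' y_i)\bigr)^2
\]

The first sum measures distances in the y direction and the second measures distances in the x direction, which is why the two lines generally differ.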

5 y on x regression line: x on y regression line:
So, there are two regression lines: the y on x regression line and the x on y regression line.
[Figure: "Foot length and height of UK children", with axes Foot length (cm) and Height (cm), showing the y on x and x on y regression lines.]
The point of intersection of the lines is the mean point, $(\bar{x}, \bar{y})$. If the length of a child’s foot was 20 cm we would use the x on y regression line to estimate the child’s height.

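6 We can easily adapt the previous calculations in order to find the least squares regression line for x on y. For y on x we had $y = a + bx$, where $b = \dfrac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$. Swapping x and y gives $x = a' + b'y$, where $b' = \dfrac{S_{xy}}{S_{yy}}$ and $a' = \bar{x} - b'\bar{y}$.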
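The swap of x and y can be sketched in Python. This is a minimal illustration, assuming the usual definitions S_xx = Σ(x − x̄)², S_yy = Σ(y − ȳ)² and S_xy = Σ(x − x̄)(y − ȳ); the function name `regression_lines` and the sample data are hypothetical and are not taken from the presentation.

```python
from statistics import mean


def regression_lines(xs, ys):
    """Return ((a, b), (a2, b2)) for the lines y = a + b*x and x = a2 + b2*y."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # y on x: gradient uses S_xx
    b = s_xy / s_xx
    a = y_bar - b * x_bar

    # x on y: same calculation with the roles of x and y swapped (uses S_yy)
    b2 = s_xy / s_yy
    a2 = x_bar - b2 * y_bar
    return (a, b), (a2, b2)


# Hypothetical illustrative data (NOT the bean or foot-length data)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

(a, b), (a2, b2) = regression_lines(xs, ys)
print(f"y on x: y = {a:.3f} + {b:.3f}x")
print(f"x on y: x = {a2:.3f} + {b2:.3f}y")
```

For scattered data like this, rearranging the y on x line to make x the subject does not give the x on y line, which is the reason for calculating the two lines separately.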

7 e.g. The following data give the weights and lengths of a sample of beans:
Length (cm): 1·6 2·1 1·9 2·0 2·2 2·4 2·3 1·7
Weight (g): 0·8 1·0 0·9 1·1 1·2 1·4 0·7
Source: O.N.Bishop
(a) Taking the weight to be x and length as y, calculate both least squares regression lines.
(b) Use the appropriate line to estimate the weight of a bean of length 1·5 cm.
(c) Comment on your answer to (b).
Solution: (a) Using the calculator functions for the y on x regression line,

8 If your calculator doesn’t give the constants for the x on y line, then use the formula booklet as follows: Summary data:

11 The two regression lines look like this:
[Figure: "Weight and Length of beans", showing the y on x and x on y regression lines.]

12 (b) Use the appropriate line to estimate the weight of a bean of length 1·5 cm.
We are given y and want to find x, so we use the x on y regression line. (c) Comment on your answer to (b). The answer is unreliable as the length 1·5 cm lies outside the range of the data (1·6 cm to 2·4 cm).
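The reliability comment in (c) amounts to a range check, which could be sketched as follows. The lengths are the eight values from the table; the helper name `is_extrapolation` is hypothetical.

```python
# Bean lengths (cm) as listed in the data table
lengths_cm = [1.6, 2.1, 1.9, 2.0, 2.2, 2.4, 2.3, 1.7]


def is_extrapolation(value, observed):
    """True if value lies outside the range of the observed data."""
    return value < min(observed) or value > max(observed)


# A bean of length 1.5 cm is shorter than every bean in the sample,
# so any estimate of its weight from the regression line is an extrapolation.
print(is_extrapolation(1.5, lengths_cm))
```

An estimate for a length inside the observed range, such as 2·0 cm, would be an interpolation and so more trustworthy.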

13 SUMMARY There are 2 regression lines:
The y on x regression line is used to estimate y for a given x. The x on y regression line is used to estimate x for a given y. If the data have a high degree of scatter, the regression lines are further apart than for closely clustered data. For data lying entirely on a line, the 2 regression lines coincide. Both regression lines pass through the mean point, $(\bar{x}, \bar{y})$.
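The effect of scatter on the two lines can be checked numerically. The sketch below assumes the gradients S_xy/S_xx (y on x) and S_xy/S_yy (x on y); the function name `gradients` and both data sets are hypothetical. Plotted on the same axes, the x on y line has gradient 1/b2, so, since both lines pass through the mean point, they coincide exactly when b × b2 = 1.

```python
from statistics import mean


def gradients(xs, ys):
    """Return (b, b2): the gradient of y on x and the gradient of x on y."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / s_xx, s_xy / s_yy


# Hypothetical data: one set exactly on a line, one set with scatter
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
exact = [2 * x + 1 for x in xs]
scattered = [2.0, 4.0, 5.0, 4.0, 5.0]

b_exact, b2_exact = gradients(xs, exact)
b_scat, b2_scat = gradients(xs, scattered)

# Product of gradients: 1 means the two lines coincide
print(b_exact * b2_exact)
print(b_scat * b2_scat)
```

The further the product falls below 1, the wider the angle between the two regression lines.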

14 Exercise 1. The following summary data relate to the populations of woodland birds (x) and farmland birds (y) between 1970 and 2002 (33 years). The index for both was taken as 100 in 1970. Source: Social Trends (from Br. Trust for Ornithology and RSPB). Summary data: Find the equation of the x on y regression line.

15 Solution: The x on y regression line is given by
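When only summary data are available, the x on y line can be found from the totals. The sketch below assumes the standard identities S_xy = Σxy − (Σx)(Σy)/n and S_yy = Σy² − (Σy)²/n. The function name `x_on_y_from_summary` is hypothetical, and the totals passed to it are placeholder values for illustration only; they are not the bird data from the exercise.

```python
def x_on_y_from_summary(n, sum_x, sum_y, sum_xy, sum_y2):
    """Return (a2, b2) for the x on y line x = a2 + b2*y, from summary totals."""
    s_xy = sum_xy - sum_x * sum_y / n
    s_yy = sum_y2 - sum_y ** 2 / n
    b2 = s_xy / s_yy
    # The line passes through the mean point (sum_x / n, sum_y / n)
    a2 = sum_x / n - b2 * sum_y / n
    return a2, b2


# Placeholder totals (hypothetical, NOT the exercise's summary data)
a2, b2 = x_on_y_from_summary(n=5, sum_x=15.0, sum_y=20.0, sum_xy=66.0, sum_y2=86.0)
print(f"x = {a2:.3f} + {b2:.3f}y")
```

Note that Σx² is not needed for the x on y line; it is Σy² that appears in the gradient.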

16 The full data set together with the x on y regression line looks like this:
[Figure: plot of the full data set with the x on y regression line; the points for 1970 and 2002 are labelled.]
What do you notice about the data? ANS: Low levels of farmland species occur with low levels of woodland species. (This doesn’t mean that one causes the other. They could both, for example, be linked to availability of food.) Only 2 dates are shown but they suggest that both types have declined.

