1
“Teach A Level Maths” Statistics 1
Linear Scaling of Regression Data © Christine Crisp
2
Statistics 1 AQA EDEXCEL OCR
"Certain images and/or photos on this presentation are the copyrighted property of JupiterImages and are being used with permission under license. These images and/or photos may not be copied or downloaded without permission from JupiterImages"
3
We may want to change the units that have been used when collecting data.
For example, we may want kilometres instead of miles or kilograms instead of pounds. Sometimes we may simply want to reduce the size of the numbers in data items. In both these cases we talk about scaling or coding the data. When dealing with regression lines, we can alter a regression line to different units without converting the original data.
4
A researcher has the following data giving the average daily maximum temperature for each month in 1980 and the corresponding figures for the sales of milk in a supermarket.

Month:          J   F   M   A   M   J   J   A   S   O   N   D
Temp (°F), F:   40  38  46  55  60  70  65  69  64  58  47  49

Sales (thousands of pints), y: 8 9 5 6 3 4 2 7 10 (row incomplete: only 9 of the 12 monthly values appear)

The y on F regression line is [equation missing]. The correlation coefficient is -0·89.
Recent data are measured in degrees Celsius and thousands of litres, and the researcher wants to compare the sets of results by converting the older ones.
5
The conversion from degrees Fahrenheit to degrees Celsius is C = 5(F - 32)/9.
To convert from pints to litres, we must divide by 1·76 so, if Y is the new variable, Y = y/1·76. As the conversions are both linear, instead of converting all the data we can simply substitute into the regression line. We first need to rearrange both conversion equations: F = 9C/5 + 32 and y = 1·76Y.
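The substitution step can be sketched in code. This is only an illustration: the coefficients a and b below are hypothetical stand-ins (the regression line from the slide is not reproduced here), and the name convert_line is invented for the sketch.

```python
# Sketch: convert a regression line y = a + b*F (pints, degrees F)
# into Y = A + B*C (litres, degrees C) by substitution.

PINTS_PER_LITRE = 1.76  # 1 litre is about 1.76 pints


def convert_line(a, b):
    """Return (A, B) so that Y = A + B*C describes the same line as y = a + b*F.

    Uses the rearranged conversions F = 9C/5 + 32 and y = 1.76Y:
        1.76Y = a + b(9C/5 + 32)
    """
    new_intercept = (a + 32 * b) / PINTS_PER_LITRE
    new_slope = 9 * b / (5 * PINTS_PER_LITRE)
    return new_intercept, new_slope


# Hypothetical coefficients, NOT the values from the slides.
a, b = 15.0, -0.15
A, B = convert_line(a, b)

# Consistency check at one temperature: 50 degrees F is 10 degrees C.
y_pints = a + b * 50
Y_litres = A + B * 10
```

Both routes describe the same prediction, so Y_litres should equal y_pints divided by 1·76.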
6
Substituting F = 9C/5 + 32 and y = 1·76Y into the y on F regression line and simplifying gives the regression line of Y on C.
As we have only a small amount of data, we can check the effect of the scaling by converting the data and drawing both regression lines.
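A minimal sketch of this check, assuming a small hypothetical data set (not the researcher's figures) and a hand-written least-squares helper called fit_line: the line obtained by substitution should match the line fitted to the converted data.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x, returned as (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data for illustration only.
temps_f = [40, 50, 60, 70]   # degrees Fahrenheit
sales_pints = [9, 8, 5, 4]   # thousands of pints

a, b = fit_line(temps_f, sales_pints)

# Route 1: substitute F = 9C/5 + 32 and y = 1.76Y into y = a + b*F.
sub_intercept = (a + 32 * b) / 1.76
sub_slope = 9 * b / (5 * 1.76)

# Route 2: convert every data point, then fit a new line.
temps_c = [5 * (f - 32) / 9 for f in temps_f]
sales_litres = [y / 1.76 for y in sales_pints]
refit_intercept, refit_slope = fit_line(temps_c, sales_litres)
```

The two routes agree up to floating-point rounding, which is why the original data need not be converted.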
7
Graphs showing milk sales against temperature
Product Moment Correlation Coefficient (p.m.c.c.)
The correlation coefficient measures how closely the points lie to a straight line, so it is not altered by linear scaling with positive scale factors, as here. (Although the scales are different on the diagrams, we can see that the scatter of the points is unchanged.)
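The invariance can be checked numerically. The sketch below uses hypothetical data and a hand-written pmcc helper; it also shows the one case to watch, a negative scale factor, which reverses the sign of the coefficient but not its size.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient, r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical data for illustration only.
temps_f = [38, 46, 55, 64, 70]
sales_pints = [10, 8, 7, 4, 3]

# Linear conversions with positive scale factors.
temps_c = [5 * (f - 32) / 9 for f in temps_f]
sales_litres = [y / 1.76 for y in sales_pints]

r_original = pmcc(temps_f, sales_pints)
r_converted = pmcc(temps_c, sales_litres)

# A negative scale factor flips the sign of r.
r_flipped = pmcc([-f for f in temps_f], sales_pints)
```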
8
e.g. 2 A set of data connects two variables, p and t. However, in order to calculate a regression line, the data have been coded using the formulae x = 60t and y = p
If the regression line of y on x is y = 3·29 + 4·15x, find the equation of the regression line for p on t.
Solution: Substitute for y and x:
9
SUMMARY
To convert a scaled or coded regression equation, substitute for the variables using the conversion formulae.
The product moment correlation coefficient (p.m.c.c.) is not changed by linear scaling.
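The substitution rule can be written once for any linear coding. This is a sketch under stated assumptions: decode_line and its parameter names are invented here, and the y coding used in the example (y = p - 20) is hypothetical; only x = 60t and y = 3·29 + 4·15x come from e.g. 2.

```python
def decode_line(a, b, x_scale, x_shift, y_scale, y_shift):
    """Undo a linear coding of a regression line.

    Coded line:  y = a + b*x
    Codings:     x = x_scale*t + x_shift,  y = y_scale*p + y_shift
    Returns (intercept, slope) of the line p = intercept + slope*t.
    """
    if y_scale == 0:
        raise ValueError("y_scale must be non-zero")
    # y_scale*p + y_shift = a + b*(x_scale*t + x_shift), solved for p.
    intercept = (a + b * x_shift - y_shift) / y_scale
    slope = b * x_scale / y_scale
    return intercept, slope


# x = 60t as in e.g. 2; y = p - 20 is a HYPOTHETICAL coding for illustration.
intercept, slope = decode_line(3.29, 4.15,
                               x_scale=60, x_shift=0,
                               y_scale=1, y_shift=-20)
```

With these assumed codings the decoded line is p = 23·29 + 249t, up to floating-point rounding.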
10
Exercise 1. Data for 2 variables, v and z, have been scaled using the formulae [formulae missing]. If the equation of the resulting y on x regression line is [equation missing], find the equation of the regression line of v on z.
Solution: [equation missing] becomes [equation missing]
11
Exercise 2. A company rep. records the distance travelled, m miles, and time taken, t minutes, for 5 journeys. The data were converted to kilometres and hours and are summarised below. [summary data missing] The formulae for the conversions are [formula missing] and [formula missing].
(a) Find the equation of the y on x regression line, giving the values of the constants correct to 2 d.p.
(b) Use your answer to (a) to find the equation of the regression line of t on m.
(c) What effect would the conversion have on the product moment correlation coefficient?
12
Solution:
(a) The y on x regression line: [equation missing]
(b) The equation of the regression line of t on m: [equation missing] becomes [equation missing], so [equation missing]
(c) The conversion has no effect on the p.m.c.c.
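Part (a) works from summary figures rather than raw data. A minimal sketch of that calculation, assuming the usual Sxy/Sxx formulae; the function name line_from_summary and the totals below are hypothetical, since the exercise's own summary figures are not reproduced here.

```python
def line_from_summary(n, sum_x, sum_y, sum_xx, sum_xy):
    """Regression line of y on x from summary totals, constants to 2 d.p.

    Sxx = sum(x^2) - (sum x)^2 / n
    Sxy = sum(xy)  - (sum x)(sum y) / n
    slope b = Sxy / Sxx, intercept a = mean(y) - b * mean(x)
    """
    sxx = sum_xx - sum_x ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    b = sxy / sxx
    a = sum_y / n - b * sum_x / n
    return round(a, 2), round(b, 2)


# Hypothetical totals for 5 journeys, NOT the figures from the exercise.
a, b = line_from_summary(n=5, sum_x=15, sum_y=20, sum_xx=55, sum_xy=66)
```

The resulting y on x line is then converted to t on m by substituting the conversion formulae, as in part (b).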