Transformations
Transformations to Linearity
Many non-linear curves can be put into a linear form by appropriate transformations of either
– the dependent variable $Y$, or
– some (or all) of the independent variables $X_1, X_2, \ldots, X_p$.
This leads to the wide utility of the linear model. We have seen that, through the use of dummy variables, categorical independent variables can be incorporated into a linear model. We will now see that, through the technique of variable transformation, many examples of non-linear behaviour can also be converted to linear behaviour.
Intrinsically Linear (Linearizable) Curves
1. Hyperbolas: $y = \dfrac{x}{ax - b}$
Linear form: $1/y = a - b\,(1/x)$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = 1/y$, $X = 1/x$, $\beta_0 = a$, $\beta_1 = -b$
2. Exponential: $y = a e^{bx} = a B^{x}$
Linear form: $\ln y = \ln a + b x = \ln a + (\ln B)\,x$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = x$, $\beta_0 = \ln a$, $\beta_1 = b = \ln B$
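3. Power Functions: $y = a x^{b}$
Linear form: $\ln y = \ln a + b \ln x$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = \ln x$, $\beta_0 = \ln a$, $\beta_1 = b$
4. Logarithmic Functions: $y = a + b \ln x$
Linear form: $y = a + b \ln x$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = y$, $X = \ln x$, $\beta_0 = a$, $\beta_1 = b$
5. Other special functions: $y = a e^{b/x}$
Linear form: $\ln y = \ln a + b\,(1/x)$, or $Y = \beta_0 + \beta_1 X$
Transformations: $Y = \ln y$, $X = 1/x$, $\beta_0 = \ln a$, $\beta_1 = b$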
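The linearizing transformations listed above can be tried out numerically. The following is a minimal sketch, not taken from the slides: it generates hypothetical noise-free data from a power function and from a hyperbola (the parameter values are made up for illustration), applies the stated transformations, and fits a straight line by ordinary least squares. The helper name `fit_line` is an assumption of this sketch.

```python
import math

def fit_line(X, Y):
    """Ordinary least squares for Y = b0 + b1 * X; returns (b0, b1)."""
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    sxx = sum((x - mean_x) ** 2 for x in X)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

xs = [1.0, 2.0, 4.0, 8.0, 16.0]

# Power function y = a * x**b with hypothetical a = 3, b = 0.5.
# Transformations: Y = ln y, X = ln x, beta0 = ln a, beta1 = b.
power_y = [3.0 * x ** 0.5 for x in xs]
b0, b1 = fit_line([math.log(x) for x in xs], [math.log(y) for y in power_y])
power_a, power_b = math.exp(b0), b1

# Hyperbola y = x / (a*x - b) with hypothetical a = 2, b = 1.
# Transformations: Y = 1/y, X = 1/x, beta0 = a, beta1 = -b.
hyper_y = [x / (2.0 * x - 1.0) for x in xs]
h0, h1 = fit_line([1.0 / x for x in xs], [1.0 / y for y in hyper_y])
hyper_a, hyper_b = h0, -h1

print(power_a, power_b, hyper_a, hyper_b)
```

With noise-free data the original parameters are recovered up to rounding. With noisy data, least squares on the transformed scale is generally not the same fit as least squares on the original scale.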
Polynomial Models
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$
Linear form: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$
Variables: $Y = y$, $X_1 = x$, $X_2 = x^2$, $X_3 = x^3$
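A polynomial model is linear in its coefficients, so it can be fitted as a multiple regression on the columns $1, x, x^2, x^3$. The sketch below illustrates this with hypothetical noise-free data; the helpers `solve` and `ols` are written for this example (they solve the normal equations directly, which is fine for a small illustration but is not how production software usually does it).

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical cubic: y = 1 + 2x - 0.5x^2 + 0.1x^3
true_beta = [1.0, 2.0, -0.5, 0.1]
xs = [0.5 * i for i in range(7)]
ys = [sum(c * x ** p for p, c in enumerate(true_beta)) for x in xs]

# Design matrix with columns 1, X1 = x, X2 = x^2, X3 = x^3
rows = [[1.0, x, x ** 2, x ** 3] for x in xs]
beta_hat = ols(rows, ys)
print(beta_hat)
```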
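Exponential Models with a polynomial exponent
$y = \exp(\beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4)$
Linear form: $\ln y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4$
Variables: $Y = \ln y$, $X_1 = x$, $X_2 = x^2$, $X_3 = x^3$, $X_4 = x^4$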
Trigonometric Polynomial Models
$y = \beta_0 + \gamma_1 \cos(2\pi f_1 x) + \delta_1 \sin(2\pi f_1 x) + \cdots + \gamma_k \cos(2\pi f_k x) + \delta_k \sin(2\pi f_k x)$
Linear form: $Y = \beta_0 + \gamma_1 C_1 + \delta_1 S_1 + \cdots + \gamma_k C_k + \delta_k S_k$
Variables: $Y = y$, $C_1 = \cos(2\pi f_1 x)$, $S_1 = \sin(2\pi f_1 x)$, …, $C_k = \cos(2\pi f_k x)$, $S_k = \sin(2\pi f_k x)$
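Because the frequencies are known constants, the cosine and sine terms are just extra regression columns. The sketch below builds those columns for two hypothetical frequencies and recovers made-up coefficients from noise-free data; `solve`, `ols`, and `trig_row` are helper names assumed for this example.

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

freqs = [1 / 12, 2 / 12]  # hypothetical known frequencies f_1, f_2

def trig_row(x):
    """Columns 1, C_1, S_1, C_2, S_2 for one observation."""
    row = [1.0]
    for f in freqs:
        row += [math.cos(2 * math.pi * f * x), math.sin(2 * math.pi * f * x)]
    return row

# Hypothetical coefficients: intercept, then (cos, sin) pairs
true_coef = [5.0, 2.0, -1.0, 0.5, 0.25]
xs = list(range(24))  # two full periods of the lowest frequency
rows = [trig_row(x) for x in xs]
ys = [sum(c * v for c, v in zip(true_coef, r)) for r in rows]

coef_hat = ols(rows, ys)
print(coef_hat)
```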
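Response Surface Models
Dependent variable $Y$ and two independent variables $x_1$ and $x_2$. (These ideas are easily extended to more than two independent variables.)
The model (a cubic response surface model):
$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_1 x_2 + \beta_5 x_2^2 + \beta_6 x_1^3 + \beta_7 x_1^2 x_2 + \beta_8 x_1 x_2^2 + \beta_9 x_2^3 + \varepsilon$
or $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \beta_7 X_7 + \beta_8 X_8 + \beta_9 X_9 + \varepsilon$
where $X_1 = x_1$, $X_2 = x_2$, $X_3 = x_1^2$, $X_4 = x_1 x_2$, $X_5 = x_2^2$, $X_6 = x_1^3$, $X_7 = x_1^2 x_2$, $X_8 = x_1 x_2^2$, $X_9 = x_2^3$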
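A cubic response surface in two variables has nine regressors: every monomial in $x_1$ and $x_2$ of total degree 1 to 3. The sketch below generates them in one possible order (by total degree, then by increasing power of $x_2$); both the ordering and the function name `surface_terms` are assumptions of this example.

```python
def surface_terms(x1, x2, degree=3):
    """Return all monomials x1**i * x2**j with 1 <= i + j <= degree."""
    terms = []
    for d in range(1, degree + 1):
        for j in range(d + 1):
            terms.append(x1 ** (d - j) * x2 ** j)
    return terms

# One row of regressors X1..X9 for the point (x1, x2) = (2, 3)
row = surface_terms(2, 3)
print(row)
```

Each observation contributes one such row, and the model is then fitted as an ordinary multiple regression on these columns.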
The Box-Cox Family of Transformations
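The formula for this slide did not survive extraction. As a hedged reminder, the usual definition of the Box-Cox family for $y > 0$ is $y^{(\lambda)} = (y^{\lambda} - 1)/\lambda$ for $\lambda \neq 0$ and $y^{(\lambda)} = \ln y$ for $\lambda = 0$. The sketch below implements that textbook definition; the function name `box_cox` and the example values are assumptions of this sketch, not the slide's.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y with power parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Transform the same value with several powers
powers = [-1, -0.5, 0, 0.5, 1, 2]
transformed = {lam: box_cox(4.0, lam) for lam in powers}
print(transformed)
```

The log case is the limit of the power case as $\lambda \to 0$, which is why the family is continuous in $\lambda$.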
The Transformation Staircase
The Bulging Rule
[Figure: bulging rule diagram, with the four directions labelled "x up", "y up", "y down", "x down".]
Non-Linear Models: Nonlinearizable Models
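Non-Linear Growth Models
Many models cannot be transformed into a linear model.
The Mechanistic Growth Model
Equation (standard form): $Y = \alpha\,(1 - \beta e^{-kx}) + \varepsilon$
or (ignoring $\varepsilon$) "rate of increase in Y" $= \dfrac{dY}{dx} = k\,(\alpha - Y)$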
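The Logistic Growth Model
Equation (standard form): $Y = \dfrac{\alpha}{1 + \beta e^{-kx}} + \varepsilon$
or (ignoring $\varepsilon$) "rate of increase in Y" $= \dfrac{dY}{dx} = \dfrac{k\,Y\,(\alpha - Y)}{\alpha}$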
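The Gompertz Growth Model
Equation (standard form): $Y = \alpha \exp(-\beta e^{-kx}) + \varepsilon$
or (ignoring $\varepsilon$) "rate of increase in Y" $= \dfrac{dY}{dx} = k\,Y \ln\!\left(\dfrac{\alpha}{Y}\right)$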
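Each growth model is described both by a curve and by a "rate of increase" equation. The sketch below checks numerically that the two descriptions agree, assuming the textbook parameterizations $Y = \alpha(1 - \beta e^{-kx})$, $Y = \alpha/(1 + \beta e^{-kx})$ and $Y = \alpha \exp(-\beta e^{-kx})$ with rates $k(\alpha - Y)$, $kY(\alpha - Y)/\alpha$ and $kY\ln(\alpha/Y)$. The parameter values and all names are hypothetical.

```python
import math

ALPHA, BETA, K = 10.0, 0.8, 0.5  # hypothetical parameter values

def mechanistic(x):
    return ALPHA * (1 - BETA * math.exp(-K * x))

def logistic(x):
    return ALPHA / (1 + BETA * math.exp(-K * x))

def gompertz(x):
    return ALPHA * math.exp(-BETA * math.exp(-K * x))

# Rate of increase dY/dx written as a function of the current level Y
RATES = {
    mechanistic: lambda y: K * (ALPHA - y),
    logistic: lambda y: K * y * (ALPHA - y) / ALPHA,
    gompertz: lambda y: K * y * math.log(ALPHA / y),
}

def slope(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Largest disagreement between the numerical slope and the rate equation
max_gap = max(
    abs(slope(f, x) - rate(f(x)))
    for f, rate in RATES.items()
    for x in (0.0, 1.0, 2.5, 6.0)
)
print(max_gap)
```

None of these curves can be made linear in all of $\alpha$, $\beta$ and $k$ by transforming $Y$ or $x$, so they have to be fitted by non-linear least squares.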
Example: daily auto accidents in Saskatchewan from 1984 to 1992
Data collected:
1. Date
2. Number of accidents
Factors we want to consider:
1. Trend
2. Yearly cyclical effect
3. Day of the week effect
4. Holiday effects
Trend
This will be modeled by a linear function, $Y = \beta_0 + \beta_1 X$, or more generally by a polynomial, $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \cdots$
Yearly Cyclical Trend
This will be modeled by a trig polynomial, that is, sine and cosine functions with differing frequencies (periods): $Y = \delta_1 \sin(2\pi f_1 X) + \gamma_1 \cos(2\pi f_1 X) + \delta_2 \sin(2\pi f_2 X) + \gamma_2 \cos(2\pi f_2 X) + \cdots$
Day of the week effect
This will be modeled using "dummy" variables: $\theta_1 D_1 + \theta_2 D_2 + \theta_3 D_3 + \theta_4 D_4 + \theta_5 D_5 + \theta_6 D_6$, where $D_i = 1$ if day of week $= i$ and $D_i = 0$ otherwise.
Holiday Effects
These will also be modeled using "dummy" variables.
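Independent variables
X = day, D1, D2, D3, D4, D5, D6, S1, S2, S3, S4, S5, S6, C1, C2, C3, C4, C5, C6, NYE, HW, V1, V2, cd, T1, T2.
$S_i = \sin(\omega \cdot i \cdot \text{day})$ and $C_i = \cos(\omega \cdot i \cdot \text{day})$, where $\omega$ is the base angular frequency of the yearly cycle.
Dependent variable
Y = daily accident frequency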
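The trend, day-of-week and cyclical regressors listed above can be assembled into one design-matrix row per day. The sketch below is illustrative only: the base frequency `OMEGA = 2π/365.25` (one cycle per year), the weekday coding (0 as the baseline day, 1 to 6 for D1 to D6) and the function name `design_row` are assumptions. The holiday variables (NYE, HW, V1, V2, cd, T1, T2) are left out because their definitions are not given here.

```python
import math

OMEGA = 2 * math.pi / 365.25  # assumed: one full cycle per year, in radians per day

def design_row(day, weekday, n_harmonics=6):
    """Regressors for one day: [day, D1..D6, S1..S6, C1..C6].

    day     -- running day index (trend term)
    weekday -- 0..6, where 0 is the baseline day with all dummies equal to 0
    """
    dummies = [1.0 if weekday == i else 0.0 for i in range(1, 7)]
    sines = [math.sin(OMEGA * i * day) for i in range(1, n_harmonics + 1)]
    cosines = [math.cos(OMEGA * i * day) for i in range(1, n_harmonics + 1)]
    return [float(day)] + dummies + sines + cosines

row = design_row(day=10, weekday=3)
print(len(row), row[:7])
```

Only six dummies are needed for seven weekdays: the seventh day is absorbed into the intercept, which avoids exact collinearity among the regressors.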
[Stepwise regression output for the dependent variable PACC; numerical values not recovered. Analysis of variance table: sum of squares, DF, mean square and F ratio for regression and residual. Variables in equation: coefficient, std. error of coefficient, std. reg. coefficient, tolerance, F to remove, level. Variables not in equation: partial correlation, tolerance, F to enter, level. Final message: F levels (4.000, 3.900) or tolerance insufficient for further stepping.]
[Figure: Day of the week effects (the six D variables)]
[Figure: Holiday effects (NYE, HW, T)]
[Figure: Cyclical effects (S and C variables)]
Trigonometric Polynomials
Note: $\beta_0, \gamma_1, \delta_1, \ldots, \gamma_k, \delta_k$ are parameters that have to be estimated; $f_1, f_2, f_3, \ldots, f_k$ are known constants (the frequencies in the trig polynomial).