Transformations.


Transformations to Linearity
Many non-linear curves can be put into linear form by appropriate transformations of either the dependent variable Y or some (or all) of the independent variables X1, X2, ..., Xp. This is one reason the linear model is so widely useful. We have seen that categorical independent variables can be incorporated into a linear model through the use of dummy variables. We will now see that, through variable transformation, many examples of non-linear behaviour can also be converted to linear behaviour.

Intrinsically Linear (Linearizable) Curves
1. Hyperbolas: y = x/(ax - b)
Linear form: 1/y = a - b(1/x), or Y = β0 + β1 X
Transformations: Y = 1/y, X = 1/x, β0 = a, β1 = -b
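The hyperbola case can be checked numerically. The sketch below is illustrative only: the values of a and b and the synthetic, noise-free data are assumptions, not taken from the slides. It fits a straight line to (1/x, 1/y) and back-transforms the intercept and slope.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
a_true, b_true = 2.0, 0.5

# Noise-free data from y = x / (a*x - b); x >= 1 keeps the denominator positive.
x = np.linspace(1.0, 10.0, 20)
y = x / (a_true * x - b_true)

# Transform both variables: Y = 1/y and X = 1/x give a straight line.
X = 1.0 / x
Y = 1.0 / y

# Least-squares line; np.polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(X, Y, 1)

# Back-transform: intercept = a, slope = -b.
a_hat, b_hat = intercept, -slope
print(a_hat, b_hat)
```

With noisy data the same steps apply, but least squares is then minimizing error on the 1/y scale rather than on the original y scale.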

2. Exponential: y = a e^{bx} = a B^x
Linear form: ln y = ln a + b x = ln a + (ln B) x, or Y = β0 + β1 X
Transformations: Y = ln y, X = x, β0 = ln a, β1 = b = ln B
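A minimal sketch of the exponential case, assuming made-up values for a and b and noise-free data: regress ln y on x, then exponentiate the intercept to recover a.

```python
import numpy as np

# Hypothetical parameter values for illustration only.
a_true, b_true = 3.0, 0.4

x = np.linspace(0.0, 5.0, 25)
y = a_true * np.exp(b_true * x)

# Only the response is transformed: ln y is linear in x.
slope, intercept = np.polyfit(x, np.log(y), 1)

a_hat = np.exp(intercept)   # intercept = ln a
b_hat = slope               # slope = b
B_hat = np.exp(slope)       # equivalent base in y = a * B**x
print(a_hat, b_hat, B_hat)
```

Fitting on the log scale implicitly treats the error as multiplicative on the original scale, which is worth keeping in mind when the data are noisy.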

3. Power Functions: y = a x^b
Linear form: ln y = ln a + b ln x, or Y = β0 + β1 X
Transformations: Y = ln y, X = ln x, β0 = ln a, β1 = b
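For the power function both variables are logged. The following sketch uses assumed values of a and b and noise-free data; x must be strictly positive for ln x to exist.

```python
import numpy as np

# Hypothetical parameter values for illustration only.
a_true, b_true = 1.5, 0.75

x = np.linspace(1.0, 20.0, 30)   # strictly positive so that ln x is defined
y = a_true * x ** b_true

# Log-log regression: ln y = ln a + b * ln x.
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)

a_hat = np.exp(intercept)   # intercept = ln a
b_hat = slope               # slope = b
print(a_hat, b_hat)
```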

4. Logarithmic Functions: y = a + b ln x
Linear form: y = a + b ln x, or Y = β0 + β1 X
Transformations: Y = y, X = ln x, β0 = a, β1 = b

5. Other special functions: y = a e^{b/x}
Linear form: ln y = ln a + b(1/x), or Y = β0 + β1 X
Transformations: Y = ln y, X = 1/x, β0 = ln a, β1 = b

Polynomial Models
y = β0 + β1 x + β2 x^2 + β3 x^3
Linear form: Y = β0 + β1 X1 + β2 X2 + β3 X3
Variables: Y = y, X1 = x, X2 = x^2, X3 = x^3
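A polynomial is linear in its coefficients, so it can be fitted by ordinary least squares once the powers of x are treated as separate columns. The sketch below assumes arbitrary coefficient values and noise-free data.

```python
import numpy as np

# Hypothetical coefficients (intercept, linear, quadratic, cubic).
beta_true = np.array([1.0, -2.0, 0.5, 0.1])

x = np.linspace(-3.0, 3.0, 40)

# Design matrix with columns 1, X1 = x, X2 = x^2, X3 = x^3.
X = np.column_stack([np.ones_like(x), x, x**2, x**3])
y = X @ beta_true

# Ordinary least squares on the constructed columns.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```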

Exponential Models with a polynomial exponent
y = e^{β0 + β1 x + β2 x^2 + β3 x^3 + β4 x^4}
Linear form: ln y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4
Variables: Y = ln y, X1 = x, X2 = x^2, X3 = x^3, X4 = x^4

Trigonometric Polynomials

Trigonometric Polynomial Models
y = β0 + γ1 cos(2πν1 x) + δ1 sin(2πν1 x) + … + γk cos(2πνk x) + δk sin(2πνk x)
Linear form: Y = β0 + γ1 C1 + δ1 S1 + … + γk Ck + δk Sk
Variables: Y = y, C1 = cos(2πν1 x), S1 = sin(2πν1 x), …, Ck = cos(2πνk x), Sk = sin(2πνk x)
β0, δ1, γ1, …, δk, γk are parameters that have to be estimated; ν1, ν2, ν3, …, νk are known constants (the frequencies in the trig polynomial).
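Because the frequencies are known, each cosine and sine term is just another column in a linear model. The sketch below is an illustration under assumed frequencies and coefficients; the helper name `trig_design` is hypothetical.

```python
import numpy as np

def trig_design(x, freqs):
    """Columns: 1, cos(2*pi*f1*x), sin(2*pi*f1*x), ..., one cos/sin pair per frequency."""
    cols = [np.ones_like(x)]
    for f in freqs:
        cols.append(np.cos(2 * np.pi * f * x))
        cols.append(np.sin(2 * np.pi * f * x))
    return np.column_stack(cols)

# Assumed known frequencies and made-up coefficients
# ordered as (intercept, cos1, sin1, cos2, sin2).
freqs = [1.0, 2.0]
coef_true = np.array([5.0, 2.0, -1.0, 0.5, 0.25])

x = np.linspace(0.0, 1.0, 200, endpoint=False)
X = trig_design(x, freqs)
y = X @ coef_true

# Ordinary least squares recovers the intercept and the cos/sin coefficients.
coef_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef_hat)
```

If the frequencies themselves had to be estimated, the model would no longer be linear in its parameters and this approach would not apply.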

Response Surface models
Dependent variable Y and two independent variables x1 and x2. (These ideas are easily extended to more than two independent variables.)
The Model (a cubic response surface model):
Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + β5 X5 + β6 X6 + β7 X7 + β8 X8 + β9 X9 + ε
where X1, …, X9 are the nine monomials in x1 and x2 of degree one to three: x1, x2, x1^2, x1 x2, x2^2, x1^3, x1^2 x2, x1 x2^2 and x2^3.
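A cubic surface in two variables has nine non-constant terms, which is why the linear form has nine predictors. The sketch below builds those columns and fits them by least squares. The column ordering, the helper name `cubic_surface_design`, and the synthetic data are assumptions made for illustration; the slides do not specify which monomial corresponds to which predictor.

```python
import numpy as np

def cubic_surface_design(x1, x2):
    """Intercept plus all monomials x1^p * x2^q with 1 <= p + q <= 3."""
    cols = [np.ones_like(x1)]
    for degree in (1, 2, 3):
        for q in range(degree + 1):
            cols.append(x1 ** (degree - q) * x2 ** q)
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1.0, 1.0, size=(2, 60))

X = cubic_surface_design(x1, x2)          # 60 rows, 1 + 9 columns
beta_true = np.arange(10, dtype=float)    # made-up coefficients
y = X @ beta_true

# Ordinary least squares on the monomial columns.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(X.shape, beta_hat)
```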

The Box-Cox Family of Transformations
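The content of this slide did not survive in the transcript. As a hedged reference point, the sketch below implements the usual textbook definition of the Box-Cox power transformation, (y^λ - 1)/λ for λ ≠ 0 and ln y for λ = 0, which may differ in detail from what the slide showed. The function name `box_cox` is a hypothetical helper.

```python
import numpy as np

def box_cox(y, lam):
    """Standard Box-Cox transform of strictly positive data for a given lambda."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(y)           # limiting case as lambda -> 0
    return (y ** lam - 1.0) / lam

y = np.array([1.0, 2.0, 3.0])
print(box_cox(y, 1.0))   # lambda = 1 only shifts the data by 1
print(box_cox(y, 0.5))   # square-root-type transformation
print(box_cox(y, 0.0))   # natural log
```

Subtracting 1 and dividing by λ makes the family continuous in λ, so the log transformation sits naturally between the positive and negative powers.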

The Transformation Staircase

The Bulging Rule
[Figure: bulging rule diagram with directions labelled "x up", "y up", "y down", "x down"]

Non-Linear Models: Nonlinearizable models

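Non-Linear Growth Models
Many models cannot be transformed into a linear model.

The Mechanistic Growth Model
Equation: [equation not preserved in the transcript]; or, ignoring ε, "rate of increase in Y" = [expression not preserved]

The Logistic Growth Model
Equation: [equation not preserved in the transcript]; or, ignoring ε, "rate of increase in Y" = [expression not preserved]

The Gompertz Growth Model
Equation: [equation not preserved in the transcript]; or, ignoring ε, "rate of increase in Y" = [expression not preserved]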

Example: daily auto accidents in Saskatchewan from 1984 to 1992
Data collected:
1. Date
2. Number of accidents
Factors we want to consider:
- Trend
- Yearly cyclical effect
- Day of the week effect
- Holiday effects

Trend
This will be modeled by a linear function: Y = β0 + β1 X
(more generally, a polynomial): Y = β0 + β1 X + β2 X^2 + β3 X^3 + …

Yearly Cyclical Trend
This will be modeled by a trig polynomial, that is, sin and cos functions with differing frequencies (periods):
Y = δ1 sin(2πf1 X) + γ1 cos(2πf1 X) + δ2 sin(2πf2 X) + γ2 cos(2πf2 X) + …

Day of the week effect
This will be modeled using "dummy" variables: α1 D1 + α2 D2 + α3 D3 + α4 D4 + α5 D5 + α6 D6, where Di = 1 if day of week = i and 0 otherwise. Six dummies are enough for seven days because the remaining day serves as the baseline absorbed by the intercept.

Holiday Effects
These will also be modeled using "dummy" variables.

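Independent variables
X = day, D1, D2, D3, D4, D5, D6, S1, S2, S3, S4, S5, S6, C1, C2, C3, C4, C5, C6, NYE, HW, V1, V2, cd, T1, T2
Si = sin(0.017202423838959 * i * day)
Ci = cos(0.017202423838959 * i * day)
The constant 0.017202423838959 equals 2π/365.25, so Si and Ci are the i-th harmonics of a yearly cycle.
Dependent variable
Y = daily accident frequency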
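The trend, day-of-week and cyclical predictors listed above can be assembled into a design matrix as sketched below. Several details are assumptions, since the slides do not state them: the day index is taken to start at 1, weekday codes follow Python's ISO convention (1 = Monday, with Sunday as the baseline), and the holiday variables (NYE, HW, V1, V2, cd, T1, T2) are left out because their definitions are not given. The function name `accident_design` is hypothetical.

```python
import numpy as np
from datetime import date, timedelta

OMEGA = 2 * np.pi / 365.25   # matches the constant 0.0172024... used for Si and Ci

def accident_design(start, n_days):
    """Columns: day, D1..D6, S1..S6, C1..C6 (19 columns, no holiday dummies)."""
    day = np.arange(1, n_days + 1, dtype=float)
    # ISO weekday codes: 1 = Monday, ..., 7 = Sunday (assumed coding).
    weekday = np.array([(start + timedelta(days=i)).isoweekday()
                        for i in range(n_days)])
    D = np.column_stack([(weekday == i).astype(float) for i in range(1, 7)])
    S = np.column_stack([np.sin(OMEGA * i * day) for i in range(1, 7)])
    C = np.column_stack([np.cos(OMEGA * i * day) for i in range(1, 7)])
    return np.column_stack([day, D, S, C])

# Two weeks starting on an arbitrary illustrative date.
X = accident_design(date(1984, 1, 1), 14)
print(X.shape)
```

A column of ones (or a fitting routine that adds an intercept) would be needed alongside this matrix before running the regression.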

ANALYSIS OF VARIANCE

              SUM OF SQUARES     DF    MEAN SQUARE    F RATIO
REGRESSION         976292.38     18       54238.46     114.60
RESIDUAL           1547102.1   3269       473.2646

VARIABLES IN EQUATION FOR PACC

VARIABLE      COEFFICIENT   STD. ERROR OF COEFF   STD REG COEFF   TOLERANCE   F TO REMOVE   LEVEL
(Y-INTERCEPT     60.48909)
day   1       0.11107E-02        0.4017E-03            0.038        0.99005          7.64       1
D1    9           4.99945            1.4272            0.063        0.57785         12.27       1
D2   10           9.86107            1.4200            0.124        0.58367         48.22       1
D3   11           9.43565            1.4195            0.119        0.58311         44.19       1
D4   12          13.84377            1.4195            0.175        0.58304         95.11       1
D5   13          28.69194            1.4185            0.363        0.58284        409.11       1
D6   14          21.63193            1.4202            0.273        0.58352        232.00       1
S1   15          -7.89293            0.5413           -0.201        0.98285        212.65       1
S2   16          -3.41996            0.5385           -0.087        0.99306         40.34       1
S4   18          -3.56763            0.5386           -0.091        0.99276         43.88       1
C1   21          15.40978            0.5384            0.393        0.99279        819.12       1
C2   22           7.53336            0.5397            0.192        0.98816        194.85       1
C3   23          -3.67034            0.5399           -0.094        0.98722         46.21       1
C4   24          -1.40299            0.5392           -0.036        0.98999          6.77       1
C5   25          -1.36866            0.5393           -0.035        0.98955          6.44       1
NYE  27          32.46759            7.3664            0.061        0.97171         19.43       1
HW   28          35.95494            7.3516            0.068        0.97565         23.92       1
T2   33         -18.38942            7.4039           -0.035        0.96191          6.17       1

VARIABLES NOT IN EQUATION

VARIABLE      PARTIAL CORR.   TOLERANCE   F TO ENTER   LEVEL
IACC  7             0.49837     0.78647      1079.91       0
Dths  8             0.04788     0.93491         7.51       0
S3   17            -0.02761     0.99511         2.49       1
S5   19            -0.01625     0.99348         0.86       1
S6   20            -0.00489     0.99539         0.08       1
C6   26            -0.02856     0.98788         2.67       1
V1   29            -0.01331     0.96168         0.58       1
V2   30            -0.02555     0.96088         2.13       1
cd   31             0.00555     0.97172         0.10       1
T1   32             0.00000     0.00000         0.00       1

***** F LEVELS( 4.000, 3.900) OR TOLERANCE INSUFFICIENT FOR FURTHER STEPPING

Day of the week effects
D1   4.99945
D2   9.86107
D3   9.43565
D4  13.84377
D5  28.69194
D6  21.63193

Holiday Effects
NYE  32.46759
HW   35.95494
T2  -18.38942

Cyclical Effects
S1  -7.89293
S2  -3.41996
S4  -3.56763
C1  15.40978
C2   7.53336
C3  -3.67034
C4  -1.40299
C5  -1.36866
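A sine/cosine pair at one frequency is easier to read as an amplitude and a phase, using the identity γ cos(θ) + δ sin(θ) = A cos(θ - φ) with A = sqrt(γ^2 + δ^2) and φ = atan2(δ, γ). The sketch below applies this to the fitted annual harmonic (the C1 and S1 coefficients above). It assumes the angular frequency is 2π/365.25 per day; the peak position is measured from the origin of the day index, which the slides do not state.

```python
import math

gamma1 = 15.40978    # fitted coefficient of C1 = cos(omega * day)
delta1 = -7.89293    # fitted coefficient of S1 = sin(omega * day)
omega = 2 * math.pi / 365.25   # assumed yearly angular frequency per day

# Amplitude and phase of the combined annual wave A * cos(omega * day - phase).
amplitude = math.hypot(gamma1, delta1)
phase = math.atan2(delta1, gamma1)

# The wave peaks where omega * day equals the phase (modulo one year).
peak_day = (phase / omega) % 365.25

print(round(amplitude, 2), round(peak_day, 1))
```

The amplitude is in the same units as the response (accidents per day), so the annual harmonic alone moves the expected daily count by roughly that much above and below the yearly average.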