Fitting Curves to Data 1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 5: Fitting Curves to Data Terry Dielman Applied Regression.

Slides:



Advertisements
Similar presentations
Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression Chapter.
Advertisements

Copyright © 2011 Pearson Education, Inc. Curved Patterns Chapter 20.
Copyright © 2014, 2011 Pearson Education, Inc. 1 Chapter 20 Curved Patterns.
Inference for Regression
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 13 Nonlinear and Multiple Regression.
Objectives (BPS chapter 24)
1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Simple Linear Regression Estimates for single and mean responses.
1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Summarizing Bivariate Data Introduction to Linear Regression.
Problems in Applying the Linear Regression Model Appendix 4A
Chapter 13 Multiple Regression
Checking Assumptions 1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 6 Assessing the Assumptions of the Regression Model Terry.
Copyright © 2008 by the McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Managerial Economics, 9e Managerial Economics Thomas Maurice.
Part I – MULTIVARIATE ANALYSIS C3 Multiple Linear Regression II © Angel A. Juan & Carles Serrat - UPC 2007/2008.
Chapter 12 Multiple Regression
Econ 140 Lecture 131 Multiple Regression Models Lecture 13.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics 11 th Edition.
REGRESSION AND CORRELATION
Multiple Linear Regression
Part 18: Regression Modeling 18-1/44 Statistics and Data Analysis Professor William Greene Stern School of Business IOMS Department Department of Economics.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 15-1 Chapter 15 Multiple Regression Model Building Basic Business Statistics 11 th Edition.
Linear Regression Models Powerful modeling technique Tease out relationships between “independent” variables and 1 “dependent” variable Models not perfect…need.
Simple Linear Regression and Correlation
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Descriptive measures of the strength of a linear association r-squared and the (Pearson) correlation coefficient r.
Slide 1 DSCI 5180: Introduction to the Business Decision Process Spring 2013 – Dr. Nick Evangelopoulos Lectures 4-5: Simple Regression Analysis (Ch. 3)
Transforming to Achieve Linearity
Regression and Correlation Methods Judy Zhong Ph.D.
Inference for regression - Simple linear regression
Chapter 11 Simple Regression
© 2004 Prentice-Hall, Inc.Chap 15-1 Basic Business Statistics (9 th Edition) Chapter 15 Multiple Regression Model Building.
CHAPTER 14 MULTIPLE REGRESSION
Chapter 12: Linear Regression 1. Introduction Regression analysis and Analysis of variance are the two most widely used statistical procedures. Regression.
Regression Examples. Gas Mileage 1993 SOURCES: Consumer Reports: The 1993 Cars - Annual Auto Issue (April 1993), Yonkers, NY: Consumers Union. PACE New.
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 12-1 Correlation and Regression.
Introduction to Linear Regression
1 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 5 Summarizing Bivariate Data.
Slide 1 DSCI 5180: Introduction to the Business Decision Process Spring 2013 – Dr. Nick Evangelopoulos Lectures 5-6: Simple Regression Analysis (Ch. 3)
Summarizing Bivariate Data
M25- Growth & Transformations 1  Department of ISM, University of Alabama, Lesson Objectives: Recognize exponential growth or decay. Use log(Y.
Chap 14-1 Copyright ©2012 Pearson Education, Inc. publishing as Prentice Hall Chap 14-1 Chapter 14 Introduction to Multiple Regression Basic Business Statistics.
Copyright © 2005 by the McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Managerial Economics Thomas Maurice eighth edition Chapter 4.
Part 2: Model and Inference 2-1/49 Regression Models Professor William Greene Stern School of Business IOMS Department Department of Economics.
Copyright ©2011 Nelson Education Limited Linear Regression and Correlation CHAPTER 12.
Solutions to Tutorial 5 Problems Source Sum of Squares df Mean Square F-test Regression Residual Total ANOVA Table Variable.
Inference for regression - More details about simple linear regression IPS chapter 10.2 © 2006 W.H. Freeman and Company.
Simple Linear Regression ANOVA for regression (10.2)
1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Summarizing Bivariate Data Non-linear Regression Example.
Time Series Analysis – Chapter 6 Odds and Ends
Lack of Fit (LOF) Test A formal F test for checking whether a specific type of regression function adequately fits the data.
Copyright ©2011 Brooks/Cole, Cengage Learning Inference about Simple Regression Chapter 14 1.
1 Regression Analysis The contents in this chapter are from Chapters of the textbook. The cntry15.sav data will be used. The data collected 15 countries’
Lecture 10 Chapter 23. Inference for regression. Objectives (PSLS Chapter 23) Inference for regression (NHST Regression Inference Award)[B level award]
Multiple Regression I 1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 4 Multiple Regression Analysis (Part 1) Terry Dielman.
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Variable Selection 1 Chapter 8 Variable Selection Terry Dielman Applied Regression Analysis:
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Multiple Regression Model Building Statistics for Managers.
Inference for regression - More details about simple linear regression IPS chapter 10.2 © 2006 W.H. Freeman and Company.
Multiple Regression II 1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 4 Multiple Regression Analysis (Part 2) Terry Dielman.
Copyright © 2010 Pearson Education, Inc. Chapter 10 Re-expressing Data: Get it Straight!
David Housman for Math 323 Probability and Statistics Class 05 Ion Sensitive Electrodes.
Yandell – Econ 216 Chap 15-1 Chapter 15 Multiple Regression Model Building.
Chapter 15 Multiple Regression Model Building
Chapter 4: Basic Estimation Techniques
Chapter 4 Basic Estimation Techniques
Let’s Get It Straight! Re-expressing Data Curvilinear Regression
Basic Estimation Techniques
Basic Estimation Techniques
Regression Analysis Week 4.
CHAPTER 29: Multiple Regression*
Presentation transcript:

Fitting Curves to Data 1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 5: Fitting Curves to Data Terry Dielman Applied Regression Analysis: A Second Course in Business and Economic Statistics, fourth edition

Fitting Curves to Data 2 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 5.1 Introduction In Chapter 4, the model was presented as: where we assumed linear relationships between y and the x variables. In this chapter we find that this may not be true and consider curvilinear relationships between the variables.

Fitting Curves to Data 3 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Modeling  In general, we regress Y on some X which is not a linear function.  Common functions are X 2, 1/X or log(X)  In economics, sometimes regress log(y) on log(x)

Fitting Curves to Data 4 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 5.2 Fitting Curvilinear Relationships  Polynomial Regression – a common correction for nonlinearity is to add powers of the explanatory variable  In practice a second-order model is often sufficient to describe the relationship

Fitting Curves to Data 5 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Example 5.1: Telemarketing n = 20 telemarketing employees Y = average calls per day over 20 workdays X = Months on the job Data set TELEMARKET5

Fitting Curves to Data 6 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Plot of Calls versus Months There is an increase in calls with experience, but the rate of increase slows over time.

Fitting Curves to Data 7 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Fit of a First-Order Model  For comparison purposes, we first fit the linear equation and obtained: CALLS = MONTHS  This equation, which has an R 2 of 87.4%, implies that each month of experience leads to.7435 more calls per day.

Fitting Curves to Data 8 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Fitting a Second-Order Model

Fitting Curves to Data 9 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Regression Output Regression Analysis: CALLS versus MONTHS, MonthSQ The regression equation is CALLS = MONTHS MonthSQ Predictor Coef SE Coef T P Constant MONTHS MonthSQ S = R-Sq = 96.2% R-Sq(adj) = 95.8% Analysis of Variance Source DF SS MS F P Regression Residual Error Total

Fitting Curves to Data 10 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Hypothesis Test on  2 H 0 :  2 = 0 (Use the linear equation) H a :  2 ≠ 0 (Quadratic has improved fit) Test as usual with t = b 2 /SE(b 2 ) Here t = / = is significant with p-value =.000 Not surprising since R 2 increased 9%

Fitting Curves to Data 11 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Hypothesis Tests "Top Down"  The usual practice is to keep lower- order terms when a high-order term is significant.  In b 0 + b 1 x + b 2 x 2 we would retain the b 1 term even if it had an insignificant t-ratio, if the b 2 term was significant.

Fitting Curves to Data 12 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Higher and higher?  To see if the polynomial has even a higher order, we fit a cubic equation.  The table below shows the second- order model was sufficient. Model p for highest order term R2R2R2R2 Adj R 2 SeSeSeSe Linear %86.7%1.787 Quadratic %95.8%1.003 Cubic %95.7%1.020

Fitting Curves to Data 13 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Centering the X  When polynomial regression is used, multicollinearity often results because x and x 2 are correlated.  This can be eliminated by subtracting x-bar (the mean) from each x

Fitting Curves to Data 14 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Reciprocal Transformation of the x Variable  Another curvilinear relationship that is in common use is:  Here y and x are inversely related but the relationship is not linear.

Fitting Curves to Data 15 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Example 5.2  We are interested in the relationship between gas mileage and a car's horsepower.  An the next page is a plot of the highway mpg (HWYMPG) and horsepower (HP) for 147 cars listed in the October 2002 Road and Track.

Fitting Curves to Data 16 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Highway MPG versus Horsepower

Fitting Curves to Data 17 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Modeling the Relationship  A regression of HWYMPG on HP yields HWYMPG = HP with R 2 = 59.4%  This does not fit too well because as horsepower increases, mileage decreases, but the rate of decrease is slower for more-powerful cars.  Although other models, including a quadratic, might work, we regressed HWYMPG on 1/HP.

Fitting Curves to Data 18 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Regression Results The regression equation is HWYMPG = HPINV Predictor Coef SE Coef T P Constant HPINV S = R-Sq = 80.0% R-Sq(adj) = 79.9% Analysis of Variance Source DF SS MS F P Regression Residual Error Total

Fitting Curves to Data 19 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Data and Reciprocal Fit

Fitting Curves to Data 20 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Log Transformation of the x Variable  Yet another curvilinear equation is: where ln(x) is the natural logarithm of x.  It is assumed that the x values are positive because ln(0) is undefined.

Fitting Curves to Data 21 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Example 5.4 Fuel Consumption n = 51 (50 states plus Washington, D.C.) FUELCON = fuel consumption per capita POP = state population AREA = area of state in square miles POPDENS = population density Data Set FUELCON5

Fitting Curves to Data 22 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Plot of Fuelcon versus Density r = -.454

Fitting Curves to Data 23 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Effect of the Transformation  The graph has one point (D.C.) on the right with all others clumped to the left.  It is hard to see what type of relationship there is until some adjustments are made.  Here take the natural log of density to "pull" the extreme point back in.

Fitting Curves to Data 24 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Consumption versus Logdensity r = -.527

Fitting Curves to Data 25 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Linear and Log Regressions The regression equation is FUELCON = DENSITY Predictor Coef SE Coef T P Constant DENSITY S = R-Sq = 20.6% R-Sq(adj) = 19.0% The regression equation is FUELCON = 597 – 24.5 LOGDENS Predictor Coef SE Coef T P Constant LOGDENS S = R-Sq = 27.8% R-Sq(adj) = 26.3%

Fitting Curves to Data 26 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Log Transformations of Both the y and x Variables  Here the natural log of y is the dependent variable and the natural log of x is the independent variable:  Comparing results with other models may be difficult since we are not modeling y itself.  Economists sometimes use this to estimate price elasticity (y is demand and x price; b 1 is estimated elasticity).

Fitting Curves to Data 27 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Example 5.4 Imports and GDP The gross domestic product (GDP) and dollar amount of total imports (IMPORTS) for 25 countries was obtained from the World Fact Book. For both variables, low values clump together and higher values spread out, suggesting log transformations for both.

Fitting Curves to Data 28 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Scatterplot of Imports vs GDP

Fitting Curves to Data 29 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Scatterplot of LogImp vs LogGDP

Fitting Curves to Data 30 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Two Regression Models Two Regression Models Regression Analysis: IMPORTS versus GDP Predictor Coef SE Coef T P Constant GDP S = R-Sq = 87.2% R-Sq(adj) = 86.6% Regression Analysis: LogImp versus LogGDP Predictor Coef SE Coef T P Constant LogGDP S = R-Sq = 84.0% R-Sq(adj) = 83.4% Not directly comparable

Fitting Curves to Data 31 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. The R 2 Compare Different Things  The 87.2 % R 2 for the no-log model is the percentage of variation in Imports explained.  The 84.0% for the second model is the percentage of variation in ln(Imports) explained.  If you converted the fitted values of the second model back to Imports you might find the log model better.

Fitting Curves to Data 32 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. What Transformation to Use  It is probably best to try several.  A quadratic is most flexible because it uses two parameters to fit the relationship between to fit the relationship between y and x.  Some further analysis is in Chapter 6 where tests for nonlinearity are discussed.

Fitting Curves to Data 33 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Fitting Curved Trends If the data is collected over time, we may want to consider variations on the linear trend model of Chapter 3. Another is the S-Curve trend:

Fitting Curves to Data 34 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. S Curve Model Many products have a demand curve like this. 1.Initial demand increases slowly 2.As product matures, demand picks up and steadily grows. 3.At some saturation point demand levels off.

Fitting Curves to Data 35 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Exponential Growth Model Another alternative is an exponential trend: This can be fit by least squares if you model ln(y).