Published by Gerald Robbins. Modified over 9 years ago.
1
FORECASTING WITH REGRESSION MODELS: TREND ANALYSIS. BUSINESS FORECASTING. Prof. Dr. Burç Ülengin, ITU MANAGEMENT ENGINEERING FACULTY, FALL 2011
2
OVERVIEW: The bivariate regression model; Data inspection; Regression forecast process; Forecasting with simple linear trend; Causal regression model; Statistical evaluation of regression model; Examples...
3
The Bivariate Regression Model. The bivariate regression model is also known as the simple regression model. It is a statistical tool that estimates the relationship between a dependent variable (Y) and a single independent variable (X). The dependent variable is the variable we want to forecast.
4
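The Bivariate Regression Model. General form: $Y = f(X)$, where $Y$ is the dependent variable and $X$ is the independent variable. Specific form (linear regression model): $Y = \beta_0 + \beta_1 X + \varepsilon$, where $\varepsilon$ is the random disturbance.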
5
The Bivariate Regression Model. The regression model is a line equation. $\beta_1$ is the slope coefficient, which tells us the rate of change in Y per unit change in X. If $\beta_1 = 5$, a one-unit increase in X raises Y by 5 units. $\varepsilon$ is the random disturbance, which is why Y can take different values for a given X. The objective is to estimate $\beta_0$ and $\beta_1$ in such a way that the fitted values are as close as possible to the observed values.
6
The Bivariate Regression Model: Geometrical Representation. [Figure: scatter of Y against X with two candidate lines, one labelled "Poor fit" and one labelled "Good fit".] The red line is closer to the data points than the blue one.
7
Best Fit Estimates. Population model: $Y = \beta_0 + \beta_1 X + \varepsilon$. Sample model: $Y = b_0 + b_1 X + e$.
8
Best Fit Estimates-OLS
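The formulas on this slide did not survive extraction. As a minimal sketch, the standard textbook OLS estimates for the bivariate model are $b_1 = \sum (X - \bar{X})(Y - \bar{Y}) / \sum (X - \bar{X})^2$ and $b_0 = \bar{Y} - b_1 \bar{X}$, which minimise the sum of squared residuals. The function names and the data below are hypothetical and serve only as an illustration.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (b0, b1) that minimise the sum of squared residuals of y = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_residuals(x, y, b0, b1):
    """Sum of squared differences between actual and fitted values."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(x, y)
ssr_ols = sum_squared_residuals(x, y, b0, b1)
ssr_other = sum_squared_residuals(x, y, b0 + 0.5, b1)  # any other line fits worse
```

Any line other than the OLS line, such as the shifted one above, gives a larger sum of squared residuals on the same data.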
9
Misleading Best Fits. [Figure: four scatter plots of Y against X with fitted lines, annotated $\sum e^2 = 100$.]
10
THE CLASSICAL ASSUMPTIONS
1. The regression model is linear in the coefficients, correctly specified, and has an additive error term.
2. $E(\varepsilon) = 0$.
3. All explanatory variables are uncorrelated with the error term.
4. Errors corresponding to different observations are uncorrelated with each other.
5. The error term has a constant variance.
6. No explanatory variable is an exact linear function of any other explanatory variable(s).
7. The error term is normally distributed such that: $\varepsilon \sim N(0, \sigma^2)$.
11
Regression Forecasting Process. Data consideration: plot each variable over time and draw a scatter plot. Look at trend, seasonal fluctuation, and outliers. To forecast Y we need the forecasted value of X. Reserve a holdout period for evaluation, and test the estimated equation in the holdout period.
12
An Example: Retail Car Sales. The main explanatory variables: income; price of a car; interest rates (credit usage); general price level; population; car park (number of cars sold up to that time, which drives replacement purchases); expectations about the future. For the simple (bivariate) regression, income is chosen as the explanatory variable.
13
Bivariate Regression Model. Population regression model: $RCS_t = \beta_0 + \beta_1 DPI_t + \varepsilon_t$. Our expectation is $\beta_1 > 0$. However, we do not have all the data at hand; the data set covers only the 1990s, so we have to estimate the model over the sample period. Sample regression model: $RCS_t = b_0 + b_1 DPI_t + e_t$.
14
Retail Car Sales and Disposable Personal Income Figures. [Figure: quarterly car sales (000 cars) and disposable income ($) plotted over time.]
15
OLS Estimate
Dependent Variable: RCS
Method: Least Squares
Sample: 1990:1 1998:4
Included observations: 36
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          541010.9      746347.9     0.724878      0.4735
DPI        62.39428      40.00793     1.559548      0.1281
R-squared            0.066759    Mean dependent var      1704222.
Adjusted R-squared   0.039311    S.D. dependent var      164399.9
S.E. of regression   161136.1    Akaike info criterion   26.87184
Sum squared resid    8.83E+11    Schwarz criterion       26.95981
Log likelihood       -481.6931   F-statistic             2.432189
Durbin-Watson stat   1.596908    Prob(F-statistic)       0.128128
16
Basic Statistical Evaluation. $\beta_1$ is the slope coefficient; it tells us the rate of change in Y per unit change in X. When DPI increases by one dollar, the number of cars sold increases by about 62. Hypothesis test for $\beta_1$: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$. The t test is used to test the validity of $H_0$: $t = b_1 / se(b_1)$. If |t statistic| > t table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If |t statistic| < t table, or Pr > $\alpha$, do not reject $H_0$. Here t = 1.56 < t table and Pr = 0.128 > 0.05, so do not reject $H_0$: DPI has no statistically significant effect on RCS.
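The decision rule can be sketched in a few lines, using the DPI coefficient and standard error from the RCS regression output. The critical value 2.03 is an assumption: it is the approximate two-sided 5% table value for 34 degrees of freedom (36 observations minus 2 estimated coefficients). The function name is hypothetical.

```python
def t_test(coef, std_err, t_crit):
    """Return the t statistic and whether H0: beta = 0 is rejected."""
    t_stat = coef / std_err
    return t_stat, abs(t_stat) > t_crit


# Coefficient and standard error of DPI from the RCS regression.
# t_crit = 2.03 is an approximate table value (assumption, 34 d.f., 5% two-sided).
t_stat, reject = t_test(62.39428, 40.00793, t_crit=2.03)
print(f"t = {t_stat:.4f}, reject H0: {reject}")
```

The computed t statistic matches the t-Statistic column of the regression output, and the null hypothesis is not rejected.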
17
Basic Statistical Evaluation. $R^2$ is the coefficient of determination; it tells us the fraction of the variation in Y explained by X. $0 \le R^2 \le 1$: $R^2 = 0$ indicates that X (the equation) has no explanatory power; $R^2 = 1$ indicates that X (the equation) explains Y perfectly. $R^2 = 0.067$ indicates very weak explanatory power. Hypothesis test for $R^2$: $H_0: R^2 = 0$ versus $H_1: R^2 \neq 0$. The F test checks this hypothesis. If F statistic > F table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If F statistic < F table, or Pr > $\alpha$, do not reject $H_0$. Here F = 2.43 < F table and Pr = 0.128 > 0.05, so do not reject $H_0$: the estimated equation has no power to explain the RCS figures.
18
Graphical Evaluation of Fit and Error Terms. Residuals show a clear seasonal pattern.
19
Model Improvement. When we look at the graph of the series, RCS exhibits clear seasonal fluctuations, but DPI does not. Remove the seasonality using a seasonal adjustment method. Then use the seasonally adjusted RCS as the dependent variable.
20
Seasonal Adjustment
Sample: 1990:1 1998:4
Included observations: 36
Ratio to Moving Average
Original Series: RCS
Adjusted Series: RCSSA
Scaling Factors:
1   0.941503
2   1.119916
3   1.016419
4   0.933083
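A rough sketch of how multiplicative scaling factors of this kind can be obtained by the ratio-to-moving-average method: compute a centered moving average, take the ratio of each observation to it, average the ratios by quarter, and normalise so that the factors multiply to one (the four factors above also multiply to one). This is one common variant and an assumption here; the software that produced the slide may differ in detail. Function names and data are hypothetical, and the series is assumed to start in the first quarter.

```python
from math import prod  # Python 3.8+


def seasonal_factors(series, period=4):
    """Multiplicative seasonal factors by ratio to centered moving average (even period)."""
    half = period // 2
    ratios = {s: [] for s in range(period)}
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # Centered moving average: the two end points get half weight.
        cma = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        ratios[t % period].append(series[t] / cma)
    raw = [sum(ratios[s]) / len(ratios[s]) for s in range(period)]
    geo_mean = prod(raw) ** (1 / period)  # normalise so the factors multiply to one
    return [f / geo_mean for f in raw]


def seasonally_adjust(series, factors):
    """Divide each observation by the factor of its season."""
    period = len(factors)
    return [y / factors[t % period] for t, y in enumerate(series)]


# Hypothetical series: constant level 100 with a fixed quarterly pattern.
pattern = [0.9, 1.2, 1.0, 0.9]
series = [100.0 * pattern[t % 4] for t in range(12)]
factors = seasonal_factors(series)
adjusted = seasonally_adjust(series, factors)
```

On this artificial series the adjusted values are constant, which is what removing a purely seasonal pattern should produce.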
21
Seasonally Adjusted RCS and RCS
22
OLS Estimate
Dependent Variable: RCSSA
Method: Least Squares
Sample: 1990:1 1998:4
Included observations: 36
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          481394.3      464812.8     1.035674      0.3077
DPI        65.36559      24.91626     2.623411      0.0129
R-squared            0.168344    Mean dependent var      1700000.
Adjusted R-squared   0.143883    S.D. dependent var      108458.4
S.E. of regression   100352.8    Akaike info criterion   25.92472
Sum squared resid    3.42E+11    Schwarz criterion       26.01270
Log likelihood       -464.6450   F-statistic             6.882286
Durbin-Watson stat   0.693102    Prob(F-statistic)       0.012939
23
Basic Statistical Evaluation. $\beta_1$ is the slope coefficient; it tells us the rate of change in Y per unit change in X. When DPI increases by one dollar, the number of cars sold increases by about 65. Hypothesis test for $\beta_1$: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$. The t test is used to test the validity of $H_0$: $t = b_1 / se(b_1)$. If |t statistic| > t table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If |t statistic| < t table, or Pr > $\alpha$, do not reject $H_0$. Here t = 2.62 > t table and Pr = 0.013 < 0.05, so reject $H_0$: DPI has a statistically significant effect on seasonally adjusted RCS.
24
Basic Statistical Evaluation. $R^2$ is the coefficient of determination; it tells us the fraction of the variation in Y explained by X. $0 \le R^2 \le 1$: $R^2 = 0$ indicates that X (the equation) has no explanatory power; $R^2 = 1$ indicates that X (the equation) explains Y perfectly. $R^2 = 0.1683$ indicates weak explanatory power. Hypothesis test for $R^2$: $H_0: R^2 = 0$ versus $H_1: R^2 \neq 0$. The F test checks this hypothesis. If F statistic > F table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If F statistic < F table, or Pr > $\alpha$, do not reject $H_0$. Here F = 6.88 > F table and Pr = 0.013 < 0.05, so reject $H_0$: the estimated equation has some power to explain the RCS figures.
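The F statistic reported in the regression output can be recovered from $R^2$ alone, using the standard relation $F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)}$, where $n$ is the number of observations and $k$ the number of explanatory variables. The sketch below checks this for the two car sales regressions; the function name is hypothetical.

```python
def f_statistic(r_squared, n_obs, n_regressors):
    """F statistic for H0: R^2 = 0, computed from R^2, n and k."""
    explained = r_squared / n_regressors
    unexplained = (1 - r_squared) / (n_obs - n_regressors - 1)
    return explained / unexplained


f_rcs = f_statistic(0.066759, n_obs=36, n_regressors=1)    # original RCS regression
f_rcssa = f_statistic(0.168344, n_obs=36, n_regressors=1)  # seasonally adjusted RCS
print(f"F(RCS) = {f_rcs:.3f}, F(RCSSA) = {f_rcssa:.3f}")
```

With a single explanatory variable the F statistic is also the square of the slope's t statistic, which is why the t test and the F test give the same Prob. value in both outputs.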
25
Graphical Evaluation of Fit and Error Terms. No seasonality, but the residuals still do not look like a random disturbance. Omitted variable? Business cycle?
26
Trend Models
27
Simple Regression Model, Special Case: Trend Model. $Y_t = \beta_0 + \beta_1 t + \varepsilon_t$. The independent variable is time, t = 1, 2, 3, ..., T-1, T. There is no need to forecast the independent variable. Using simple transformations, a variety of nonlinear trend equations can be estimated, so the estimated model can mimic the pattern of the data.
28
Suitable Data Pattern. [Figure: grid of data patterns. Columns: NO SEASONALITY, ADDITIVE SEASONALITY, MULTIPLICATIVE SEASONALITY. Rows: NO TREND, ADDITIVE TREND, MULTIPLICATIVE TREND.]
29
Chapter 3 Exercise 13: College Tuition Consumers' Price Index by Quarter. [Figure: time-series plot of the index, with the holdout period marked.]
30
OLS Estimates
Dependent Variable: FEE
Method: Least Squares
Sample: 1986:1 1994:4
Included observations: 36
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          115.7312      1.982166     58.38624      0.0000
@TREND     3.837580      0.097399     39.40080      0.0000
R-squared            0.978568    Mean dependent var      182.8889
Adjusted R-squared   0.977938    S.D. dependent var      40.87177
S.E. of regression   6.070829    Akaike info criterion   6.498820
Sum squared resid    1253.069    Schwarz criterion       6.586793
Log likelihood       -114.9788   F-statistic             1552.423
Durbin-Watson stat   0.284362    Prob(F-statistic)       0.000000
31
Basic Statistical Evaluation. $\beta_1$ is the slope coefficient; it tells us the rate of change in Y per unit change in X. Each quarter the tuition index increases by about 3.84 points. Hypothesis test for $\beta_1$: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$. The t test is used to test the validity of $H_0$: $t = b_1 / se(b_1)$. If |t statistic| > t table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If |t statistic| < t table, or Pr > $\alpha$, do not reject $H_0$. Here t = 39.4 > t table and Pr = 0.0000 < 0.05, so reject $H_0$.
32
Basic Statistical Evaluation. $R^2$ is the coefficient of determination; it tells us the fraction of the variation in Y explained by X. $0 \le R^2 \le 1$: $R^2 = 0$ indicates that X (the equation) has no explanatory power; $R^2 = 1$ indicates that X (the equation) explains Y perfectly. $R^2 = 0.9785$ indicates very strong explanatory power. Hypothesis test for $R^2$: $H_0: R^2 = 0$ versus $H_1: R^2 \neq 0$. The F test checks this hypothesis. If F statistic > F table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If F statistic < F table, or Pr > $\alpha$, do not reject $H_0$. Here F = 1552 > F table and Pr = 0.0000 < 0.05, so reject $H_0$: the estimated equation has explanatory power.
33
Graphical Evaluation of Fit: Holdout Period
Period    ACTUAL   FORECAST
1995 Q1   260.00   253.88
1995 Q2   259.00   257.72
1995 Q3   266.00   261.55
1995 Q4   274.00   265.39
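The holdout forecasts can be reproduced, up to rounding, from the estimated linear trend equation FEE = 115.7312 + 3.837580 × @TREND. The sketch assumes that @TREND equals 0 in 1986:1, so that 1995:1 corresponds to t = 36; this convention is inferred from the numbers and is not stated on the slides. Variable names are hypothetical.

```python
INTERCEPT = 115.7312  # C from the FEE trend regression
SLOPE = 3.837580      # @TREND coefficient


def trend_forecast(t):
    """Linear trend forecast for trend index t (assumed 0 at 1986:1)."""
    return INTERCEPT + SLOPE * t


actual = [260.0, 259.0, 266.0, 274.0]                 # 1995 Q1 to Q4
forecast = [trend_forecast(t) for t in range(36, 40)]
errors = [a - f for a, f in zip(actual, forecast)]
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
print([round(f, 2) for f in forecast], round(rmse, 2))
```

All four holdout errors are positive, so the linear trend under-forecasts the index throughout 1995.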
34
Graphical Evaluation of Fit and Error Terms. Residuals exhibit a clear pattern; they are not random. Also, the seasonal fluctuations cannot be modelled. The regression model is misspecified.
35
Model Improvement. The data may exhibit an exponential trend. In this case: take the logarithm of the dependent variable; estimate the trend by OLS; after OLS estimation, forecast the holdout period; take the exponential of the logarithmic forecasts to return to the original units.
36
Suitable Data Pattern. [Figure: grid of data patterns. Columns: NO SEASONALITY, ADDITIVE SEASONALITY, MULTIPLICATIVE SEASONALITY. Rows: NO TREND, ADDITIVE TREND, MULTIPLICATIVE TREND.]
37
Original and Logarithmic Transformed Data
LOG(FEE)   FEE
4.844187   127.000
4.867534   130.000
4.912655   136.000
4.919981   137.000
4.941642   140.000
4.976734   145.000
4.983607   146.000
38
OLS Estimate of the Logarithmic Trend Model
Dependent Variable: LFEE
Method: Least Squares
Sample: 1986:1 1994:4
Included observations: 36
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          4.816708      0.005806     829.5635      0.0000
@TREND     0.021034      0.000285     73.72277      0.0000
R-squared            0.993783    Mean dependent var      5.184797
Adjusted R-squared   0.993600    S.D. dependent var      0.222295
S.E. of regression   0.017783    Akaike info criterion   -5.167178
Sum squared resid    0.010752    Schwarz criterion       -5.079205
Log likelihood       95.00921    F-statistic             5435.047
Durbin-Watson stat   0.893477    Prob(F-statistic)       0.000000
39
Forecast Calculations
obs      FEE        LFEEF      FEELF=exp(LFEEF)
1993:1   228.0000   5.405651   222.6610
1993:2   228.0000   5.426684   227.3940
1993:3   235.0000   5.447718   232.2276
1993:4   243.0000   5.468751   237.1639
1994:1   244.0000   5.489785   242.2052
1994:2   245.0000   5.510819   247.3536
1994:3   251.0000   5.531852   252.6114
1994:4   259.0000   5.552886   257.9810
1995:1   260.0000   5.573920   263.4648
1995:2   259.0000   5.594953   269.0651
1995:3   266.0000   5.615987   274.7845
1995:4   274.0000   5.637021   280.6254
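The last column of the table follows from the logarithmic trend equation LFEE = 4.816708 + 0.021034 × @TREND by taking the exponential of the fitted log value. A small sketch, again assuming @TREND is 0 in 1986:1 (so 1995:1 is t = 36); small differences from the table come from the rounded coefficients. Names are hypothetical.

```python
from math import exp

LOG_INTERCEPT = 4.816708  # C from the LFEE trend regression
LOG_SLOPE = 0.021034      # @TREND coefficient


def log_trend_forecast(t):
    """Forecast in original units: exponential of the fitted log trend."""
    return exp(LOG_INTERCEPT + LOG_SLOPE * t)


forecast_1995 = [log_trend_forecast(t) for t in range(36, 40)]
quarterly_growth = exp(LOG_SLOPE) - 1  # implied constant growth rate per quarter
print([round(f, 2) for f in forecast_1995], f"{quarterly_growth:.2%}")
```

The slope of the log model implies a constant growth rate of roughly 2.1% per quarter, which is why the forecasts grow faster than those of the linear trend.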
40
Graphical Evaluation of Fit and Error Terms. Residuals exhibit a clear pattern; they are not random. Also, the seasonal fluctuations cannot be modelled. The regression model is misspecified.
41
Model Improvement. To deal with seasonal variation: remove the seasonal pattern from the data; fit the regression model to the seasonally adjusted data; generate forecasts; add the seasonal movements back to the forecasted values.
42
Suitable Data Pattern. [Figure: grid of data patterns. Columns: NO SEASONALITY, ADDITIVE SEASONALITY, MULTIPLICATIVE SEASONALITY. Rows: NO TREND, ADDITIVE TREND, MULTIPLICATIVE TREND.]
43
Multiplicative Seasonal Adjustment
Included observations: 40
Ratio to Moving Average
Original Series: FEE
Adjusted Series: FEESA
Scaling Factors:
1   1.002372
2   0.985197
3   0.996746
4   1.015929
44
Original and Seasonally Adjusted Data
45
OLS Estimate of the Seasonally Adjusted Trend Model
Dependent Variable: FEESA
Method: Least Squares
Sample: 1986:1 1995:4
Included observations: 40
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          115.0387      1.727632     66.58749      0.0000
@TREND     3.897488      0.076240     51.12152      0.0000
R-squared            0.985668    Mean dependent var      191.0397
Adjusted R-squared   0.985291    S.D. dependent var      45.89346
S.E. of regression   5.566018    Akaike info criterion   6.319943
Sum squared resid    1177.261    Schwarz criterion       6.404387
Log likelihood       -124.3989   F-statistic             2613.410
Durbin-Watson stat   0.055041    Prob(F-statistic)       0.000000
46
Graphical Evaluation of Fit and Error Terms. Residuals exhibit a clear pattern; they are not random. There are no seasonal fluctuations. The regression model is misspecified.
47
Model Improvement. Take the logarithm to remove the existing nonlinearity; apply additive seasonal adjustment to the logarithmic data; apply OLS to the seasonally adjusted logarithmic data; forecast the holdout period; add the seasonal movements back to obtain seasonal forecasts; take the exponential to obtain the seasonal forecasts in the original units.
48
Suitable Data Pattern. [Figure: grid of data patterns. Columns: NO SEASONALITY, ADDITIVE SEASONALITY, MULTIPLICATIVE SEASONALITY. Rows: NO TREND, ADDITIVE TREND, MULTIPLICATIVE TREND.]
49
Logarithmic Transformation and Additive Seasonal Adjustment
Sample: 1986:1 1995:4
Included observations: 40
Difference from Moving Average
Original Series: LFEE=log(FEE)
Adjusted Series: LFEESA
Scaling Factors:
1    0.002216
2   -0.014944
3   -0.003099
4    0.015828
50
Original and Logarithmic Additive Seasonally Adjusted Series
51
OLS Estimate of the Logarithmic Additive Seasonally Adjusted Data
Dependent Variable: LFEESA
Method: Least Squares
Sample: 1986:1 1995:4
Included observations: 40
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          4.822122      0.004761     1012.779      0.0000
@TREND     0.020618      0.000210     98.12760      0.0000
R-squared            0.996069    Mean dependent var      5.224171
Adjusted R-squared   0.995966    S.D. dependent var      0.241508
S.E. of regression   0.015340    Akaike info criterion   -5.468039
Sum squared resid    0.008942    Schwarz criterion       -5.383595
Log likelihood       111.3608    F-statistic             9629.026
Durbin-Watson stat   0.149558    Prob(F-statistic)       0.000000
52
Graphical Evaluation of Fit and Error Terms. Residuals exhibit a clear pattern; they are not random. There are no seasonal fluctuations. The regression model is misspecified.
53
Autoregressive Model. In some cases the growth model may be more suitable for the data. If the data exhibit nonlinearity, the autoregressive model can be adapted to capture an exponential pattern.
54
OLS Estimate of Autoregressive Model
Dependent Variable: FEE
Method: Least Squares
Sample(adjusted): 1986:2 1995:4
Included observations: 39 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          0.739490      2.305654     0.320729      0.7502
FEE(-1)    1.016035      0.011884     85.49718      0.0000
R-squared            0.994964    Mean dependent var      192.7179
Adjusted R-squared   0.994828    S.D. dependent var      45.45787
S.E. of regression   3.269285    Akaike info criterion   5.256940
Sum squared resid    395.4643    Schwarz criterion       5.342251
Log likelihood       -100.5103   F-statistic             7309.767
Durbin-Watson stat   1.888939    Prob(F-statistic)       0.000000
55
Graphical Evaluation of Fit and Error Terms Clear seasonal pattern Model is misspecified
56
Model Improvement. To remove seasonal fluctuations: seasonally adjust the data; apply OLS to the autoregressive trend model; forecast the seasonally adjusted data; add the seasonal movement back to the forecasted values.
57
OLS Estimate of Seasonally Adjusted Autoregressive Model
Dependent Variable: FEESA
Method: Least Squares
Sample(adjusted): 1986:2 1995:4
Included observations: 39 after adjusting endpoints
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.125315      0.811481     1.386743      0.1738
FEESA(-1)   1.013445      0.004181     242.4027      0.0000
R-squared            0.999371    Mean dependent var      192.6894
Adjusted R-squared   0.999354    S.D. dependent var      45.27587
S.E. of regression   1.151024    Akaike info criterion   3.169101
Sum squared resid    49.01968    Schwarz criterion       3.254412
Log likelihood       -59.79748   F-statistic             58759.08
Durbin-Watson stat   1.335932    Prob(F-statistic)       0.000000
58
Graphical Evaluation of Fit and Error Terms. No seasonal pattern in the residuals. The model specification seems more correct than the previous estimates.
59
Seasonal Autoregressive Model. If the data exhibit seasonal fluctuations, the growth model should be remodeled. If the data exhibit nonlinearity and seasonality together, the seasonal autoregressive model can be adapted to capture the exponential pattern.
60
New Product Forecasting: Growth Curve Fitting. For new products, the main problem is typically a lack of historical data; trend or seasonal patterns cannot be determined. Forecasters can use a number of models that generally fall into the category called diffusion models. These models are alternatively called S-curves, growth models, saturation models, or substitution curves. These models imitate the life cycle of products. Life cycles follow a common pattern: (1) a period of slow growth just after the introduction of the new product; (2) a period of rapid growth; (3) slowing growth in a mature phase; (4) decline.
61
New Product Forecasting: Growth Curve Fitting. Growth models have their own lower and upper limits. A significant benefit of using diffusion models is to identify and predict the timing of the four phases of the life cycle. The transition from very slow initial growth to rapid growth is often the result of solutions to technical difficulties and the market's acceptance of the new product or technology. There are upper limits, and a maturity phase occurs in which growth slows and finally ceases.
62
GOMPERTZ CURVE. The Gompertz function is given as $Y_t = L e^{-a e^{-bt}}$, where L = upper limit of Y, e = natural number = 2.718282..., and a and b = coefficients describing the curve. The Gompertz curve ranges in value from zero to L as t varies from $-\infty$ to $+\infty$. The Gompertz curve is a way to summarize growth with a few parameters.
63
GOMPERTZ CURVE: An Example. HDTV: LCD and plasma TV sales figures
YEAR   HDTV
2000   1200
2001   1500
2002   1770
2003   3350
2004   5500
2005   9700
2006   15000
64
GOMPERTZ CURVE An Example
65
LOGISTICS CURVE. The logistic function is given as $Y_t = \dfrac{L}{1 + a e^{-bt}}$, where L = upper limit of Y, e = natural number = 2.718282..., and a and b = coefficients describing the curve. The logistic curve ranges in value from zero to L as t varies from $-\infty$ to $+\infty$. The logistic curve is symmetric about its point of inflection. The Gompertz curve is not necessarily symmetric.
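The two curves can be compared numerically. The sketch below assumes the commonly used functional forms $Y_t = L e^{-a e^{-bt}}$ (Gompertz) and $Y_t = L / (1 + a e^{-bt})$ (logistic), since the slide equations did not survive extraction; the parameter values are hypothetical. Under these forms both curves have their inflection point at $t^* = \ln(a)/b$, where the logistic curve is at $L/2$ and the Gompertz curve at $L/e$, which is why only the logistic curve is symmetric about it.

```python
from math import exp, log


def gompertz(t, L, a, b):
    """Gompertz curve (assumed form): L * exp(-a * exp(-b * t))."""
    return L * exp(-a * exp(-b * t))


def logistic(t, L, a, b):
    """Logistic curve (assumed form): L / (1 + a * exp(-b * t))."""
    return L / (1 + a * exp(-b * t))


# Hypothetical parameters, not estimated from the HDTV data.
L, a, b = 100.0, 50.0, 0.5
t_star = log(a) / b  # inflection point of both curves

logistic_at_inflection = logistic(t_star, L, a, b)  # equals L / 2
gompertz_at_inflection = gompertz(t_star, L, a, b)  # equals L / e
for t in (0, 5, 10, 20):
    print(t, round(gompertz(t, L, a, b), 2), round(logistic(t, L, a, b), 2))
```

Fitting a, b and L to observed sales would require a nonlinear estimation routine, which is outside the scope of this sketch.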
66
LOGISTICS or GOMPERTZ CURVES? The answer lies in whether, in a particular situation, it is easier to achieve the maximum value the closer you get to it, or whether it becomes more difficult to attain the maximum value the closer you get to it. Are there factors assisting the attainment of the maximum value once you get close to it, or are there factors preventing the attainment of the maximum value once it is nearly attained? If there is an offsetting factor such that growth is more difficult to maintain as the maximum is approached, then the Gompertz curve will be the best choice. If there are no such offsetting factors hindering the attainment of the maximum value, the logistics curve will be the best choice.
67
LOGISTICS CURVE: An Example. HDTV: LCD and plasma TV sales figures
YEAR   HDTV
2000   1200
2001   1500
2002   1770
2003   3350
2004   5500
2005   9700
2006   15000
68
LOGISTICS versus GOMPERTZ CURVES
69
FORECASTING WITH MULTIPLE REGRESSION MODELS BUSINESS FORECASTING
70
CONTENT: DEFINITION; INDEPENDENT VARIABLE SELECTION; FORECASTING WITH MULTIPLE REGRESSION MODEL; STATISTICAL EVALUATION OF THE MODEL; SERIAL CORRELATION; SEASONALITY TREATMENT; GENERAL AUTOREGRESSIVE MODEL; ADVICE; EXAMPLES...
71
MULTIPLE REGRESSION MODEL. THE DEPENDENT VARIABLE, Y, IS A FUNCTION OF MORE THAN ONE INDEPENDENT VARIABLE, $X_1, X_2, \ldots, X_k$: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$
72
SELECTING INDEPENDENT VARIABLES. FIRST, DETERMINE THE DEPENDENT VARIABLE. SEARCH THE LITERATURE, USE COMMON SENSE, AND LIST THE MAIN POTENTIAL EXPLANATORY VARIABLES. IF TWO VARIABLES SHARE THE SAME INFORMATION, SUCH AS GDP AND GNP, SELECT THE MOST RELEVANT ONE. IF A VARIABLE SHOWS VERY LITTLE VARIATION, FIND A MORE VARIABLE ONE. SET THE EXPECTED SIGNS OF THE PARAMETERS TO BE ESTIMATED.
73
AN EXAMPLE: SELECTING INDEPENDENT VARIABLES. LIQUID PETROLEUM GAS (LPG) MARKET SIZE FORECAST. POTENTIAL EXPLANATORY VARIABLES: POPULATION; PRICE; URBANIZATION RATIO; GNP OR GDP; EXPECTATIONS.
74
PARAMETER ESTIMATES-OLS ESTIMATION IT IS VERY COMPLEX TO CALCULATE b’s, MATRIX ALGEBRA IS USED TO ESTIMATE b’s.
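In matrix notation the OLS estimates solve the normal equations $(X'X)\,b = X'y$, where X contains a column of ones and one column per explanatory variable. The sketch below builds and solves these equations in plain Python by Gaussian elimination; it is an illustration under these textbook assumptions, not the routine used by any particular package, and the function names and data are hypothetical.

```python
def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols_multiple(X, y):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend the constant term
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 - 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in X]
b = ols_multiple(X, y)
```

Because the example data contain no noise, the recovered coefficients equal the generating ones up to floating-point error.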
75
FORECASTING WITH MULTIPLE REGRESSION MODEL
Ln(SALES_t) = 23 + 1.24*Ln(GDP_t) - 0.90*Ln(PRICE_t)
IF GDP INCREASES 1%, SALES INCREASE 1.24%. IF PRICE INCREASES 1%, SALES DECREASE 0.9%.
PERIOD   GDP    PRICE   SALES
100      1245   100     230
101      1300   103     ?
Ln(SALES_t) = 23 + 1.24*Ln(1300) - 0.90*Ln(103)
Ln(SALES_t) = 3.63
e^3.63 = 235
76
EXAMPLE : LPG FORECAST
77
LOGARITHMIC TRANSFORMATION
78
SCATTER DIAGRAM UNEXPECTED RELATION
79
LSATA=f(LGNP)
Dependent Variable: LSATA
Method: Least Squares
Sample: 1968 1997
Included observations: 30
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -44.91150     3.097045     -14.50140     0.0000
LGNP       4.081938      0.220265     18.53195      0.0000
R-squared            0.924616    Mean dependent var      12.47858
Adjusted R-squared   0.921924    S.D. dependent var      0.736099
S.E. of regression   0.205681    Akaike info criterion   -0.260637
Sum squared resid    1.184535    Schwarz criterion       -0.167224
Log likelihood       5.909555    F-statistic             343.4333
Durbin-Watson stat   0.485414    Prob(F-statistic)       0.000000
80
Graphical Evaluation of Fit and Error Terms NOT RANDOM
81
LSATA=f(LP)
Dependent Variable: LSATA
Method: Least Squares
Sample(adjusted): 1969 1997
Included observations: 29 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          11.70726      0.081886     142.9694      0.0000
LP         0.190128      0.015096     12.59492      0.0000
R-squared            0.854551    Mean dependent var      12.53724
Adjusted R-squared   0.849164    S.D. dependent var      0.674006
S.E. of regression   0.261768    Akaike info criterion   0.223756
Sum squared resid    1.850107    Schwarz criterion       0.318052
Log likelihood       -1.244459   F-statistic             158.6319
Durbin-Watson stat   0.187322    Prob(F-statistic)       0.000000
82
Graphical Evaluation of Fit and Error Terms NOT RANDOM
83
LSATA=f(LGNP,LP)
Dependent Variable: LSATA
Method: Least Squares
Sample(adjusted): 1969 1997
Included observations: 29 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -30.80841     7.715902     -3.992846     0.0005
LGNP       3.066655      0.556533     5.510284      0.0000
LP         0.045318      0.028281     1.602436      0.1211
R-squared            0.932905    Mean dependent var      12.53724
Adjusted R-squared   0.927744    S.D. dependent var      0.674006
S.E. of regression   0.181176    Akaike info criterion   -0.480999
Sum squared resid    0.853443    Schwarz criterion       -0.339555
Log likelihood       9.974488    F-statistic             180.7558
Durbin-Watson stat   0.364799    Prob(F-statistic)       0.000000
84
Graphical Evaluation of Fit and Error Terms NOT RANDOM
85
WHAT IS MISSING? GNP AND PRICE ARE THE MOST IMPORTANT VARIABLES, BUT THE COEFFICIENT OF PRICE IS NOT SIGNIFICANT AND HAS AN UNEXPECTED SIGN. THE RESIDUAL DISTRIBUTION IS NOT RANDOM. WHAT IS MISSING? WRONG FUNCTION (NONLINEAR MODEL)? LACK OF DYNAMIC MODELLING? A MISSING IMPORTANT VARIABLE, SUCH AS POPULATION?
86
LSATA=f(LGNP,LP,LPOP)
Dependent Variable: LSATA
Method: Least Squares
Sample(adjusted): 1969 1997
Included observations: 29 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -50.91342     3.992134     -12.75343     0.0000
LGNP       0.755445      0.337894     2.235746      0.0345
LP         -0.131508     0.021528     -6.108568     0.0000
LPOP       4.955945      0.486887     10.17885      0.0000
R-squared            0.986958    Mean dependent var      12.53724
Adjusted R-squared   0.985393    S.D. dependent var      0.674006
S.E. of regression   0.081461    Akaike info criterion   -2.049934
Sum squared resid    0.165899    Schwarz criterion       -1.861342
Log likelihood       33.72405    F-statistic             630.6084
Durbin-Watson stat   0.398661    Prob(F-statistic)       0.000000
87
Graphical Evaluation of Fit and Error Terms NOT RANDOM
88
WHAT IS MISSING? GNP, POPULATION AND PRICE ARE THE MOST IMPORTANT VARIABLES. THEY ARE SIGNIFICANT AND THEY HAVE THE EXPECTED SIGNS, BUT THE RESIDUAL DISTRIBUTION IS NOT RANDOM. WHAT IS MISSING? WRONG FUNCTION (NONLINEAR MODEL)? LACK OF DYNAMIC MODELLING? YES. A MISSING IMPORTANT VARIABLE? YES, URBANIZATION.
89
LSATA=f(LGNP,LP,LPOP,LSATA(-1))
Dependent Variable: LSATA
Method: Least Squares
Sample(adjusted): 1969 1997
Included observations: 29 after adjusting endpoints
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -16.18591     3.832897     -4.222893     0.0003
LGNP        0.523657      0.150971     3.468585      0.0020
LP          -0.033964     0.013483     -2.518934     0.0188
LPOP        1.279753      0.419566     3.050182      0.0055
LSATA(-1)   0.619986      0.060756     10.20446      0.0000
R-squared            0.997557    Mean dependent var      12.53724
Adjusted R-squared   0.997150    S.D. dependent var      0.674006
S.E. of regression   0.035983    Akaike info criterion   -3.655968
Sum squared resid    0.031074    Schwarz criterion       -3.420227
Log likelihood       58.01154    F-statistic             2450.048
Durbin-Watson stat   2.118752    Prob(F-statistic)       0.000000
90
Graphical Evaluation of Fit and Error Terms RANDOM
91
Basic Statistical Evaluation. $\beta_1$ is the slope coefficient; it tells us the rate of change in Y per unit change in X. When GNP increases by 1%, the volume of LPG sales increases by 0.52%. Hypothesis test for $\beta_1$: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$. The t test is used to test the validity of $H_0$: $t = b_1 / se(b_1)$. If |t statistic| > t table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If |t statistic| < t table, or Pr > $\alpha$, do not reject $H_0$. Here t = 3.47 > t table and Pr = 0.002 < 0.05, so reject $H_0$: GNP has a statistically significant effect on LPG sales.
92
Basic Statistical Evaluation. $R^2$ is the coefficient of determination; it tells us the fraction of the variation in Y explained by the explanatory variables. $0 \le R^2 \le 1$: $R^2 = 0$ indicates that the equation has no explanatory power; $R^2 = 1$ indicates that the equation explains Y perfectly. $R^2 = 0.9975$ indicates very strong explanatory power. Hypothesis test for $R^2$: $H_0: R^2 = 0$ versus $H_1: R^2 \neq 0$. The F test checks this hypothesis. If F statistic > F table, reject $H_0$; equivalently, if Pr < $\alpha$ (e.g. $\alpha = 0.05$), reject $H_0$. If F statistic < F table, or Pr > $\alpha$, do not reject $H_0$. Here F = 2450 > F table and Pr = 0.0000 < 0.05, so reject $H_0$: the estimated equation has power to explain the LPG sales figures.
93
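SHORT AND LONG TERM IMPACTS. If we specify a dynamic model, we can estimate the short-term and long-term impacts of the independent variables on the dependent variable simultaneously. For the dynamic model $Y_t = \beta_0 + \beta_1 X_t + \lambda Y_{t-1} + \varepsilon_t$: short-term effect of x = $\beta_1$; long-term effect of x = $\beta_1 / (1 - \lambda)$.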
94
AN EXAMPLE: SHORT AND LONG TERM IMPACTS
Variable   Short Term Impact   Long Term Impact
LGNP       0.523657            1.3778
LP         -0.033964           -0.0892
LPOP       1.279753            3.3657
IF GNP INCREASES 1% AT TIME t, LPG SALES INCREASE 0.52% AT TIME t. IN THE LONG RUN, WITHIN 3-5 YEARS, LPG SALES INCREASE 1.38%.
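The long-term impacts follow from dividing each short-term coefficient by one minus the coefficient of the lagged dependent variable, LSATA(-1) = 0.619986 in the dynamic LPG regression. A minimal sketch of that calculation (variable names are hypothetical):

```python
LAG_COEF = 0.619986  # coefficient of LSATA(-1) in the dynamic LPG model

short_run = {"LGNP": 0.523657, "LP": -0.033964, "LPOP": 1.279753}

# Long-run impact = short-run impact / (1 - coefficient of the lagged dependent variable)
long_run = {name: beta / (1 - LAG_COEF) for name, beta in short_run.items()}

for name in short_run:
    print(f"{name}: short run {short_run[name]:.4f}, long run {long_run[name]:.4f}")
```

The results agree with the table above to about two decimal places; the small differences presumably come from rounding in the original calculation.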
95
SEASONALITY AND MULTIPLE REGRESSION MODEL. SEASONAL DUMMY VARIABLES CAN BE USED TO MODEL SEASONAL PATTERNS. A DUMMY VARIABLE IS A BINARY VARIABLE THAT ONLY TAKES THE VALUES 0 AND 1. DUMMY VARIABLES ARE INDICATOR VARIABLES: IF THE DUMMY VARIABLE TAKES THE VALUE 1 IN A GIVEN PERIOD, IT MEANS THAT SOMETHING HAPPENS IN THAT PERIOD.
96
SEASONAL DUMMY VARIABLES. THE "SOMETHING" CAN BE A SPECIFIC SEASON; THE DUMMY VARIABLE THEN INDICATES THAT SEASON. D1 IS A DUMMY VARIABLE WHICH INDICATES THE FIRST QUARTERS:
DATE     D1
1990Q1   1
1990Q2   0
1990Q3   0
1990Q4   0
1991Q1   1
1991Q2   0
1991Q3   0
1991Q4   0
1992Q1   1
1992Q2   0
1992Q3   0
1992Q4   0
97
FULL SEASONAL DUMMY VARIABLE REPRESENTATION. BASE PERIOD: Q4 (ALL DUMMIES ZERO)
DATE      D1   D2   D3
1990 Q1   1    0    0
1990 Q2   0    1    0
1990 Q3   0    0    1
1990 Q4   0    0    0
1991 Q1   1    0    0
1991 Q2   0    1    0
1991 Q3   0    0    1
1991 Q4   0    0    0
1992 Q1   1    0    0
1992 Q2   0    1    0
1992 Q3   0    0    1
1992 Q4   0    0    0
98
COLLEGE TUITION CONSUMERS' PRICE INDEX BY QUARTER
99
QUARTERLY DATA, THEREFORE 3 DUMMY VARIABLES WILL BE SUFFICIENT TO CAPTURE THE SEASONAL PATTERN
DATE      D1   D2   D3
1990 Q1   1    0    0
1990 Q2   0    1    0
1990 Q3   0    0    1
1990 Q4   0    0    0
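A short sketch of how such a dummy table can be generated for quarterly data. One quarter is left out as the base period (the fourth quarter here, as in the table) so that the dummies are not an exact linear function of the constant term. The function name is hypothetical.

```python
def seasonal_dummies(quarters, base=4):
    """Return one row of 0/1 dummies per observation, omitting the base quarter."""
    seasons = [q for q in range(1, 5) if q != base]
    return [[1 if q == s else 0 for s in seasons] for q in quarters]


# Quarter numbers for 1990 Q1 to 1991 Q4.
quarters = [1, 2, 3, 4, 1, 2, 3, 4]
rows = seasonal_dummies(quarters)
for q, (d1, d2, d3) in zip(quarters, rows):
    print(f"Q{q}: D1={d1} D2={d2} D3={d3}")
```

The coefficient of each dummy then measures how far that quarter sits above or below the base quarter.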
100
SEASONAL PATTERN MODELLED: COLLEGE TUITION PRICE INDEX TREND ESTIMATION
Dependent Variable: LOG(FEE)
Method: Least Squares
Sample(adjusted): 1986:3 1995:4
Included observations: 38 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          4.832335      0.006948     695.4771      0.0000
@TREND     0.020780      0.000232     89.57105      0.0000
D1         -0.011259     0.007202     -1.563344     0.1275
D1(-1)     -0.029526     0.007198     -4.101948     0.0003
D1(-2)     -0.017082     0.007010     -2.436806     0.0204
R-squared            0.995921    Mean dependent var      5.244170
Adjusted R-squared   0.995427    S.D. dependent var      0.231661
S.E. of regression   0.015666    Akaike info criterion   -5.352558
Sum squared resid    0.008099    Schwarz criterion       -5.137087
Log likelihood       106.6986    F-statistic             2014.429
Durbin-Watson stat   0.161634    Prob(F-statistic)       0.000000
101
Graphical Evaluation of Fit and Error Terms NOT RANDOM
102
COLLEGE TUITION PRICE INDEX AUTOREGRESSIVE TREND ESTIMATION
Dependent Variable: LOG(FEE)
Method: Least Squares
Sample(adjusted): 1986:3 1995:4
Included observations: 38 after adjusting endpoints
Variable       Coefficient   Std. Error   t-Statistic   Prob.
C              0.050887      0.022969     2.215524      0.0337
LOG(FEE(-1))   0.997510      0.004375     227.9958      0.0000
D1             -0.031634     0.002833     -11.16704     0.0000
D1(-1)         -0.035335     0.002833     -12.47301     0.0000
D1(-2)         -0.006775     0.002761     -2.454199     0.0196
R-squared            0.999368    Mean dependent var      5.244170
Adjusted R-squared   0.999292    S.D. dependent var      0.231661
S.E. of regression   0.006165    Akaike info criterion   -7.217678
Sum squared resid    0.001254    Schwarz criterion       -7.002206
Log likelihood       142.1359    F-statistic             13051.60
Durbin-Watson stat   1.605178    Prob(F-statistic)       0.000000
103
Graphical Evaluation of Fit and Error Terms RANDOM
104
COLLEGE TUITION PRICE INDEX GENERALIZED AUTOREGRESSIVE TREND ESTIMATION
Dependent Variable: LFEE
Method: Least Squares
Sample(adjusted): 1987:1 1995:4
Included observations: 36 after adjusting endpoints
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          0.048752      0.024114     2.021760      0.0529
LFEE(-1)   1.126366      0.182970     6.156010      0.0000
LFEE(-2)   0.292152      0.256488     1.139051      0.2643
LFEE(-3)   -0.344963     0.253185     -1.362491     0.1839
LFEE(-4)   -0.076855     0.181751     -0.422857     0.6756
D1         -0.043879     0.005597     -7.840118     0.0000
D1(-1)     -0.048562     0.010241     -4.742040     0.0001
D1(-2)     -0.005369     0.009855     -0.544814     0.5902
R-squared            0.999502    Mean dependent var      5.263841
Adjusted R-squared   0.999377    S.D. dependent var      0.221681
S.E. of regression   0.005532    Akaike info criterion   -7.363447
Sum squared resid    0.000857    Schwarz criterion       -7.011554
Log likelihood       140.5420    F-statistic             8025.362
Durbin-Watson stat   1.892211    Prob(F-statistic)       0.000000
DYNAMIC PART OF THE MODEL: LFEE(-1) TO LFEE(-4). SEASONAL PART OF THE MODEL: D1, D1(-1), D1(-2).
105
GAP SALES FORECAST
106
SIMPLE AUTOREGRESSIVE REGRESSION MODEL
Dependent Variable: LSALES
Method: Least Squares
Sample(adjusted): 1985:2 1999:4
Included observations: 59 after adjusting endpoints
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            0.613160      0.484163     1.266433      0.2105
LSALES(-1)   0.958714      0.036128     26.53623      0.0000
R-squared            0.925115    Mean dependent var      13.43549
Adjusted R-squared   0.923802    S.D. dependent var      0.848687
S.E. of regression   0.234272    Akaike info criterion   -0.031358
Sum squared resid    3.128350    Schwarz criterion       0.039067
Log likelihood       2.925062    F-statistic             704.1714
Durbin-Watson stat   2.159164    Prob(F-statistic)       0.000000
SEASONALITY IS NOT MODELLED
107
Graphical Evaluation of Fit and Error Terms NOT RANDOM
108
AUTOREGRESSIVE REGRESSION MODEL WITH SEASONAL DUMMIES
Dependent Variable: LSALES
Method: Least Squares
Sample(adjusted): 1985:3 1999:4
Included observations: 58 after adjusting endpoints
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            0.299734      0.111564     2.686656      0.0096
LSALES(-1)   0.994473      0.008213     121.0873      0.0000
D1           -0.547251     0.018685     -29.28766     0.0000
D1(-1)       -0.175405     0.018732     -9.364126     0.0000
D1(-2)       0.033281      0.018458     1.803073      0.0771
R-squared            0.996547    Mean dependent var      13.46547
Adjusted R-squared   0.996287    S.D. dependent var      0.823972
S.E. of regression   0.050210    Akaike info criterion   -3.062940
Sum squared resid    0.133616    Schwarz criterion       -2.885316
Log likelihood       93.82526    F-statistic             3824.335
Durbin-Watson stat   1.828642    Prob(F-statistic)       0.000000
109
Graphical Evaluation of Fit and Error Terms RANDOM
110
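ALTERNATIVE SEASONAL MODELLING. FOR NONSEASONAL DATA, THE AUTOREGRESSIVE MODEL CAN BE WRITTEN AS $Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t$. IF THE LENGTH OF THE SEASONALITY IS S, THE SEASONAL AUTOREGRESSIVE MODEL CAN BE WRITTEN AS $Y_t = \beta_0 + \beta_1 Y_{t-S} + \varepsilon_t$.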
111
SEASONAL LAGGED AUTOREGRESSIVE REGRESSION MODEL
Dependent Variable: LSALES
Method: Least Squares
Sample(adjusted): 1986:1 1999:4
Included observations: 56 after adjusting endpoints
Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            0.329980      0.169485     1.946953      0.0567
LSALES(-4)   0.990877      0.012720     77.89949      0.0000
R-squared            0.991180    Mean dependent var      13.50893
Adjusted R-squared   0.991016    S.D. dependent var      0.804465
S.E. of regression   0.076248    Akaike info criterion   -2.274583
Sum squared resid    0.313945    Schwarz criterion       -2.202249
Log likelihood       65.68834    F-statistic             6068.330
Durbin-Watson stat   0.434696    Prob(F-statistic)       0.000000
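Forecasting with a seasonal lagged model is recursive: each forecast uses the value from the same quarter one year earlier, and forecasts beyond one year ahead reuse earlier forecasts. The sketch below uses the estimated coefficients 0.329980 and 0.990877 from the output above; the last four observed log sales values are hypothetical placeholders, since the underlying data are not in the slides.

```python
def seasonal_ar_forecast(history, intercept, slope, season_length, steps):
    """Recursive forecasts from y_t = intercept + slope * y_(t - season_length)."""
    values = list(history)
    for _ in range(steps):
        values.append(intercept + slope * values[-season_length])
    return values[len(history):]


INTERCEPT, SLOPE = 0.329980, 0.990877    # from the LSALES(-4) regression
last_year_log_sales = [14.0, 14.1, 14.3, 14.6]  # hypothetical values, Q1 to Q4

forecasts = seasonal_ar_forecast(last_year_log_sales, INTERCEPT, SLOPE,
                                 season_length=4, steps=8)
print([round(f, 4) for f in forecasts])
```

Because the model is estimated in logarithms, the forecasts would need to be exponentiated to return to sales in original units.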
112
Graphical Evaluation of Fit and Error Terms