Finance 30210: Managerial Economics


Finance 30210: Managerial Economics Demand Estimation and Forecasting

What are the odds that a fair coin flip results in a head? What are the odds that the toss of a fair die results in a 5? What are the odds that tomorrow’s temperature is 95 degrees?

The answers to all of these questions come from a probability distribution: a collection of probabilities describing the odds of each possible event. For a fair coin flip, heads and tails each have probability 1/2; for a fair die, each of the faces 1 through 6 has probability 1/6.

The distribution for temperature in South Bend is a bit more complicated because there are so many possible outcomes, but the concept is the same. We generally assume a normal distribution, which can be characterized by a mean (average) and a standard deviation (a measure of dispersion).

Without some math, we can't find the probability of a specific outcome, but we can easily divide up the distribution: roughly 34% of outcomes lie within one standard deviation on either side of the mean, 13.5% lie between one and two standard deviations out, and 2.5% lie beyond two standard deviations in each tail.

Annual temperature in South Bend has a mean of 59 degrees and a standard deviation of 18 degrees. 95 degrees is two standard deviations to the right of the mean, so there is a 2.5% chance the temperature is 95 or greater (a 97.5% chance it is cooler than 95). Can't we do a little better than this?
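The rule-of-thumb percentages above can be checked exactly. A minimal Python sketch using the error function; the `normal_tail` helper is our own, not from the slides:

```python
from math import erf, sqrt

def normal_tail(x, mean, sd):
    """P(X >= x) for a normal random variable, via the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

# Annual South Bend temperature: mean 59, std. dev. 18 (from the slides).
# 95 degrees is z = 2 standard deviations out.
p = normal_tail(95, 59, 18)
print(round(p, 4))  # roughly 0.023, close to the 2.5% rule of thumb
```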

Conditional distributions give us probabilities conditional on some observable information. The temperature in South Bend conditional on the month being July has a mean of 84 with a standard deviation of 7. Now 95 degrees falls a little more than one standard deviation above the mean, so there is approximately a 16% chance that the temperature is 95 or greater. Conditioning on the month gives us more accurate probabilities!

We know that there should be a "true" probability distribution that governs the outcome of a coin toss (assuming a fair coin). Suppose we were to flip a coin over and over again and, after each flip, calculate the percentage of heads and tails so far (a sample statistic). As the flips accumulate, the sample statistic approaches the true probability. That is, if we collect "enough" data, we can eventually learn the truth!
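The convergence claim can be illustrated with a quick simulation; the seed and sample size here are arbitrary choices of ours:

```python
import random

random.seed(7)  # reproducible illustration

# Flip a fair coin many times and track the running share of heads.
flips = [random.random() < 0.5 for _ in range(100_000)]

def share_after(n):
    """Sample statistic: fraction of heads in the first n flips."""
    return sum(flips[:n]) / n

for n in (10, 1_000, 100_000):
    print(n, share_after(n))  # the share drifts toward the true 1/2
```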

We can follow the same process for the temperature in South Bend: collect temperature data and compute the sample mean (average) and the sample variance. Note: the standard deviation is the square root of the variance.

Some useful properties of probability distributions. Probability distributions are scalable: multiplying a random variable by a constant multiplies the mean by that constant and the variance by the constant squared. For example, if X has mean 1, variance 4 and standard deviation 2, then 3X has mean 3, variance 36 (3 x 3 x 4) and standard deviation 6.

Probability distributions are additive: the mean of a sum is the sum of the means, and the variance of a sum is the sum of the variances plus twice the covariance. For example, if X has mean 1, variance 1 (std. dev. 1), Y has mean 2, variance 9 (std. dev. 3), and COV(X, Y) = 2, then X + Y has mean 3, variance 14 (1 + 9 + 2 x 2) and standard deviation 3.7.
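Both properties can be verified with a quick simulation sketch; the construction of Y as 2X plus independent noise is our own device for producing Cov(X, Y) = 2, and is not from the slides:

```python
import random

random.seed(0)
N = 200_000

def var(xs):
    """Sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# X has mean 1, variance 1.  Build Y with mean 2, variance 9, Cov(X, Y) = 2:
# Y = 2X + Z with Z independent of X and Var(Z) = 5 gives Var(Y) = 4 + 5 = 9
# and Cov(X, Y) = 2 * Var(X) = 2.
xs = [random.gauss(1, 1) for _ in range(N)]
ys = [2 * x + random.gauss(0, 5 ** 0.5) for x in xs]

scaled = [3 * x for x in xs]              # Var(3X) = 9 * Var(X) = 9 here
summed = [x + y for x, y in zip(xs, ys)]  # Var = 1 + 9 + 2*2 = 14
```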

Suppose we know that the value of a car is determined by its age. The Truth: Value = $20,000 - $1,000 x (Age). If Age has mean 8, variance 4 and standard deviation 2, then Value has mean $12,000, variance 4,000,000 and standard deviation $2,000.

We could also use this to forecast. The Truth: Value = $20,000 - $1,000 x (Age). How much should a six-year-old car be worth? Value = $20,000 - $1,000 x (6) = $14,000. Note: There is NO uncertainty in this prediction.

Searching for the truth… You believe that there is a relationship between age and value, but you don't know what it is. So you collect data on values and ages, then estimate the relationship between them. Note that while the true distribution of age is N(8, 4), our collected sample will not be exactly N(8, 4). This sampling error will create errors in our estimates!!

Value = a + b x (Age) + error. The intercept is 'a' and the slope is 'b'. We want to choose 'a' and 'b' to minimize the error!
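Choosing 'a' and 'b' to minimize the sum of squared errors has a closed-form solution (ordinary least squares). A sketch on simulated car data; the sample size and noise level are hypothetical choices of ours:

```python
import random

random.seed(42)

# Hypothetical sample: the "truth" is Value = 20,000 - 1,000 * Age plus noise.
ages = [random.gauss(8, 2) for _ in range(500)]
values = [20_000 - 1_000 * a + random.gauss(0, 2_000) for a in ages]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(values) / n

# OLS: the slope is Cov(x, y) / Var(x); the intercept makes the fitted line
# pass through the point of means.  Together they minimize squared error.
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, values)) \
    / sum((x - mean_x) ** 2 for x in ages)
a = mean_y - b * mean_x
```

The estimates land near the true intercept (20,000) and slope (-1,000), but not exactly on them, which is the sampling error the slide warns about.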

We have our estimate of "the truth":

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 12,354 | 653 | 18.9
Age | -854 | 80 | -10.60

t-stats bigger than 2 in absolute value are considered statistically significant! So our estimated equation is Value = $12,354 - $854 x (Age) + error, where the intercept (a) has mean $12,354 and std. dev. $653, and the age coefficient (b) has mean -$854 and std. dev. $80.

Regression Statistics
R Squared | .36
Standard Error | 2,250

The R squared is the percentage of the variance in value explained by age. The error term has mean 0 and std. dev. $2,250.

We can now forecast the value of a 6-year-old car: Value = $12,354 - $854 x (Age) + error, where the intercept has mean $12,354 (std. dev. $653), the age coefficient has mean -$854 (std. dev. $80), and the error has mean $0 (std. dev. $2,250). (Recall, the average car age in the sample is 8 years.)

Plotting the regression line with its 95% forecast interval, note that your forecast error will always be smallest at the sample mean! Also, your forecast gets worse at an increasing rate as you depart from the mean.

What are the odds that Pat Buchanan received 3,407 votes from Palm Beach County in 2000?

The Strategy: (1) Estimate a relationship for Pat Buchanan's votes using every county EXCEPT Palm Beach. (2) Using Palm Beach data, forecast Pat Buchanan's vote total for Palm Beach. That is, model Pat Buchanan's votes as a function of observable demographics.

The Data: Demographic Data by County

County | Black (%) | Age 65 (%) | Hispanic (%) | College (%) | Income (000s) | Buchanan Votes | Total Votes
Alachua | 21.8 | 9.4 | 4.7 | 34.6 | 26.5 | 262 | 84,966
Baker | 16.8 | 7.7 | 1.5 | 5.7 | 27.6 | 73 | 8,128

What variables do you think should affect Pat Buchanan's vote total? As a start, regress the # of Buchanan votes on the % of the county that is college educated; the slope measures the # of votes gained/lost for each percentage-point increase in the college-educated population.

Results

Parameter | Value | Standard Error | t-Statistic
a | 5.35 | 58.5 | .09
b | 14.95 | 3.84 | 3.89

R-Square = .19: 19% of the variation in Buchanan's votes across counties is explained by college education. The distribution for 'b' has a mean of 15 and a standard deviation of 4, so each percentage-point increase in the college-educated share (i.e., from 10% to 11%) raises Buchanan's vote total by about 15 votes, and there is a 95% chance that the value of 'b' lies between 7 and 23. Plug in values for College (%) to get vote predictions:

County | College (%) | Predicted Votes | Actual Votes | Error
Alachua | 34.6 | 522 | 262 | 260
Baker | 5.7 | 90 | 73 | 17

Let's try something a little different…

County | College (%) | Buchanan Votes | Log of Buchanan Votes
Alachua | 34.6 | 262 | 5.57
Baker | 5.7 | 73 | 4.29

Now regress the log of Buchanan votes on the % of the county that is college educated; the slope measures the percentage increase/decrease in votes for each percentage-point increase in the college-educated population.

Results

Parameter | Value | Standard Error | t-Statistic
a | 3.45 | .27 | 12.6
b | .09 | .02 | 5.4

R-Square = .31: 31% of the variation in Buchanan's votes across counties is explained by college education. The distribution for 'b' has a mean of .09 and a standard deviation of .02, and there is a 95% chance that the value of 'b' lies between .05 and .13. Because the dependent variable is in logs, each percentage-point increase in the college-educated share (i.e., from 10% to 11%) raises Buchanan's vote total by about 9% (0.09 log points). Plug in values for College (%) to get vote predictions:

County | College (%) | Predicted Votes | Actual Votes | Error
Alachua | 34.6 | 902 | 262 | 640
Baker | 5.7 | 55 | 73 | -18

How about this…

County | College (%) | Buchanan Votes | Log of College (%)
Alachua | 34.6 | 262 | 3.54
Baker | 5.7 | 73 | 1.74

Now regress the # of Buchanan votes on the log of the % of the county that is college educated; the slope measures the gain/loss in votes for each percentage increase in the college-educated population.

Results

Parameter | Value | Standard Error | t-Statistic
a | -424 | 139 | -3.05
b | 252 | 54 | 4.6

R-Square = .25: 25% of the variation in Buchanan's votes across counties is explained by college education. The distribution for 'b' has a mean of 252 and a standard deviation of 54, and there is a 95% chance that the value of 'b' lies between 144 and 360. Because the regressor is in logs, each 1% increase in the college-educated share (i.e., from 30% to 30.3%) raises Buchanan's vote total by about 2.5 votes (252/100). Plug in values for College (%) to get vote predictions:

County | College (%) | Predicted Votes | Actual Votes | Error
Alachua | 34.6 | 469 | 262 | 207
Baker | 5.7 | 15 | 73 | -58

One more…

County | College (%) | Buchanan Votes | Log of College (%) | Log of Buchanan Votes
Alachua | 34.6 | 262 | 3.54 | 5.57
Baker | 5.7 | 73 | 1.74 | 4.29

Now regress the log of Buchanan votes on the log of the % of the county that is college educated; the slope measures the percentage gain/loss in votes for each percentage increase in the college-educated population (an elasticity).

Results

Parameter | Value | Standard Error | t-Statistic
a | .71 | .63 | 1.13
b | 1.61 | .24 | 6.53

R-Square = .40: 40% of the variation in Buchanan's votes across counties is explained by college education. The distribution for 'b' has a mean of 1.61 and a standard deviation of .24, and there is a 95% chance that the value of 'b' lies between 1.13 and 2.09. Each 1% increase in the college-educated share (i.e., from 30% to 30.3%) raises Buchanan's vote total by 1.61%. Plug in values for College (%) to get vote predictions:

County | College (%) | Predicted Votes | Actual Votes | Error
Alachua | 34.6 | 624 | 262 | 362
Baker | 5.7 | 34 | 73 | -39

It turns out the regression with the best fit uses all the demographics and takes the log of Buchanan's vote share: the dependent variable is ln(Buchanan Votes / Total Votes x 100), regressed on Black (%), Age 65 (%), Hispanic (%), College (%), and Income (000s), plus an error term. The parameters to be estimated are the intercept and a coefficient on each demographic variable.

County | Black (%) | Age 65 (%) | Hispanic (%) | College (%) | Income (000s) | Buchanan Votes | Total Votes
Alachua | 21.8 | 9.4 | 4.7 | 34.6 | 26.5 | 262 | 84,966
Baker | 16.8 | 7.7 | 1.5 | 5.7 | 27.6 | 73 | 8,128

The Results (R Squared = .73):

Variable | Coefficient | Standard Error | t-statistic
Intercept | 2.146 | .396 | 5.48
Black (%) | -.0132 | .0057 | -2.88
Age 65 (%) | -.0415 | | -5.93
Hispanic (%) | -.0349 | .0050 | -6.08
College (%) | -.0193 | .0068 | -1.99
Income (000s) | -.0658 | .00113 | -4.58

Now, we can make a forecast!

County | Predicted Votes | Actual Votes | Error
Alachua | 520 | 262 | 258
Baker | 55 | 73 | -18

County | Black (%) | Age 65 (%) | Hispanic (%) | College (%) | Income (000s) | Buchanan Votes | Total Votes
Palm Beach | 21.8 | 23.6 | 9.8 | 22.1 | 33.5 | 3,407 | 431,621

Plugging Palm Beach's demographics into the estimated equation gives our prediction for Pat Buchanan's vote total.

We know that the log of Buchanan's vote percentage is distributed normally with a mean of -2.004 and a standard deviation of .2556. There is a 95% chance that the log of Buchanan's vote percentage lies between -2.004 - 2(.2556) = -2.5152 and -2.004 + 2(.2556) = -1.4928.

Next, let's convert the logs to vote percentages by exponentiating. There is a 95% chance that Buchanan's vote percentage lies in the corresponding range.

Finally, we can convert the percentages to actual votes by multiplying by the total votes cast in Palm Beach. There is a 95% chance that Buchanan's total vote lies in this range, and the actual total of 3,407 votes turns out to be 7 standard deviations away from our forecast!!!
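The whole conversion can be scripted from the numbers on the slides; the point forecast in votes is derived by us from the stated mean and vote total, not quoted from the slides:

```python
from math import exp, log

total_votes = 431_621               # Palm Beach total votes (from the slides)
mean_log, sd_log = -2.004, 0.2556   # forecast of ln(vote percentage)

def to_votes(log_pct):
    """Convert a log vote-percentage into a vote count."""
    return exp(log_pct) / 100 * total_votes

point = to_votes(mean_log)
low = to_votes(mean_log - 2 * sd_log)    # 95% band, lower end
high = to_votes(mean_log + 2 * sd_log)   # 95% band, upper end

# How unusual is the actual count of 3,407 votes?
actual_log = log(3_407 / total_votes * 100)
z = (actual_log - mean_log) / sd_log     # roughly 7 standard deviations
```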

We know that the quantity of some good or service demanded should be a function of some basic variables: the good's own price, income, and other "demand shifters" (such as the price of a competing product).

Cross-sectional estimation holds the time period constant and estimates the variation in demand resulting from variation in the demand factors. For example: can we estimate demand for Pepsi in South Bend by looking at selected statistics for South Bend?

Suppose that we have the following data for sales in 200 different Indiana cities:

City | Price | Average Income (Thousands) | Competitor's Price | Advertising Expenditures (Thousands) | Total Sales
Granger | 1.02 | 21.934 | 1.48 | 2.367 | 9,809
Mishawaka | 2.56 | 35.796 | 2.53 | 26.922 | 130,835

Let's begin by estimating a basic demand curve: quantity demanded as a linear function of price, where the slope to be estimated is the change in quantity demanded per $1 change in price.

That is, we have estimated the following equation:

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 155,042 | 18,133 | 8.55
Price (X) | -46,087 | 7,214 | -6.39

Regression Statistics
R Squared | .17
Standard Error | 48,074

Every dollar increase in price lowers sales by 46,087 units.

Values for South Bend: the price of Pepsi is $1.37. Plugging in: 155,042 - 46,087 x 1.37 gives predicted sales of about 91,903 units.

As we did earlier, we can experiment with different functional forms by using logs. Adding logs changes the interpretation of the coefficients: regressing quantity on the log of price, the slope to be estimated is the change in quantity demanded per percentage change in price.

That is, we have estimated the following equation:

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 133,133 | 14,892 | 8.93
Log of Price (X) | -103,973 | 16,407 | -6.33

Regression Statistics
R Squared | .17
Standard Error | 48,140

Every 1% increase in price lowers sales by about 1,040 units (103,973/100).

Values for South Bend: the price of Pepsi is $1.37, so the log of price is .31. Plugging in: 133,133 - 103,973 x ln(1.37) gives predicted sales of about 100,402 units.

As we did earlier, we can experiment with different functional forms by using logs. Now the dependent variable is in logs: regressing the log of quantity on price, the slope to be estimated is the percentage change in quantity demanded per $ change in price.

That is, we have estimated the following equation:

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 13 | .34 | 38.1
Price (X) | -1.22 | .13 | -8.98

Regression Statistics
R Squared | .28
Standard Error | .90

Every $1 increase in price lowers log sales by 1.22; equivalently, a 1-cent price increase lowers sales by about 1.22%.

Values for South Bend: the price of Pepsi is $1.37. We can now use this estimated demand curve along with the price in South Bend to estimate demand: ln(Q) = 13 - 1.22 x 1.37, giving predicted sales of about 83,283 units.

As we did earlier, we can experiment with different functional forms by using logs. With both variables in logs, the slope to be estimated is the percentage change in quantity demanded per percentage change in price: a price elasticity.

That is, we have estimated the following equation:

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 12.3 | .28 | 42.9
Log of Price (X) | -2.60 | .31 | -8.21

Regression Statistics
R Squared | .25
Standard Error | .93

Every 1% increase in price lowers sales by 2.6%.

Values for South Bend: the price of Pepsi is $1.37, so the log of price is .31. Plugging the log of price into the estimated equation gives predicted sales of about 72,402 units.
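The four South Bend predictions can be reproduced (approximately) by plugging the price into each estimated equation. Note that the slide coefficients are rounded, so the log-dependent-variable predictions only come close to the slides' figures, and the displayed log-log coefficients do not reproduce 72,402 exactly:

```python
from math import exp, log

p = 1.37  # price of Pepsi in South Bend

# Coefficients as displayed on the slides (rounded).
linear  = 155_042 - 46_087 * p         # about 91,903 units
lin_log = 133_133 - 103_973 * log(p)   # about 100,402 units
log_lin = exp(13 - 1.22 * p)           # slide reports 83,283
log_log = exp(12.3 - 2.60 * log(p))    # slide reports 72,402
```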

We can add as many variables as we want in whatever combination; the goal is to look for the best fit. Here the dependent variable is the log of sales, so the price coefficient is the % change in sales per $ change in price, the income coefficient is the % change in sales per % change in income, and the competitor's price coefficient is the % change in sales per % change in competitor's price.

Regression Results (R Squared: .46)
Variable | Coefficient | Standard Error | t Stat
Intercept | 5.98 | 1.29 | 4.63
Price | -1.29 | .12 | -10.79
Log of Income | 1.46 | .34 | 4.29
Log of Competitor's Price | 2.00 | | 5.80

Values for South Bend: price of Pepsi $1.37, log of income 3.81, log of competitor's price .80. Now we can make a prediction and calculate elasticities: ln(Q) = 5.98 - 1.29(1.37) + 1.46(3.81) + 2.00(.80), giving predicted sales of about 87,142 units.

We could use a cross-sectional regression to forecast quantity demanded out into the future, but it would take a lot of information: estimate a demand curve using data at some point in time, then use the estimated demand curve together with forecasts of each demand factor to forecast quantity demanded.

Time series estimation ignores the demand factors and estimates the variation in demand over time. For example: can we predict demand for Pepsi in South Bend next year by looking at how demand varies across time?

Essentially, we want to separate demand changes into various frequencies. Trend: long-term movements in demand (e.g., demand for movie tickets grows by an average of 6% per year). Business cycle: movements in demand related to the state of the economy (e.g., demand for movie tickets grows by more than 6% during economic expansions and less than 6% during recessions). Seasonal: movements in demand related to the time of year (e.g., demand for movie tickets is highest in the summer and around Christmas).

Suppose that you work for a local power company. You have been asked to forecast energy demand for the upcoming year. You have quarterly data over the previous 4 years:

Time Period | Quantity (millions of kilowatt hours)
2003:1 | 11
2003:2 | 15
2003:3 | 12
2003:4 | 14
2004:1 |
2004:2 | 17
2004:3 | 13
2004:4 | 16
2005:1 |
2005:2 | 18
2005:3 |
2005:4 |
2006:1 |
2006:2 | 20
2006:3 |
2006:4 | 19

First, let's plot the data. What do you see? This data seems to have a linear trend.

A linear trend takes the form Q_t = a + b*t, where a is the estimated value at the base period, b is the estimated quarterly growth (in millions of kilowatt hours), and Q_t is the forecasted value at time t. Time periods are quarters, with 2003:1 as the base period.

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 11.9 | .953 | 12.5
Time Trend | .394 | .099 | 4.00

Regression Statistics
R Squared | .53
Standard Error | 1.82
Observations | 16

Let's forecast electricity usage at the mean time period (t = 8).
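The trend forecast is a one-liner; a sketch using the estimated coefficients:

```python
# Linear trend from the regression: Q_t = 11.9 + 0.394 * t
# (millions of kilowatt hours; t counts quarters from the base period).
a, b = 11.9, 0.394

def forecast(t):
    """Trend forecast of electricity demand at quarter t."""
    return a + b * t

print(forecast(8))  # usage at the mean time period
```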

Here's a plot of our regression line with our error bands. Again, note that the forecast error is lowest at the mean time period (t = 8).

We can use this linear trend model to predict as far out beyond the sample as we want, but note that the error involved gets worse!

Let's take another look at the data. It seems that there is a regular pattern: the second quarter is consistently high. There appears to be a seasonal cycle.

One seasonal adjustment process is to adjust each quarter by the average ratio of actual to predicted. For each observation: calculate the ratio of actual to predicted, average the ratios by quarter, then use the average ratio to adjust each predicted value.

Average ratios: Q1 = .87, Q2 = 1.16, Q3 = .91, Q4 = 1.04

Time Period | Actual | Predicted | Ratio | Adjusted
2003:1 | 11 | 12.29 | .89 | 12.29(.87) = 10.90
2003:2 | 15 | 12.68 | 1.18 | 12.68(1.16) = 14.77
2003:3 | 12 | 13.08 | .91 | 13.08(.91) = 11.86
2003:4 | 14 | 13.47 | 1.03 | 13.47(1.04) = 14.04
2004:1 | | 13.87 | .87 | 13.87(.87) = 12.30
2004:2 | 17 | 14.26 | 1.19 | 14.26(1.16) = 16.61
2004:3 | 13 | 14.66 | .88 | 14.66(.91) = 13.29
2004:4 | 16 | 15.05 | 1.06 | 15.05(1.04) = 15.68
2005:1 | | 15.44 | | 15.44(.87) = 13.70
2005:2 | 18 | 15.84 | 1.14 | 15.84(1.16) = 18.45
2005:3 | | 16.23 | .92 | 16.23(.91) = 14.72
2005:4 | | 16.63 | 1.02 | 16.63(1.04) = 17.33
2006:1 | | 17.02 | | 17.02(.87) = 15.10
2006:2 | 20 | 17.41 | | 17.41(1.16) = 20.28
2006:3 | | 17.81 | | 17.81(.91) = 16.15
2006:4 | 19 | 18.20 | 1.04 | 18.20(1.04) = 18.96

With the seasonal adjustment, we don't have any regression statistics to judge goodness of fit. One method of evaluating a forecast is to calculate the root mean squared error: the square root of the sum of squared forecast errors divided by the number of observations.

Time Period | Actual | Adjusted | Error
2003:1 | 11 | 10.90 | -0.1
2003:2 | 15 | 14.77 | -0.23
2003:3 | 12 | 11.86 | -0.14
2003:4 | 14 | 14.04 | 0.04
2004:1 | | 12.30 | 0.3
2004:2 | 17 | 16.61 | -0.39
2004:3 | 13 | 13.29 | 0.29
2004:4 | 16 | 15.68 | -0.32
2005:1 | | 13.70 | -0.3
2005:2 | 18 | 18.45 | 0.45
2005:3 | | 14.72 | -0.28
2005:4 | | 17.33 | 0.33
2006:1 | | 15.10 | 0.1
2006:2 | 20 | 20.28 | 0.28
2006:3 | | 16.15 | 0.15
2006:4 | 19 | 18.96 | -0.04

Looks pretty good…

Recall our prediction for period 76 (Year 2022 Q4).

We could also account for seasonal variation by using dummy variables. Note: we only need three quarter dummies; if the observation is from quarter 4, then D1 = D2 = D3 = 0.

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 12.75 | .226 | 56.38
Time Trend | .375 | .0168 | 22.2
D1 | -2.375 | .219 | -10.83
D2 | 1.75 | .215 | 8.1
D3 | -2.125 | .213 | -9.93

Regression Statistics
R Squared | .99
Standard Error | .30
Observations | 16

Note the much better fit!!
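The fitted values from the dummy-variable model are easy to reproduce; a sketch indexing time so that 2003:1 is t = 1 in quarter 1:

```python
# Fitted values from the dummy-variable regression:
# Q_t = 12.75 + 0.375*t - 2.375*D1 + 1.75*D2 - 2.125*D3
def fitted(t, quarter):
    """Fitted demand at quarter t, where `quarter` is the quarter of the year."""
    dummies = {1: -2.375, 2: 1.75, 3: -2.125, 4: 0.0}
    return 12.75 + 0.375 * t + dummies[quarter]

# 2003:1 is (t=1, Q1), 2003:2 is (t=2, Q2), 2003:4 is (t=4, Q4), ...
print(fitted(1, 1), fitted(2, 2), fitted(4, 4))
```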

Time Period | Actual | Ratio Method | Dummy Variables
2003:1 | 11 | 10.90 | 10.75
2003:2 | 15 | 14.77 | 15.25
2003:3 | 12 | 11.86 | 11.75
2003:4 | 14 | 14.04 | 14.25
2004:1 | | 12.30 | 12.25
2004:2 | 17 | 16.61 | 16.75
2004:3 | 13 | 13.29 | 13.25
2004:4 | 16 | 15.68 | 15.75
2005:1 | | 13.70 | 13.75
2005:2 | 18 | 18.45 | 18.25
2005:3 | | 14.72 | 14.75
2005:4 | | 17.33 | 17.25
2006:1 | | 15.10 |
2006:2 | 20 | 20.28 | 19.75
2006:3 | | 16.15 | 16.25
2006:4 | 19 | 18.96 | 18.75

A plot confirms the similarity of the methods

Recall our prediction for period 76 (Year 2022 Q4).

Recall, our trend line took the form Q_t = a + b*t, where the parameter b measures the quarterly change in electricity demand in millions of kilowatt hours. Oftentimes, it's more realistic to assume that demand grows by a constant percentage rather than a constant quantity. For example, if we knew that electricity demand grew by g% per quarter, then our forecasting equation would take the form Q_t = Q_0(1 + g)^t.

If we wish to estimate this equation, we have a little work to do. If we convert our data to natural logs, we get the following linear relationship that can be estimated: ln(Q_t) = ln(Q_0) + ln(1 + g)*t, which for small growth rates is approximately ln(Q_0) + g*t. Note: this growth rate is in decimal form.

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | 2.49 | .063 | 39.6
Time Trend | .026 | .006 | 4.06

Regression Statistics
R Squared | .54
Standard Error | .1197
Observations | 16

Let's forecast electricity usage at the mean time period (t = 8). BE CAREFUL… THESE NUMBERS ARE LOGS!!

The natural log of forecasted demand is 2.49 + .026(8) = 2.698. Therefore, to get the actual demand forecast, use the exponential function: e^2.698 is about 14.85 million kilowatt hours. Likewise with the error bands: a 95% confidence interval is +/- 2 standard deviations in logs, then exponentiate.
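The log-to-level conversion is worth seeing in code; a short sketch with the estimated coefficients:

```python
from math import exp

# Log trend: ln(Q_t) = 2.49 + 0.026 * t, evaluated at the mean period t = 8.
log_forecast = 2.49 + 0.026 * 8   # = 2.698, in logs!
demand = exp(log_forecast)        # convert back with the exponential function

# 95% band: +/- 2 standard errors in logs, then exponentiate.
se = 0.1197
low, high = exp(log_forecast - 2 * se), exp(log_forecast + 2 * se)
```

Note the band is asymmetric in levels even though it is symmetric in logs, which is one reason errors in growth-rate models compound so quickly.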

Again, here is a plot of our forecasts with the error bands

Errors in growth rates compound quickly!!

Let's try one. Suppose that we are interested in forecasting gasoline prices, with monthly historical data from April 1993 to June 2010. Does a linear trend (constant cents-per-gallon growth) look reasonable?

Let's suppose we assume a linear trend. Then we are estimating the following linear regression: P_t = a + b*t, where P_t is the price at time t, a is the price at April 1993, b is the monthly growth in dollars per gallon, and t is the number of months from April 1993.

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | .67 | .05 | 12.19
Time Trend | .010 | .0004 | 23.19

R Squared = .72

We can check for the presence of a seasonal cycle by adding seasonal dummy variables, where each dummy coefficient is the dollars-per-gallon impact of that quarter relative to quarter 4.

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | .58 | .07 | 8.28
Time Trend | .01 | .0004 | 23.7
D1 | -.03 | .075 | -.43
D2 | .15 | .074 | 2.06
D3 | .16 | | 2.20

R Squared = .74

If we wanted to remove the seasonal component, we could do so by subtracting the seasonal dummy coefficient off each gas price:

Date | Quarter | Price | Regression Coefficient | Seasonalized Data
1993-04 | 2nd | 1.05 | .15 | .90
1993-07 | 3rd | 1.06 | .16 | .90
1993-10 | 4th | | |
1994-01 | 1st | .98 | -.03 | 1.01
1994-04 | 2nd | 1.00 | .15 | .85

Note: Once the seasonal component has been removed, all that should be left is trend, cycle, and noise. We could check this by re-running the regressions on the seasonalized price series:

Seasonalized Price Series (trend only)
Variable | Coefficient | Standard Error | t Stat
Intercept | .587 | .05 | 11.06
Time Trend | .010 | .0004 | 23.92

Seasonalized Price Series (with dummies)
Variable | Coefficient | Standard Error | t Stat
Intercept | .587 | .07 | 8.28
Time Trend | .010 | .0004 | 23.7
D1 | | .075 |
D2 | | .074 |
D3 | | |

The regression we have in place gives us the trend plus the seasonal component of the data. If we subtract our predicted price (from the regression) from the actual price, we will have isolated the business cycle and noise:

Date | Actual Price | Predicted Price (from regression) | Business Cycle Component
1993-04 | 1.050 | .752 | .297
1993-05 | 1.071 | .763 | .308
1993-06 | 1.075 | .773 | .301
1993-07 | 1.064 | .797 | .267
1993-08 | 1.048 | .807 | .240

We can plot this business cycle component, along with actual and predicted prices, and compare it with official business cycle dates.

Data Breakdown

Date | Actual Price | Trend | Seasonal | Business Cycle
1993-04 | 1.050 | .58 | .15 | .320
1993-05 | 1.071 | .59 | .15 | .331
1993-06 | 1.075 | .60 | .15 | .325
1993-07 | 1.064 | .61 | .16 | .294
1993-08 | 1.048 | .62 | .16 | .268

(Coefficients from the regression with dummies: Intercept .58, Time Trend .01, D1 -.03, D2 .15, D3 .16.)

Perhaps an exponential trend would work better. An exponential trend would indicate constant percentage growth rather than constant cents-per-gallon growth.

We already know that there is a seasonal component, so we can start with dummy variables. Now the regression is in logs: the time trend coefficient is the monthly growth rate, and each dummy is the percentage price impact of that quarter relative to quarter 4.

Regression Results
Variable | Coefficient | Standard Error | t Stat
Intercept | -.14 | .03 | -4.64
Time Trend | .005 | .0001 | 29.9
D1 | -.02 | .032 | -.59
D2 | .06 | | 2.07
D3 | .07 | | 2.19

R Squared = .81

If we wanted to remove the seasonal component, we could again subtract the seasonal dummy coefficient off each gas price, but now the price is in logs:

Date | Quarter | Price | Log of Price | Regression Coefficient | Log of Seasonalized Data | Seasonalized Price
1993-04 | 2nd | 1.05 | .049 | .06 | -.019 | .98
1993-07 | 3rd | 1.06 | .062 | .07 | -.010 | .99
1993-10 | 4th | | | | |
1994-01 | 1st | | -.013 | -.02 | .006 | 1.00
1994-04 | 2nd | | .005 | .06 | -.062 | .94

The regression we have in place gives us the trend plus the seasonal component of the (log) data. If we subtract our predicted price (from the regression) from the actual price, we will have isolated the business cycle and noise:

Date | Actual Price | Predicted Log Price (from regression) | Predicted Price | Business Cycle Component
1993-04 | 1.050 | -.069 | .93 | .12
1993-05 | 1.071 | -.063 | .94 | .13
1993-06 | 1.075 | -.057 | |
1993-07 | 1.064 | -.047 | .95 | .11
1993-08 | 1.048 | -.041 | .96 | .09

As you can see, the exponential trend gives very similar results for actual versus predicted prices.

In either case, we could make a forecast for gasoline prices next year. Let's say April 2011:

Date | Time Period | Quarter
April 2011 | 217 | 2

We can plug these values into either the linear model or the exponential model. By the way, the actual price in April 2011 was $3.80.
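A sketch plugging April 2011 (t = 217, quarter 2) into the two dummy-variable trend regressions estimated above; the slides leave the two forecasts blank, so these numbers are our reading of the displayed coefficients:

```python
from math import exp

t = 217  # months after April 1993; April 2011 falls in quarter 2

# Linear trend with quarter dummies (dollars per gallon); D2 = .15 applies.
linear = 0.58 + 0.010 * t + 0.15

# Exponential trend with quarter dummies (estimated in logs); D2 = .06 applies.
log_price = -0.14 + 0.005 * t + 0.06
exponential = exp(log_price)

print(round(linear, 2), round(exponential, 2))  # both well below the actual $3.80
```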

Consider a new forecasting problem. You are asked to forecast a company's market share for the 13th quarter:

Quarter | Market Share
1 | 20
2 | 22
3 | 23
4 | 24
5 | 18
6 |
7 | 19
8 | 17
9 |
10 |
11 |
12 |

There doesn't seem to be any discernible trend here…

Smoothing techniques are often used when data exhibits no trend or seasonal/cyclical component. They are used to filter out short-term noise in the data. A moving average of length N is equal to the average value over the previous N periods:

Quarter | Market Share | MA(3) | MA(5)
1 | 20 | |
2 | 22 | |
3 | 23 | |
4 | 24 | 21.67 |
5 | 18 | |
6 | | | 21.4
7 | 19 | |
8 | 17 | |
9 | | 19.67 | 20.2
10 | | 19.33 | 19.8
11 | | 20.67 | 20.8
12 | | 21 |
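A minimal moving-average helper, checked against the start of the table (e.g., the MA(3) value for quarter 4 is the average of quarters 1 through 3):

```python
def moving_average(series, n):
    """MA(n) forecast for each period: the average of the previous n values.

    Returns None for periods without n prior observations.
    """
    return [sum(series[t - n:t]) / n if t >= n else None
            for t in range(len(series))]

# First five quarters of market share from the table.
share = [20, 22, 23, 24, 18]
ma3 = moving_average(share, 3)
print(ma3)  # first forecast appears in quarter 4: (20 + 22 + 23) / 3
```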

The longer the moving average, the smoother the forecasts are…

Calculating forecasts is straightforward: the MA(3) forecast for quarter 13 is the average of quarters 10 through 12, and the MA(5) forecast is the average of quarters 8 through 12. So, how do we choose N??

Lining up the squared forecast errors for each method (e.g., in quarter 4 the MA(3) forecast of 21.67 misses the actual 24, a squared error of 5.4289), the totals over the sample are 78.3534 for MA(3) and 62.48 for MA(5), so the longer moving average fits better here.

Exponential smoothing involves a forecast equation of the form F(t+1) = w*A(t) + (1 - w)*F(t), where F(t) is the forecast for time t, A(t) is the actual value at time t, and w is the smoothing parameter. Note: when w = 1, your forecast is equal to the previous value; when w = 0, your forecast is a constant.

For exponential smoothing, we need to choose a value for the weighting parameter as well as an initial forecast. Usually, the initial forecast is chosen to equal the sample average (here, 21).

Quarter | Market Share | W = .3 | W = .5
1 | 20 | 21.0 | 21.0
2 | 22 | 20.7 | 20.5
3 | 23 | 21.1 | 21.3
4 | 24 | 21.7 | 22.2
5 | 18 | 22.4 | 23.1
6 | | | 20.6
7 | 19 | | 21.8
8 | 17 | 20.9 | 20.4
9 | | 19.7 | 18.7
10 | | |
11 | | 21.2 |
12 | | 20.2 | 19.9
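The smoothing recursion is short enough to verify directly; this reproduces the w = .3 column for the first five quarters:

```python
def exp_smooth(series, w, initial):
    """Exponential smoothing: F(t+1) = w * A(t) + (1 - w) * F(t)."""
    forecasts = [initial]
    for actual in series[:-1]:
        forecasts.append(w * actual + (1 - w) * forecasts[-1])
    return forecasts

share = [20, 22, 23, 24, 18]       # first five quarters from the table
f = exp_smooth(share, 0.3, 21.0)   # initial forecast = sample average (21)
print([round(x, 1) for x in f])    # matches the W = .3 column
```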

As was mentioned earlier, the smaller w will produce a smoother forecast

Calculating forecasts is straightforward: the forecast for quarter 13 is F(13) = w*A(12) + (1 - w)*F(12). So, how do we choose W??

Again, comparing squared forecast errors (e.g., in quarter 2 the w = .3 forecast of 20.7 misses the actual 22, a squared error of 1.69, while the w = .5 forecast of 20.5 gives 2.25), the totals over the sample are 87.19 for w = .3 and 101.5 for w = .5, so the smaller weight fits better here.