DECISION MODELING WITH MICROSOFT EXCEL Chapter 13 Copyright 2001 Prentice Hall Publishers and Ardith E. Baker Part 1.

Many important decisions made by individuals and organizations crucially depend on an assessment of the_________. There are a few “____” sayings that illustrate the promise and frustration of forecasting: “It is difficult to_________, especially in regards to the future.” “It isn’t difficult to forecast, just to forecast ___________.” “_______, if tortured enough, will confess to just about anything.”

Forecasting is playing an increasingly important role in the _______________.
- Economic forecasts __________ government policies and business decisions
- Insurance companies’ ___________ decisions in mortgages and bonds
- Service industries’ (such as airlines, hotels, rental cars, and cruise lines) forecasts of _______ as input for revenue management
There is clearly a steady __________ in the use of quantitative forecasting models at many levels in industry and government. The many types of forecasting models will be distributed into two major techniques: ___________ and ____________

____________forecasting models possess two important and attractive features: 1. They are expressed in mathematical ________. Thus, they establish an unambiguous record of how the forecast is made. 2. With the use of _______________and computers, quantitative models can be based on an amazing quantity of data. Two types of quantitative forecasting models that will be discussed in the next two sections are: ________models and _________models

In a _______ forecasting model, the forecast for the quantity of interest “rides piggyback” on another quantity or set of quantities. In other words, our ________ of the value of one variable (or perhaps several variables) enables us to forecast the value of another ___________. In this model, let \(y\) denote the _________ of some variable of interest and \(\hat{y}\) denote a predicted or _________ value for that variable.

Then, in a causal model,
\[ \hat{y} = f(x_1, x_2, \ldots, x_n) \]
where \(f\) is a forecasting ________, or function, and \(x_1, x_2, \ldots, x_n\) is a set of variables. In this representation, the x variables are often called _________ variables, whereas \(\hat{y}\) is the dependent or __________ variable. We either _______ the independent variables in advance or can forecast them more easily than \(\hat{y}\). Then the independent variables will be used in the forecasting model to forecast the __________ variable.

Companies often find by looking at past __________that their monthly sales are directly related to the monthly______, and thus figure that a good forecast could be made using next month’s GDP figure. The only problem is that this quantity is not _______, or it may just be a forecast and thus not a truly independent___________. To use a causal forecasting model, requires two conditions: 1. There must be a ___________between values of the independent and dependent variables such that the former provides ____________about the latter.

2. The _______ for the independent variables must be known and available to the forecaster at the ____ the forecast is made. Simply because there is a mathematical relationship does not ___________ that there is really cause and effect. One commonly used approach in creating a causal forecasting model is called _____________.
CURVE FITTING: AN OIL COMPANY EXPANSION
Consider an oil company that is planning to expand its _________ of modern self-service gasoline stations.

The company plans to use __________(measured in the average number of cars per hour) to forecast ______(measured in average dollar sales per hour). The firm has had five stations in operation for more than a year and has used _________data to calculate the following averages:

The averages are plotted in a scatter diagram.

Now, these data will be used to construct a _________ that will be used to forecast sales at any proposed location by measuring the traffic flow at that ________ and plugging its value into the constructed function.
Least Squares Fits
The method of __________ is a formal procedure for curve fitting. It is a two-step process.
1. Select a specific functional form (e.g., a ___________ or quadratic curve).
2. Within the set of functions specified in step 1, choose the specific function that __________ the sum of the squared deviations between the data points and the function ___________.

To demonstrate the process, consider the sales-traffic flow example.
1. Assume a _______ line; that is, functions of the form \(y = a + bx\).
2. Draw the line in the ____________ and indicate the __________ between observed points and the function as \(d_i\). For example,
\[ d_1 = y_1 - [a + bx_1] = 220 - [a + 150b] \]
where
\(y_1\) = actual sales/hr at location 1
\(x_1\) = actual traffic flow at location 1
\(a\) = y-axis intercept for the function
\(b\) = slope for the function

The value \(d_1^2\) is one measure of __________ the value of the function \([a + bx_1]\) is to the ________ value, \(y_1\); that is, it indicates how well the function fits at this one point.
[Figure: scatter diagram of y against x showing the line y = a + bx and the deviations d_1 through d_5.]

One measure of how well the function fits overall is the sum of the __________________:
\[ \sum_{i=1}^{5} d_i^2 \]
Consider a ________ model with n as opposed to five ____________. Since each \(d_i = y_i - (a + bx_i)\), the sum of the squared deviations can be written as:
\[ \sum_{i=1}^{n} (y_i - [a + bx_i])^2 \]
Using the method of __________, select a and b so as to minimize the sum in the equation above.

Now, take the __________ derivative of the sum with respect to a and set the resulting expression equal to ______.
\[ \sum_{i=1}^{n} -2\,(y_i - [a + bx_i]) = 0 \]
A second __________ is derived by following the same procedure with b.
\[ \sum_{i=1}^{n} -2x_i\,(y_i - [a + bx_i]) = 0 \]
Recall that the values for \(x_i\) and \(y_i\) are the ______________, and our goal is to find the values of a and b that satisfy these two equations.

The solution is:
\[ b = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2} \]
\[ a = \frac{1}{n} \sum_{i=1}^{n} y_i - b\,\frac{1}{n} \sum_{i=1}^{n} x_i \]
The next step is to determine the values for:
\[ \sum_{i=1}^{n} x_i^2, \quad \sum_{i=1}^{n} y_i, \quad \sum_{i=1}^{n} x_i, \quad \sum_{i=1}^{n} x_i y_i \]
Note that these _______ depend only on observed data and can be found with simple arithmetic ___________ or automatically using Excel’s predefined ___________.
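The closed-form solution can be checked outside Excel with a few lines of Python. This is only a sketch: the table of station averages is not reproduced in this transcript, so the five (cars/hour, sales/hour) pairs below are assumed values, chosen because they include the one stated point (150, 220) and are consistent with the coefficients reported later in the chapter. The function name `fit_line` is hypothetical.

```python
# ASSUMED data: the transcript omits the table of averages. These pairs are
# illustrative values that include the stated point (x_1, y_1) = (150, 220).
traffic = [150, 55, 220, 130, 95]   # average cars per hour
sales = [220, 75, 250, 145, 200]    # average dollar sales per hour


def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (sum xy - (1/n) sum x sum y) / (sum x^2 - (1/n) (sum x)^2)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = fit_line(traffic, sales)
print(f"a = {a:.3f}, b = {b:.5f}")
```

With the assumed data, the intercept and slope should land very close to the 57.104 and 0.92997 that the regression output reports.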

Using Excel, click on Tools – Data Analysis … In the resulting dialog, choose Regression.

In the __________dialog, enter the Y-range and X-range. Choose to place the _______in a new worksheet called Results Select ___________and Normal Probability Plots to be created along with the output.

Click OK to produce the following results: Note that a (Intercept) and b (X Variable 1) are reported as 57.104 and 0.92997, respectively.

To add the resulting ____________line, first click on the worksheet Chart 1 which contains the original_____________. Next, click on the ____________so that they are highlighted and then choose Add Trendline … from the Chart pull-down menu.

Choose Linear Trend in the resulting dialog and click OK.

A linear trend is fit to the data:

One of the other __________ output values that is given in Excel is: R Square = 69.4%. This is a “_________” measure which represents the \(R^2\) statistic discussed in introductory statistics classes. \(R^2\) ranges in value from __________ and gives an indication of how much of the total ________ in Y from its mean is explained by the new trend line. In fact, there are three different sums of errors:
TSS (________ Sum of Squares)
ESS (________ Sum of Squares)
RSS (________ Sum of Squares)

The basic relationship between them is:
\[ TSS = ESS + RSS \]
They are defined as follows:
\[ TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \]
\[ ESS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]
\[ RSS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \]
Essentially, the ____ is the amount of variation that can’t be explained by the ___________. The ____ quantity is effectively the amount of the ________, total variation (TSS) that could be removed using the regression line.

\(R^2\) is defined as:
\[ R^2 = \frac{RSS}{TSS} \]
If the regression line fits ________, then ESS = 0 and RSS = TSS, resulting in \(R^2 = 1\). In this example, \(R^2 = 0.694\), which means that approximately 70% of the variation in the Y values is explained by the one ____________ variable (X), cars per hour.
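The three sums of squares and the resulting R-squared can be reproduced with a short Python sketch. The data pairs are assumed (the transcript omits the original table), while the intercept and slope are the values reported in the regression output.

```python
# ASSUMED data: illustrative pairs, not taken from the transcript.
traffic = [150, 55, 220, 130, 95]   # average cars per hour
sales = [220, 75, 250, 145, 200]    # average dollar sales per hour

# Coefficients as reported in the Excel regression output.
a, b = 57.104, 0.92997

y_bar = sum(sales) / len(sales)
fitted = [a + b * x for x in traffic]

# Total variation of Y around its mean.
tss = sum((y - y_bar) ** 2 for y in sales)
# Variation left unexplained by the regression line.
ess = sum((y - f) ** 2 for y, f in zip(sales, fitted))
# Variation of the fitted values around the mean.
rss = sum((f - y_bar) ** 2 for f in fitted)

r_squared = rss / tss
print(f"TSS = {tss:.1f}, ESS = {ess:.1f}, RSS = {rss:.1f}")
print(f"R^2 = {r_squared:.3f}")
```

Because the coefficients are rounded, TSS and ESS + RSS agree only approximately rather than exactly.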

Now, returning to the original question: Should we build a station at Buffalo Grove where traffic is 183 cars/hour? The best guess at what the corresponding _____ volume would be is found by placing this X value into the new ____________ equation:
\[ \hat{y} = a + b \cdot x \]
Sales/hour = 57.104 + 0.92997 * (183 cars/hour) = $227.29
However, it would be nice to be able to state a _____ confidence interval around this best guess.

We can get the information to do this from Excel’s Summary Output. Excel reports that the ______________ (\(S_e\)) is 44.18. This quantity represents the amount of ______ in the actual data around the regression line. The formula for \(S_e\) is:
\[ S_e = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - k - 1}} \]
where n is the number of data points (e.g., 5) and k is the number of ___________ variables (e.g., 1).

This equation is __________ to:
\[ S_e = \sqrt{\frac{ESS}{n - k - 1}} \]
Once we know \(S_e\) and based on the ______ distribution, we can state that:
We have 68% confidence that the _____ value of sales/hour is within ±1 \(S_e\) of the predicted value ($227.29).
We have 95% confidence that the actual value of _____/hour is within ±2 \(S_e\) of the predicted value ($227.29).
The 95% ___________ interval is:
[227.29 – 2(44.18); 227.29 + 2(44.18)] = [$138.93; $315.65]
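The arithmetic behind the point forecast and its rough interval can be sketched in Python using only figures quoted in this chapter: the reported coefficients, the linear fit's sum of squared deviations (5854.7), n = 5, and k = 1. The variable names are illustrative.

```python
import math

# Values quoted in the chapter.
a, b = 57.104, 0.92997      # intercept and slope from the regression output
ess = 5854.7                # sum of squared deviations for the linear fit
n, k = 5, 1                 # data points and independent variables

# Point forecast for a site with 183 cars/hour.
forecast = a + b * 183

# Standard error of the estimate: sqrt(ESS / (n - k - 1)).
s_e = math.sqrt(ess / (n - k - 1))

# Rough intervals at plus or minus one and two standard errors.
interval_68 = (forecast - s_e, forecast + s_e)
interval_95 = (forecast - 2 * s_e, forecast + 2 * s_e)

print(f"forecast = {forecast:.2f}, S_e = {s_e:.2f}")
print(f"68% interval: [{interval_68[0]:.2f}; {interval_68[1]:.2f}]")
print(f"95% interval: [{interval_95[0]:.2f}; {interval_95[1]:.2f}]")
```

Small differences in the last digit against the slide's interval come from rounding the forecast and \(S_e\) before combining them.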

Another value of interest in the Summary report is the ____________ for the X variable and its associated values. The t-statistic is 2.61 and the ________ is 0.0798. A P-value less than 0.05 indicates that we have at least 95% confidence that the ____ parameter (b) is statistically significantly different from 0 (zero). A slope of __ results in a flat trend ______ and indicates no relationship between Y and X. The 95% confidence limit for b is [-0.205; 2.064]. Thus, we can’t _________ the possibility that the true value of b might be 0.

Also given in the Summary report is the _____________. Since there is only ____ independent variable, the F-significance is identical to the P-value for the t-statistic. In the case of more than one X variable, the F-significance tests the ___________ that all the X variable parameters as a group are statistically significantly different from zero.

Concerning multiple regression ________, as you add other X variables, the \(R^2\) statistic will always _______, meaning the RSS has increased. In this case, the _________ \(R^2\) statistic is a reliable __________ of the true goodness of fit because it compensates for the reduction in the ____ due to the addition of more independent variables. Thus, it may report a _________ adjusted \(R^2\) value even though \(R^2\) has increased, unless the improvement in ____ is more than compensated for by the __________ of the new independent variables.

Fitting a Quadratic Function
The method of least ________ can be used with any number of independent variables and with any _________ form (not just linear). Suppose that we wish to fit a _________ function of the form
\[ y = a_0 + a_1 x + a_2 x^2 \]
to the previous data with the method of least squares. The goal is to select \(a_0\), \(a_1\), and \(a_2\) in order to __________ the sum of squared deviations, which is now
\[ \sum_{i=1}^{5} (y_i - [a_0 + a_1 x_i + a_2 x_i^2])^2 \]

Proceed by setting the partial ____________ with respect to \(a_0\), \(a_1\), and \(a_2\) equal to ______. This gives the equations
\[ 5a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i \]
\[ \left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i \]
\[ \left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i \]
This is a simple set of three linear equations in three __________. Thus, the general name for this least squares curve fitting is “___________________.” The term _________ comes from the fact that simultaneous linear equations are being solved.
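Because the three equations are linear in the coefficients, they can also be solved directly rather than by numerical search. The sketch below builds the system from power sums and solves it with a small Gaussian elimination routine. The data pairs are assumed (the transcript omits the original table), and the helper names `solve_linear_system` and `power_sum` are hypothetical.

```python
# ASSUMED data: illustrative pairs, not taken from the transcript.
traffic = [150, 55, 220, 130, 95]   # average cars per hour
sales = [220, 75, 250, 145, 200]    # average dollar sales per hour


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * solution[j] for j in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def power_sum(xs, p):
    """Sum of x**p over the data."""
    return sum(x ** p for x in xs)


n = len(traffic)
s1, s2, s3, s4 = (power_sum(traffic, p) for p in (1, 2, 3, 4))
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(traffic, sales))
sum_x2y = sum(x * x * y for x, y in zip(traffic, sales))

# Coefficient matrix and right-hand side of the three equations above.
normal_matrix = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]]
normal_rhs = [sum_y, sum_xy, sum_x2y]
a0, a1, a2 = solve_linear_system(normal_matrix, normal_rhs)

sse = sum((y - (a0 + a1 * x + a2 * x * x)) ** 2 for x, y in zip(traffic, sales))
print(f"a0 = {a0:.4f}, a1 = {a1:.5f}, a2 = {a2:.7f}, SSE = {sse:.1f}")
```

With the assumed data, the sum of squared deviations should come out near the 4954 quoted for the quadratic fit later in the chapter.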

Solver will be used to find the coefficients in Excel. Consider the following worksheet:

Now, to find the ____________values for the parameters (a 0, a 1, and a 2 ) using________, first click on Tools – Solver.

In the resulting Solver Parameter dialog, specify the following settings: Click Solve to solve the____________, nonlinear optimization model. In this model, the objective function is to minimize the sum of_______________.

Here are the Solver results. The parameter values are: This formula calculates the sum of squared errors directly.

Use Excel’s Chart Wizard to plot the _______data and the resulting ___________function. First, highlight the original range of data, then click on the ______________button.

Use Excel’s Chart Wizard to plot the original data as a __________and specify a quadratic function via the Chart – Add Trendline … option.

Comparing the Linear and Quadratic Fits In the method of least squares, the _____of the squared deviations was selected as the measure of “______________.” Thus, the linear and quadratic fits can be compared with this___________. In order to make this comparison, go back to the linear regression “________” spreadsheet and make the corresponding calculation in the original “______” spreadsheet.

Note that the sum of the squared deviations for the ________function is indeed smaller than that for the ______function (i.e., 4954 < 5854.7). Indeed, the quadratic gives roughly a 15% __________in the sum of squared deviations. It follows then: the best quadratic function must be _______as good as the best linear function. A linear function is a special type of ________ function in which a 2 = 0.

If a quadratic function is at least as good as a linear function, why not choose a more ________ form, thereby getting an even better _____?
WHICH CURVE TO FIT?
In practice, _______ of the form (with only a single independent variable for illustrative purposes) are often suggested:
\[ y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \]
Such a function is called a _________ of degree n, and it represents a broad and flexible class of functions.
n = 2: quadratic
n = 3: cubic
n = 4: _______
…

One must proceed with __________ when fitting data with a ___________ function. For example, it is possible to find a (k – 1)-degree polynomial that will _________ fit k data points. To be more specific, suppose we have seven _________ observations, denoted
\[ (x_i, y_i), \quad i = 1, 2, \ldots, 7 \]
It is possible to find a ____________ polynomial
\[ y = a_0 + a_1 x + a_2 x^2 + \cdots + a_6 x^6 \]
that exactly passes through each of these seven data points.

A perfect fit gives ______for the sum of squared deviations. However, this is ________, for it does not imply much about the _________ value of the model for use in future forecasting.
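The exact-fit claim is easy to see numerically. The sketch below uses Lagrange interpolation, one standard way to write the (k – 1)-degree polynomial through k points, on five assumed data pairs (the transcript omits the original table), so the polynomial here has degree 4 rather than 6. The function name `lagrange_eval` is hypothetical.

```python
# ASSUMED data: illustrative pairs, not taken from the transcript.
traffic = [150, 55, 220, 130, 95]   # average cars per hour
sales = [220, 75, 250, 145, 200]    # average dollar sales per hour


def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique degree len(xs)-1 polynomial through (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = float(yi)
        for j, xj in enumerate(xs):
            if j != i:
                # Basis factor: equals 1 at x = xi and 0 at x = xj.
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Sum of squared deviations at the observed points.
sse = sum(
    (y - lagrange_eval(traffic, sales, x)) ** 2 for x, y in zip(traffic, sales)
)
print(f"sum of squared deviations = {sse}")

# Forecasts inside and outside the observed range of traffic values.
for x_new in (183, 300):
    print(x_new, round(lagrange_eval(traffic, sales, x_new), 1))
```

Comparing the forecasts at the two new traffic values with what a straight line would give is a quick way to see how differently the interpolating polynomial behaves once it leaves the observed range.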

Despite the ________of the polynomial function, the forecast is very_______. The linear fit might provide more __________forecasts. Also, note that the polynomial fit has __________ extrapolation properties (i.e., the polynomial “_________” at its extremes).

One way of finding which fit is truly “better” is to use a different standard of _____________, the “mean squared error” or MSE.
\[ MSE = \frac{\text{sum of squared errors}}{(\text{number of points}) - (\text{number of parameters})} \]
For the ___________, the number of parameters estimated is 2 (a, b):
MSE = 5854/(5 – 2) = 1951.3
For the quadratic fit:
MSE = 4954/(5 – 3) = 2477.0
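The comparison is a one-line calculation per model. A minimal Python sketch using the sums of squared errors quoted above (the function name is illustrative):

```python
def mean_squared_error(sse, n_points, n_params):
    """Sum of squared errors divided by (points - estimated parameters)."""
    return sse / (n_points - n_params)


# Linear fit: 2 parameters (a, b). Quadratic fit: 3 parameters (a0, a1, a2).
linear_mse = mean_squared_error(5854, 5, 2)
quadratic_mse = mean_squared_error(4954, 5, 3)

print(f"linear MSE = {linear_mse:.1f}, quadratic MSE = {quadratic_mse:.1f}")
```

The divisor shrinks as parameters are added, which is why a model with a smaller sum of squared errors can still end up with a larger MSE.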

So, the MSE gets ______in this case even though the total sum of squares will always be less or the same for a ___________fit. When there is a_________, both the total sum of squares and the MSE will be_____. Because of this, most forecasting programs will fit only up through a _____polynomial, since higher degrees don’t reflect the general trend of ______data.

What is a Good Fit? A good historical fit may have poor _______power. So what is a good fit? It depends on whether one has some idea about the _________real-world process that relates the y’s and x’s. To be an __________forecasting device, the forecasting function must to some extent capture important ________of that process. The more one knows, the _______one can do. However, knowledge of the underlying process is typically phrased in__________ language. For example, linear curve fitting, in the statistical context, is called______________.

If the statistical _____________about the linear regression model are precisely satisfied (e.g., errors are _________distributed around the regression line), then in a precise and well- defined sense, statisticians can prove that the linear fit is the “______possible fit.” In the real world one can never be completely certain about the ____________process. The question then becomes: How much ___________can we have that the underlying process is one that satisfies a particular set of statistical____________? Fortunately, statistical analysis can reveal how well the _________data do indeed satisfy those assumptions.

And if it does not satisfy the assumptions, then try a different________. Remember, there is an underlying real-world _________and the model is a selective ___________________of that problem. How good is that model? Ideally, to test the goodness of a model, one would like to have considerable ____________with its use. If, in repeated use, it is observed that the model performs well, then our confidence is________. However, what confidence can we have at the outset, without experience?

Validating Models One_________, is to ask the question: Suppose the model had been used to make past decisions; how well would the firm have fared? This approach “creates” experience by ________ the past. This is often referred to as _________of the model. Typically, one uses only a ______of the historical data to create the model – for example, to fit a polynomial of a specified degree. One can then use the remaining _____to see how well the model would have performed.
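The idea of holding back part of the history can be sketched in Python for the linear model. The data pairs are assumed (the transcript omits the original table), the split into four fitting points and one held-out point is an arbitrary choice for illustration, and the function names are hypothetical.

```python
# ASSUMED data: illustrative pairs, not taken from the transcript.
traffic = [150, 55, 220, 130, 95]   # average cars per hour
sales = [220, 75, 250, 145, 200]    # average dollar sales per hour


def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n
    return a, b


def holdout_errors(xs, ys, n_fit):
    """Fit on the first n_fit points, return forecast errors on the rest."""
    a, b = fit_line(xs[:n_fit], ys[:n_fit])
    return [y - (a + b * x) for x, y in zip(xs[n_fit:], ys[n_fit:])]


# Fit with four stations, then check the forecast for the fifth.
errors = holdout_errors(traffic, sales, n_fit=4)
print([round(e, 2) for e in errors])
```

The held-out errors show how the model would have performed on data it never saw, which is the kind of "created" experience described above.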

End of Part 1 Please continue to Part 2
