Published by Jerome Little. Modified over 9 years ago.
1
DECISION MODELING WITH MICROSOFT EXCEL Chapter 13 Copyright 2001 Prentice Hall Publishers and Ardith E. Baker Part 1
2
Many important decisions made by individuals and organizations crucially depend on an assessment of the_________. There are a few “____” sayings that illustrate the promise and frustration of forecasting: “It is difficult to_________, especially in regards to the future.” “It isn’t difficult to forecast, just to forecast ___________.” “_______, if tortured enough, will confess to just about anything.”
3
Forecasting is playing an increasingly important role in the_______________. Economic forecasts__________ Government policies and business decisions Insurance companies’ ___________decisions in mortgages and bonds Service industries’ (such as airlines, hotels, rental cars, cruise lines, etc.) forecasts of _______as input for revenue management There is clearly a steady __________in the use of quantitative forecasting models at many levels in industry and government. The many types of forecasting models will be distributed into two major techniques: ___________and____________
4
____________forecasting models possess two important and attractive features: 1. They are expressed in mathematical ________. Thus, they establish an unambiguous record of how the forecast is made. 2. With the use of _______________and computers, quantitative models can be based on an amazing quantity of data. Two types of quantitative forecasting models that will be discussed in the next two sections are: ________models and _________models
5
In a _______forecasting model, the forecast for the quantity of interest “rides piggyback” on another quantity or set of quantities. In other words, our ________of the value of one variable (or perhaps several variables) enables us to forecast the value of another___________. In this model, let \(y\) denote the _________of some variable of interest and \(\hat{y}\) denote a predicted or _________value for that variable.
6
Then, in a causal model,
\[ \hat{y} = f(x_1, x_2, \ldots, x_n) \]
where f is a forecasting________, or function, and \(x_1, x_2, \ldots, x_n\) is a set of variables. In this representation, the x variables are often called _________variables, whereas \(\hat{y}\) is the dependent or __________variable. We either _______the independent variables in advance or can forecast them more easily than \(\hat{y}\). Then the independent variables will be used in the forecasting model to forecast the __________ variable.
7
Companies often find by looking at past __________that their monthly sales are directly related to the monthly______, and thus figure that a good forecast could be made using next month’s GDP figure. The only problem is that this quantity is not _______, or it may just be a forecast and thus not a truly independent___________. Using a causal forecasting model requires two conditions: 1. There must be a ___________between values of the independent and dependent variables such that the former provides ____________about the latter.
8
2. The _______for the independent variables must be known and available to the forecaster at the ____the forecast is made. Simply because there is a mathematical relationship does not ___________that there is really cause and effect. One commonly used approach in creating a causal forecasting model is called_____________.
CURVE FITTING: AN OIL COMPANY EXPANSION
Consider an oil company that is planning to expand its _________of modern self-service gasoline stations.
9
The company plans to use __________(measured in the average number of cars per hour) to forecast ______(measured in average dollar sales per hour). The firm has had five stations in operation for more than a year and has used _________data to calculate the following averages:
10
The averages are plotted in a scatter diagram.
11
Now, these data will be used to construct a _________that will be used to forecast sales at any proposed location by measuring the traffic flow at that ________and plugging its value into the constructed function.
Least Squares Fits
The method of __________is a formal procedure for curve fitting. It is a two-step process.
1. Select a specific functional form (e.g., a ___________or quadratic curve).
2. Within the set of functions specified in step 1, choose the specific function that __________the sum of the squared deviations between the data points and the function___________.
12
To demonstrate the process, consider the sales-traffic flow example.
1. Assume a _______line; that is, functions of the form \(y = a + bx\).
2. Draw the line in the ____________and indicate the __________between observed points and the function as \(d_i\).
For example,
\[ d_1 = y_1 - [a + b x_1] = 220 - [a + 150b] \]
where \(y_1\) = actual sales/hr at location 1, \(x_1\) = actual traffic flow at location 1, \(a\) = y-axis intercept for the function, and \(b\) = slope for the function.
13
The value \(d_1^2\) is one measure of __________the value of the function \([a + b x_1]\) is to the ________ value, \(y_1\); that is, it indicates how well the function fits at this one point.
[Figure: scatter diagram showing the line y = a + bx and the deviations d_1 through d_5 between each data point and the line; axes x and y.]
14
One measure of how well the function fits overall is the sum of the __________________:
\[ \sum_{i=1}^{5} d_i^2 \]
Consider a ________model with n as opposed to five____________. Since each \(d_i = y_i - (a + b x_i)\), the sum of the squared deviations can be written as:
\[ \sum_{i=1}^{n} (y_i - [a + b x_i])^2 \]
Using the method of__________, select a and b so as to minimize the sum in the equation above.
15
Now, take the __________derivative of the sum with respect to a and set the resulting expression equal to______.
\[ \sum_{i=1}^{n} -2\,(y_i - [a + b x_i]) = 0 \]
A second __________is derived by following the same procedure with b.
\[ \sum_{i=1}^{n} -2\,x_i\,(y_i - [a + b x_i]) = 0 \]
Recall that the values for \(x_i\) and \(y_i\) are the ______________, and our goal is to find the values of a and b that satisfy these two equations.
16
The solution is:
\[ b = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2} \]
\[ a = \frac{1}{n} \sum_{i=1}^{n} y_i - b \, \frac{1}{n} \sum_{i=1}^{n} x_i \]
The next step is to determine the values for:
\[ \sum_{i=1}^{n} x_i^2, \quad \sum_{i=1}^{n} y_i, \quad \sum_{i=1}^{n} x_i, \quad \sum_{i=1}^{n} x_i y_i \]
Note that these _______depend only on observed data and can be found with simple arithmetic ___________or automatically using Excel’s predefined___________.
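The closed-form solution for a and b can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides. The five (traffic, sales) pairs are an assumption: the data table is not reproduced in this transcript and only the first point (150, 220) appears in the text. The remaining values were chosen to be consistent with the intercept and slope that Excel reports later in the deck. The function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x from the closed-form sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Slope: (sum of x*y - (1/n) * sum x * sum y) / (sum of x^2 - (1/n) * (sum x)^2)
    b = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x
    a = sum_y / n - b * sum_x / n
    return a, b


# ASSUMED data: only the first pair (150, 220) is given in the slides.
traffic = [150, 55, 220, 130, 95]   # average cars per hour
sales = [220, 75, 250, 145, 200]    # average dollar sales per hour

a, b = least_squares_line(traffic, sales)
print(round(a, 3), round(b, 5))
```

With these assumed values the result agrees with the intercept and slope reported in the Excel regression output to the precision shown there.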
17
Using Excel, click on Tools – Data Analysis … In the resulting dialog, choose Regression.
18
In the __________dialog, enter the Y-range and X-range. Choose to place the _______in a new worksheet called Results. Select ___________and Normal Probability Plots to be created along with the output.
19
Click OK to produce the following results: Note that a (Intercept) and b (X Variable 1) are reported as 57.104 and 0.92997, respectively.
20
To add the resulting ____________line, first click on the worksheet Chart 1 which contains the original_____________. Next, click on the ____________so that they are highlighted and then choose Add Trendline … from the Chart pull-down menu.
21
Choose Linear Trend in the resulting dialog and click OK.
22
A linear trend is fit to the data:
23
One of the other __________output values that is given in Excel is: R Square = 69.4%. This is a “_________” measure which represents the \(R^2\) statistic discussed in introductory statistics classes. \(R^2\) ranges in value from __________and gives an indication of how much of the total ________in Y from its mean is explained by the new trend line. In fact, there are three different sums of errors: TSS (________Sum of Squares), ESS (________Sum of Squares), RSS (________Sum of Squares).
24
The basic relationship between them is: TSS = ESS + RSS. They are defined as follows:
\[ TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \]
\[ ESS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]
\[ RSS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \]
Essentially, the ____is the amount of variation that can’t be explained by the___________. The ____quantity is effectively the amount of the ________, total variation (TSS) that could be removed using the regression line.
25
\(R^2\) is defined as:
\[ R^2 = \frac{RSS}{TSS} \]
If the regression line fits________, then ESS = 0 and RSS = TSS, resulting in \(R^2 = 1\). In this example, \(R^2 = 0.694\), which means that approximately 70% of the variation in the Y values is explained by the one ____________ variable (X), cars per hour.
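The decomposition TSS = ESS + RSS and the ratio RSS/TSS can be checked numerically. The sketch below follows the slides' naming, in which ESS is the sum of squared errors around the fitted line and RSS is the part explained by it. The data values are an assumption (only the pair (150, 220) appears in the text), and `sums_of_squares` is a hypothetical helper name. The intercept and slope are the rounded values from the Excel output, so the identity holds only up to rounding.

```python
def sums_of_squares(xs, ys, a, b):
    """Return (TSS, ESS, RSS) for the fitted line y_hat = a + b*x."""
    mean_y = sum(ys) / len(ys)
    fitted = [a + b * x for x in xs]
    tss = sum((y - mean_y) ** 2 for y in ys)                 # total variation
    ess = sum((y - f) ** 2 for y, f in zip(ys, fitted))      # unexplained (error)
    rss = sum((f - mean_y) ** 2 for f in fitted)             # explained (regression)
    return tss, ess, rss


# ASSUMED data: only the first pair (150, 220) is given in the slides.
traffic = [150, 55, 220, 130, 95]
sales = [220, 75, 250, 145, 200]

# Rounded coefficients as reported in the Excel regression output.
a, b = 57.104, 0.92997

tss, ess, rss = sums_of_squares(traffic, sales, a, b)
r_squared = rss / tss
print(round(tss, 1), round(ess, 1), round(rss, 1), round(r_squared, 3))
```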
26
Now, returning to the original question: Should we build a station at Buffalo Grove where traffic is 183 cars/hour? The best guess at what the corresponding _____ volume would be is found by placing this X value into the new ____________equation \(\hat{y} = a + b \cdot x\): Sales/hour = 57.104 + 0.92997 * (183 cars/hour) = $227.29. However, it would be nice to be able to state a _____confidence interval around this best guess.
27
We can get the information to do this from Excel’s Summary Output. Excel reports that the ______________(\(S_e\)) is 44.18. This quantity represents the amount of ______in the actual data around the regression line. The formula for \(S_e\) is:
\[ S_e = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - k - 1}} \]
where n is the number of data points (e.g., 5) and k is the number of ___________variables (e.g., 1).
28
This equation is __________to:
\[ S_e = \sqrt{\frac{ESS}{n - k - 1}} \]
Once we know \(S_e\) and based on the ______ distribution, we can state that:
We have 68% confidence that the _____ value of sales/hour is within ± 1 \(S_e\) of the predicted value ($227.29).
We have 95% confidence that the actual value of _____/hour is within ± 2 \(S_e\) of the predicted value ($227.29).
The 95% ___________interval is: [227.29 – 2(44.18); 227.29 + 2(44.18)] = [$138.93; $315.65]
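The arithmetic behind the standard error and the two-standard-error band can be reproduced as follows. This is a small sketch using only figures quoted in the slides (intercept 57.104, slope 0.92997, S_e of 44.18, and the linear fit's sum of squared deviations, 5854.7). The function names are hypothetical.

```python
import math


def standard_error(ess, n, k):
    """S_e = sqrt(ESS / (n - k - 1)) for n points and k independent variables."""
    return math.sqrt(ess / (n - k - 1))


def forecast_band(a, b, x, s_e, width=2):
    """Return the point forecast a + b*x and the band of +/- width * s_e around it."""
    y_hat = a + b * x
    return y_hat, (y_hat - width * s_e, y_hat + width * s_e)


# Standard error recomputed from the sum of squared deviations of the linear fit.
s_e = standard_error(5854.7, n=5, k=1)

# Forecast for 183 cars/hour, using the S_e value reported by Excel.
y_hat, (low, high) = forecast_band(57.104, 0.92997, 183, 44.18)
print(round(s_e, 2), round(y_hat, 2), round(low, 2), round(high, 2))
```

Passing `width=1` gives the narrower band that the slides associate with 68% confidence.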
29
Another value of interest in the Summary report is the ____________for the X variable and its associated values. The t-statistic is 2.61 and the ________is 0.0798. A P-value less than 0.05 indicates at least 95% confidence that the ____parameter (b) is statistically significantly different from 0 (zero). A slope of __results in a flat trend ______and indicates no relationship between Y and X. The 95% confidence limit for b is [-0.205; 2.064]. Thus, we can’t _________the possibility that the true value of b might be 0.
30
Also given in the Summary report is the _____________. Since there is only ____ independent variable, the F-significance is identical to the P-value for the t-statistic. In the case of more than one X variable, the F-significance tests the ___________that all the X variable parameters as a group are statistically significantly different from zero.
31
Concerning multiple regression________, as you add other X variables, the \(R^2\) statistic will always _______, meaning the RSS has increased. In this case, the _________ \(R^2\) statistic is a reliable __________of the true goodness of fit because it compensates for the reduction in the ____due to the addition of more independent variables. Thus, it may report a _________adjusted \(R^2\) value even though \(R^2\) has increased, unless the improvement in ____is more than compensated for by the __________of the new independent variables.
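The slides do not state the formula for adjusted R². The sketch below assumes the standard textbook definition, 1 - (1 - R²)(n - 1)/(n - k - 1), and applies it to the example's R² of 0.694 with n = 5 points and k = 1 independent variable. The function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 under the standard definition (an assumption here):
    1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Example values taken from the slides: R^2 = 0.694, five stations, one X variable.
adj = adjusted_r_squared(0.694, n=5, k=1)
print(round(adj, 3))

# Adding a second X variable that leaves R^2 unchanged lowers the adjusted value.
adj_two_vars = adjusted_r_squared(0.694, n=5, k=2)
print(round(adj_two_vars, 3))
```

The second call illustrates the penalty: with the same R² and one more independent variable, the adjusted statistic drops.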
32
Fitting a Quadratic Function
The method of least ________can be used with any number of independent variables and with any _________ form (not just linear). Suppose that we wish to fit a _________function of the form \(y = a_0 + a_1 x + a_2 x^2\) to the previous data with the method of least squares. The goal is to select \(a_0\), \(a_1\), and \(a_2\) in order to __________the sum of squared deviations, which is now
\[ \sum_{i=1}^{5} (y_i - [a_0 + a_1 x_i + a_2 x_i^2])^2 \]
33
Proceed by setting the partial ____________with respect to \(a_0\), \(a_1\), and \(a_2\) equal to______. This gives the equations
\[ 5 a_0 + \left( \sum x_i \right) a_1 + \left( \sum x_i^2 \right) a_2 = \sum y_i \]
\[ \left( \sum x_i \right) a_0 + \left( \sum x_i^2 \right) a_1 + \left( \sum x_i^3 \right) a_2 = \sum x_i y_i \]
\[ \left( \sum x_i^2 \right) a_0 + \left( \sum x_i^3 \right) a_1 + \left( \sum x_i^4 \right) a_2 = \sum x_i^2 y_i \]
This is a simple set of three linear equations in three__________. Thus, the general name for this least squares curve fitting is “___________________.” The term _________comes from the fact that simultaneous linear equations are being solved.
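The three simultaneous equations can be solved directly. The sketch below builds them from the power sums and solves the 3x3 system by Gaussian elimination in plain Python. This differs from the slides, which use Excel's Solver to minimize the sum of squared errors numerically, although both approaches target the same minimum. The data values are an assumption (only the pair (150, 220) appears in the text), and the function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * solution[j] for j in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_quadratic(xs, ys):
    """Return [a0, a1, a2] minimizing the sum of (y - a0 - a1*x - a2*x^2)^2."""
    s = [sum(x ** p for x in xs) for p in range(5)]                    # s[p] = sum of x^p
    t = [sum((x ** p) * y for x, y in zip(xs, ys)) for p in range(3)]  # t[p] = sum of x^p * y
    matrix = [[s[0], s[1], s[2]],
              [s[1], s[2], s[3]],
              [s[2], s[3], s[4]]]
    return solve_linear_system(matrix, t)


# ASSUMED data: only the first pair (150, 220) is given in the slides.
traffic = [150, 55, 220, 130, 95]
sales = [220, 75, 250, 145, 200]

a0, a1, a2 = fit_quadratic(traffic, sales)
sse = sum((y - (a0 + a1 * x + a2 * x * x)) ** 2 for x, y in zip(traffic, sales))
print(a0, a1, a2, round(sse, 1))
```

With the assumed data, the resulting sum of squared errors is close to the 4954 quoted later for the quadratic fit.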
34
Solver will be used to find the coefficients in Excel. Consider the following worksheet:
35
Now, to find the ____________values for the parameters (\(a_0\), \(a_1\), and \(a_2\)) using________, first click on Tools – Solver.
36
In the resulting Solver Parameter dialog, specify the following settings: Click Solve to solve the____________, nonlinear optimization model. In this model, the objective function is to minimize the sum of_______________.
37
Here are the Solver results. The parameter values are: This formula calculates the sum of squared errors directly.
38
Use Excel’s Chart Wizard to plot the _______data and the resulting ___________function. First, highlight the original range of data, then click on the ______________button.
39
Use Excel’s Chart Wizard to plot the original data as a __________and specify a quadratic function via the Chart – Add Trendline … option.
40
Comparing the Linear and Quadratic Fits In the method of least squares, the _____of the squared deviations was selected as the measure of “______________.” Thus, the linear and quadratic fits can be compared with this___________. In order to make this comparison, go back to the linear regression “________” spreadsheet and make the corresponding calculation in the original “______” spreadsheet.
41
Note that the sum of the squared deviations for the ________function is indeed smaller than that for the ______function (i.e., 4954 < 5854.7). Indeed, the quadratic gives roughly a 15% __________in the sum of squared deviations. It follows then: the best quadratic function must be _______as good as the best linear function. A linear function is a special type of ________ function in which \(a_2 = 0\).
42
WHICH CURVE TO FIT?
If a quadratic function is at least as good as a linear function, why not choose a more ________ form, thereby getting an even better_____? In practice, _______of the form (with only a single independent variable for illustrative purposes) are often suggested:
\[ y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \]
Such a function is called a _________of degree n, and it represents a broad and flexible class of functions.
n = 2: quadratic
n = 3: cubic
n = 4: _______
…
43
One must proceed with __________when fitting data with a ___________function. For example, it is possible to find a (k – 1)-degree polynomial that will _________fit k data points. To be more specific, suppose we have seven _________observations, denoted \((x_i, y_i)\), \(i = 1, 2, \ldots, 7\). It is possible to find a ____________polynomial
\[ y = a_0 + a_1 x + a_2 x^2 + \cdots + a_6 x^6 \]
that exactly passes through each of these seven data points.
44
A perfect fit gives ______for the sum of squared deviations. However, this is ________, for it does not imply much about the _________ value of the model for use in future forecasting.
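The claim that a degree-6 polynomial can pass exactly through seven points can be illustrated with Lagrange interpolation. Note that this is a different construction from least squares: it builds the unique interpolating polynomial directly. The seven observations below are hypothetical and chosen only for illustration.

```python
def lagrange_polynomial(points):
    """Return a function that evaluates the unique polynomial of degree
    len(points) - 1 passing through every (x, y) pair in points."""
    def evaluate(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return evaluate


# HYPOTHETICAL observations (period, value); not taken from the slides.
points = [(1, 12.0), (2, 15.0), (3, 11.0), (4, 18.0),
          (5, 16.0), (6, 21.0), (7, 19.0)]

poly = lagrange_polynomial(points)

# The sum of squared deviations at the observed points is zero (a perfect fit).
sse = sum((y - poly(x)) ** 2 for x, y in points)

# Extrapolating one period ahead shows how far the curve can stray from the data.
forecast = poly(8)
print(sse, forecast)
```

Comparing `forecast` with the range of the observed values gives a sense of why a perfect historical fit says little about forecasting value.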
45
Despite the ________of the polynomial function, the forecast is very_______. The linear fit might provide more __________forecasts. Also, note that the polynomial fit has __________ extrapolation properties (i.e., the polynomial “_________” at its extremes).
46
One way of finding which fit is truly “better” is to use a different standard of_____________, the “mean squared error” or MSE.
\[ MSE = \frac{\text{sum of squared errors}}{\text{\# of points} - \text{\# of parameters}} \]
For the___________, the number of parameters estimated is 2 (a, b): MSE = 5854/(5 – 2) = 1951.3
For the quadratic fit: MSE = 4954/(5 – 3) = 2477.0
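The two MSE figures can be reproduced with a one-line function. This sketch uses only the sums of squared errors and parameter counts quoted on the slide. The function name is hypothetical.

```python
def mean_squared_error(sse, n_points, n_params):
    """MSE = sum of squared errors / (number of points - number of parameters)."""
    return sse / (n_points - n_params)


mse_linear = mean_squared_error(5854, n_points=5, n_params=2)      # parameters a, b
mse_quadratic = mean_squared_error(4954, n_points=5, n_params=3)   # parameters a0, a1, a2

print(round(mse_linear, 1), round(mse_quadratic, 1))
```

Although the quadratic has the smaller sum of squared errors, dividing by the remaining degrees of freedom reverses the ranking.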
47
So, the MSE gets ______in this case even though the total sum of squares will always be less or the same for a ___________fit. When there is a_________, both the total sum of squares and the MSE will be_____. Because of this, most forecasting programs will fit only up through a _____polynomial, since higher degrees don’t reflect the general trend of ______data.
48
What is a Good Fit? A good historical fit may have poor _______power. So what is a good fit? It depends on whether one has some idea about the _________real-world process that relates the y’s and x’s. To be an __________forecasting device, the forecasting function must to some extent capture important ________of that process. The more one knows, the _______one can do. However, knowledge of the underlying process is typically phrased in__________ language. For example, linear curve fitting, in the statistical context, is called______________.
49
If the statistical _____________about the linear regression model are precisely satisfied (e.g., errors are _________distributed around the regression line), then in a precise and well-defined sense, statisticians can prove that the linear fit is the “______possible fit.” In the real world one can never be completely certain about the ____________process. The question then becomes: How much ___________can we have that the underlying process is one that satisfies a particular set of statistical____________? Fortunately, statistical analysis can reveal how well the _________data do indeed satisfy those assumptions.
50
And if it does not satisfy the assumptions, then try a different________. Remember, there is an underlying real-world _________and the model is a selective ___________________of that problem. How good is that model? Ideally, to test the goodness of a model, one would like to have considerable ____________with its use. If, in repeated use, it is observed that the model performs well, then our confidence is________. However, what confidence can we have at the outset, without experience?
51
Validating Models One_________, is to ask the question: Suppose the model had been used to make past decisions; how well would the firm have fared? This approach “creates” experience by ________ the past. This is often referred to as _________of the model. Typically, one uses only a ______of the historical data to create the model – for example, to fit a polynomial of a specified degree. One can then use the remaining _____to see how well the model would have performed.
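The idea of fitting on part of the history and checking against the rest can be sketched as follows. This is a minimal illustration with a straight-line model. The history values and the function names are hypothetical, and the split point is an arbitrary choice.

```python
def fit_line(xs, ys):
    """Least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def holdout_mse(xs, ys, n_fit):
    """Fit on the first n_fit observations, then return the mean squared
    forecast error on the remaining (held-out) observations."""
    a, b = fit_line(xs[:n_fit], ys[:n_fit])
    errors = [(y - (a + b * x)) ** 2 for x, y in zip(xs[n_fit:], ys[n_fit:])]
    return sum(errors) / len(errors)


# HYPOTHETICAL history: period number and observed value.
periods = [1, 2, 3, 4, 5, 6, 7, 8]
values = [10.2, 12.1, 13.8, 16.3, 17.9, 20.4, 21.7, 24.1]

# Build the model on the first five periods, test it on the last three.
print(holdout_mse(periods, values, n_fit=5))
```

A small held-out error suggests the model would have forecast the later periods well. Repeating the check with different split points or model forms is one way to compare candidates.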
52
End of Part 1 Please continue to Part 2