Published by Cody Armstrong. Modified over 9 years ago.
1
1 1 ISyE 6203 Variability Basics John H. Vande Vate Fall 2011
2
2 2 Agenda Forecasting Assignable-Cause vs Common variation Review of Probability Review of Regression Forecasting
3
3 3 Forecasting is the effort to determine what we can about the future from the past. We will focus on Quantitative Methods –i.e., not opinions, judgments, “markets”, etc.
4
4 4 Laws of Forecasting Law 1: Forecasts are wrong Law 2: Forecast Demand not Sales Law 3: It is generally easier to forecast aggregate data than it is to forecast the details. (Big Idea) Law 4: It is generally easier to forecast a short time into the future than to forecast far into the future Law 5: Simpler forecasts are generally better forecasts
5
5 5 General Framework Future = f(Past) + Residual Error The Specifics –What aspects of the past are relevant –What form of f to use The Issues –Accuracy: Is E[Residual Error] ~ 0? –Precision: Is Residual Error small? –Complexity and Cost!
6
6 6 Variability Residual Error is the “noise” or unpredictable variation Our focus is on managing this What tools are available for managing unpredictable variation?
7
7 7 Types of Variability Predictable or Assignable Cause Variations –Weekends and holidays –Major scheduled events, promotions –Seasonality & Growth –Other Causal relationships Unpredictable or Common Cause Variations –Inherent randomness
8
8 8 Types of Variability Predictable or Assignable Cause Variations Unpredictable or Common Cause Variations Avoid unnecessary variability Trade-off of capturing predictable variation while managing complexity
9
9 9 Review of Probability Random Variable: A function that associates a value with each point in a sample space. Two Random Variables X and Y are independent if and only if for any two subsets A and B of their sample spaces, P(X ∈ A and Y ∈ B) = P(X ∈ A)P(Y ∈ B) Mean: E[X] = μ_X Variance: Var[X] = E[(X − E[X])²]
10
10 Review of Probability Covariance: Cov(X,Y) = E[(X − E[X])(Y − E[Y])] If X and Y are independent, Cov(X,Y) = 0 Correlation Coefficient ρ_XY = Cov(X,Y)/(σ_X σ_Y) ranges between -1 and 1 If ρ_XY = 0, X and Y are uncorrelated If X and Y are independent, they are uncorrelated. Uncorrelated random variables need not be independent. More details at http://en.wikipedia.org/wiki/Correlation_and_dependence
11
11 Uncorrelated vs Independent If X and Y are independent, they are uncorrelated. Uncorrelated random variables need not be independent. Example: –X is Uniformly distributed on [-1,1] –Y is X² E[X] = 0 E[Y] =
12
12 Uncorrelated vs Independent If X and Y are independent, they are uncorrelated. Uncorrelated random variables need not be independent. Example: –X is Uniformly distributed on [-1,1] –Y is X² Cov(X,Y) =
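The blanks in this example can be worked out directly. A short derivation, using only the definitions above and the density 1/2 of the uniform distribution on [-1,1]:

```latex
\begin{align*}
E[X] &= \int_{-1}^{1} x \cdot \tfrac{1}{2}\,dx = 0, \\
E[Y] &= E[X^2] = \int_{-1}^{1} x^2 \cdot \tfrac{1}{2}\,dx = \tfrac{1}{3}, \\
\operatorname{Cov}(X,Y) &= E[XY] - E[X]\,E[Y] = E[X^3] - 0 \cdot \tfrac{1}{3} = 0,
\end{align*}
```

since E[X^3] = 0 by symmetry. So X and Y are uncorrelated, yet Y is completely determined by X, so they are not independent.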
13
13 Covariance and Variance For a constant α:
E[αX] = αE[X]
Var[αX] = σ²_{αX} = α²σ²_X
Stdev[αX] = σ_{αX} = |α|σ_X
E[X + Y] = E[X] + E[Y]
Var[X + Y] = σ²_{X+Y} = σ²_X + σ²_Y + 2*Cov(X,Y)
Stdev[X + Y] = σ_{X+Y} = Sqrt(σ²_X + σ²_Y + 2*Cov(X,Y))
14
14 The Big Idea When combining random variables, some of the “noise” cancels. How much cancels depends on the correlation.
15
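15 Stdev[X + Y] Var[X+Y] = σ²_X + σ²_Y + 2ρ_XY σ_X σ_Y, where ρ_XY is the correlation coefficient. If ρ_XY = 1, Var[X+Y] = (σ_X + σ_Y)², so Stdev[X+Y] = σ_X + σ_Y: no reduction in variation. If ρ_XY < 1, Stdev[X+Y] < σ_X + σ_Y: reductions in variation.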
16
16 The Big Idea When combining random variables, some of the “noise” cancels. How much cancels depends on the correlation.
17
17 Noise Canceling X₁, X₂ independent, identically distributed rvs
Var[2X₁] = 4σ²_X
Stdev[2X₁] = 2σ_X
Var[X₁ + X₂] = 2σ²_X + 2*Cov(X₁,X₂) = 2σ²_X
Stdev[X₁ + X₂] = Sqrt(2)σ_X ≈ 1.41σ_X
About 30% of the variability canceled
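A quick simulation can make the "about 30%" figure concrete. This is a minimal sketch, assuming normally distributed draws (the slide only requires independent, identically distributed variables); the function name and parameters are illustrative, not from the course.

```python
import math
import random
import statistics


def noise_cancel_demo(n_samples=50_000, sigma=1.0, seed=0):
    """Compare Stdev[2*X1] with Stdev[X1 + X2] for i.i.d. X1, X2.

    Assumption: normal draws are used only for convenience; any i.i.d.
    distribution with finite variance gives the same ratio in the limit.
    """
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
    x2 = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
    doubled = [2.0 * a for a in x1]             # same noise counted twice
    summed = [a + b for a, b in zip(x1, x2)]    # two independent noises
    return statistics.pstdev(doubled), statistics.pstdev(summed)


sd_doubled, sd_summed = noise_cancel_demo()
fraction_canceled = 1.0 - sd_summed / sd_doubled
theoretical_fraction = 1.0 - math.sqrt(2.0) / 2.0   # about 0.29
print(f"simulated: {fraction_canceled:.3f}, theory: {theoretical_fraction:.3f}")
```

The simulated fraction should land close to 1 − Sqrt(2)/2, which is the source of the "about 30%" on the slide.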
18
18 The Big Idea? That’s a “Big Idea”? So what?! Where’s the beef?
19
19 Exam Review Rules of the game: –I probably made mistakes grading. –If you think I made mistakes grading your exam, write a brief note indicating what you would like me to review, turn it in. I will review it.
20
20 Results Average ~71 Std Dev ~18 ABC
21
21 Question 1
22
22 Question 1
23
23 Question 2 Where? –Indianapolis Why? –Co-locating the cross dock and the plant eliminates $1.2 million in cycle inventory of monitors saving us $222 thousand annually –Other reasons What I wanted you to see –We have a different model of cost if the cross dock is at a plant –So check those locations separately
24
24 Question 3 Admittedly challenging. To simplify discussion assume the context in which the number and locations of cross docks and pools is fixed. We’ll address the case where we can close cross docks and pools subsequently
25
25 What we have to do Manage flows of components to the cross docks – that’s no longer the same in every solution Balance the flows of components into and products out of the cross docks Manage the flows of finished goods from the cross docks to the pools Balance the flows of each product at each pool Manage the assignments of stores to pools Manage the single sourcing of products to the pools Account for trucks between the cross docks and the pools Manage the frequency requirements (for each product) to the pools
26
26 Flows of Components We can identify the components and the plants CompFlow(i,j) to represent the units of component made at plant i shipped to cross dock j each week. Non-negative Multiply by $1*Distance between plant i and cross dock j*weight of product made at plant i and divide by 30,000 lbs to get trucking cost to the cross dock.
27
27 Balance at Cross Docks We have to balance the flow of each component i to each cross dock j with the volumes of finished products assembled there. Let Assemble(p,j) be the units of finished product p assembled at cross dock j each week. Non-negative Let Recipe(p,i) be the number of components from plant i in a unit of finished product p How to express the balance?
28
28 Balance Assemble(p,j)*Recipe(p,i) = units of component made at plant i needed to produce finished product p at cross dock j. Sum over the finished products to get the total units of component made at plant i needed at cross dock j
29
29 Flow of Finished Goods Let’s call the units of finished product p from cross dock j to pool k each week, ProdToPool(p, j, k) We want all the assembled product p at cross dock j to flow out to pools
30
30 Balance Flow at Pools Let’s use Assign(s,k) to indicate whether or not store s is assigned to pool k The demand for product p at each store each week is StoreDemand(p) The demand for product p at pool k implied by the assignments is The total units of product p available to meet this demand at pool k is
31
31 Balance Flow at Pools These should balance for each product at each pool
32
32 Assignments of Stores to Pools Assign each store s to one and only one pool So far everything is pretty standard.
33
33 Manage Single Sourcing at Pools Each pool k should get all of each product p from one and only one cross dock Let Source(p, j, k) indicate whether or not pool k sources product p from cross dock j Single Sourcing
34
34 But it’s a bit more complicated We don’t know the demand at the pool except as a function of the Assign variables so we need to shut down ProdToPool variables based on the Source decisions. To do this we need an upper bound on ProdToPool. Let TotalDemand(p) be the total demand each week for product p across all stores For each product p, cross dock j and pool k:
35
35 Trucks from Cross Dock to Pool Since there is a frequency requirement for serving the pools, we must keep track of the trucks we send each week We know the weight moving from each cross dock j to each pool k each week: So,
36
36 Frequency requirement At least one truck to pool k from each cross dock j for each cross dock that pool k sources SOME product from For each pool k, product p and cross dock j
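The formulas on slides 28 through 36 did not survive extraction. The following is one plausible reading of the constraints the slides describe, written with the slides' own variable names. Weight(p) (pounds per unit of product p) and Trucks(j,k) (trucks per week from cross dock j to pool k) are hypothetical names introduced here; the 30,000 lb truck capacity is taken from slide 26.

```latex
\begin{align*}
\text{CompFlow}(i,j) &= \sum_{p} \text{Recipe}(p,i)\,\text{Assemble}(p,j) && \forall i,j \\
\text{Assemble}(p,j) &= \sum_{k} \text{ProdToPool}(p,j,k) && \forall p,j \\
\sum_{j} \text{ProdToPool}(p,j,k) &= \text{StoreDemand}(p) \sum_{s} \text{Assign}(s,k) && \forall p,k \\
\sum_{k} \text{Assign}(s,k) &= 1 && \forall s \\
\sum_{j} \text{Source}(p,j,k) &= 1 && \forall p,k \\
\text{ProdToPool}(p,j,k) &\le \text{TotalDemand}(p)\,\text{Source}(p,j,k) && \forall p,j,k \\
30{,}000 \cdot \text{Trucks}(j,k) &\ge \sum_{p} \text{Weight}(p)\,\text{ProdToPool}(p,j,k) && \forall j,k \\
\text{Trucks}(j,k) &\ge \text{Source}(p,j,k) && \forall p,j,k
\end{align*}
```

Assign and Source are binary, Trucks is a non-negative integer, and the flow variables are non-negative. The sixth constraint is the "shut down" bound from slide 34, and the last one enforces at least one truck per week on every lane that carries some product.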
37
37 If we can shut down cross docks? Let OpenCD(j) indicate whether or not we will open cross dock j Clearly we want for each product p, pool k and cross dock j
38
38 Alternate & Better model Integrate the sourcing decisions into a binary decision variable Path(prod, cd, pool, store) = 1 if store receives prod from pool and pool receives prod from cd Keep Assign(store, pool) = 1 if store receives supplies from pool
39
39 Alternate & Better model Single sourcing at Store s Keep Assign(store, pool) = 1 if store receives supplies from pool Ensure Path choices are consistent with assignments, i.e., we get to store via assigned pools for all products
40
40 Alternate & Better Model Ensure single sourcing at pool Introduce Source(prod, cd, pool) = 1 if pool receives prod from cd Ensure consistency with Path decisions
41
41 Alternate & Better Model Volumes at the pool as before – based on Assign Volumes at the Cross Dock based on Path Is the requirement for comp at cd …
42
42 Examples Adding across customers or geographies Demand for a single sku at a single DC
43
43 Daily Sales: Single SKU Sum of Stdev’s of all DCs: 1,957 Stdev of Demand over all DCs: 1,553 (21% less)
44
44 Correlated? If daily sales were uncorrelated across all DCs then variances would add Stdev across all DCs would equal Sqrt(Var[DC-1]+Var[DC-2]+Var[…]) = 1,150 Actual Stdev is 1,553 Conclusion?
45
45 Adding across Products Sum of Stdev in Weekly Sales across all SKUs for vendor at a DC: 1,043 By far the largest SKU has Stdev 999. Stdev in Total Weekly Sales for DC of all SKUs from vendor: 996 Explain
46
46 Adding across time Stdev in daily sales of SKU: 999.49 Stdev in weekly sales of SKU: 3882.63 Much lower than 5*Daily Stdev Higher than ?*Daily Stdev Conclusion?
47
47 How can we: Add across customers? Add across products? Add across time? When do these conflict?
48
48 Questions?
49
49 Agenda Assignable-Cause vs Common variation Review of Probability Review of Regression: A tool for capturing predictable variation Forecasting
50
50 Correlation Example Truck load shipments from Green Bay and Denver to Indianapolis Assemble Products in Indianapolis and distribute by full truckload from there to stores What will happen to costs compared to direct full truck load shipments? –Transportation –Pipeline –At plants –At Indianapolis Warehouse/Cross Dock –At Stores
51
51 Regression Explain or model the relationship between the dependent variable (e.g., tree height) and the independent variables (e.g., trunk diameter) Linear Regression Model y = β₀ + β₁x + ε, where β₀ + β₁x captures the assignable cause variation and ε the common cause variation
52
52 Regression
53
53 Regression The model y = 4.5413 x - 1.3147 is the “best fit” linear model for the relationship It is not based on physical laws or causality (e.g., thin trees don’t have negative height) It does “explain” about 78% of the variability in the data R² = 0.7854
54
54 Explained vs Unexplained Variation For Linear Regression Coefficient of Determination
55
55 Coefficient of Determination 0 ≤ R² ≤ 1 Note in our example the Coefficient of Determination R² is equal to the square of the Correlation Coefficient r: R² = 0.7854 = r² = 0.886² This is generally the case for simple linear regression (1 independent variable)
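The relationship between explained variation, R² and r can be checked numerically. This is a minimal sketch of ordinary least squares for one independent variable; the data are made up for illustration and are not the tree data behind the slides.

```python
import math


def simple_linear_regression(xs, ys):
    """Least-squares fit of y = beta0 + beta1 * x.

    Returns (beta0, beta1, r_squared, r), where r_squared is
    1 - (residual sum of squares / total sum of squares) and r is the
    sample correlation coefficient between xs and ys.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)           # total variation
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    ss_res = sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / syy                    # explained share
    r = sxy / math.sqrt(sxx * syy)
    return beta0, beta1, r_squared, r


# Hypothetical data: trunk diameter vs. tree height.
diameters = [1.0, 2.0, 3.0, 4.0, 5.0]
heights = [3.1, 8.2, 11.9, 17.3, 21.0]
beta0, beta1, r_squared, r = simple_linear_regression(diameters, heights)
print(f"y = {beta1:.3f} x + {beta0:.3f}, R^2 = {r_squared:.4f}, r^2 = {r * r:.4f}")
```

With a single independent variable the two printed values agree up to rounding, which is the point the slide makes with 0.7854 = 0.886².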
56
56 Excel Output (annotated regression output; labels: R², β₀, β₁, Explained Sum of Squares, Residual Sum of Squares, Normalized (by the degrees of freedom) Residual Sum of Squares, Sqrt of normalized Residual Sum of Squares)
57
57 Standard Error A sort of standard deviation about the regression line. How widely dispersed are observations about the regression line
58
58 P-values Indicates how likely it is to see this value if the true value of the coefficient is 0 and there’s as much noise as we see in the data. Strong evidence there’s a relationship Weak evidence the intercept isn’t 0
59
59 Multiple Linear Regression With more than one independent variable, e.g., Sales = β₀ + β₁ GDP + β₂ Unemployment Rate + … Need to watch out for –Non-linearity: The relationship might not be linear, e.g., weight of the tree vs trunk diameter. –Multicollinearity: one independent variable is a linear function of another (eliminate one) –Over-specified model: Adding more independent variables increases R², but reduces the degrees of freedom in the fit. Adjusted R² attempts to account for this.
60
60 Static Regression Salaries Independent Data
61
61 Excel’s Linest Array Function Linest(Y-array, Array of X’s, [const], [stats]) outputs the β’s One column for each β Remember Array Functions are entered with Ctrl-Shift-Enter Allows you to perform running regressions Coefficients come out in reverse order (Go figure)
62
62 LinEst Regression.xls
63
63 Questions?
64
64 LTL Rates Complicated to work with rate engines Alternative: model the rates Challenge: –Model for rates = Max{Min Charge, Min{Intercept1 + Rate1*Weight*Distance, Intercept2 + Rate2*Weight*Distance}} Come up with estimates for –Min Charge, Intercepts, Rates Winner: Minimum sum of square errors
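The rate model and the scoring rule above translate directly into code. This is a minimal sketch: the function names, the parameter values and the two sample shipments are hypothetical, chosen only to show the three regimes (minimum charge, first tier, second tier).

```python
def ltl_rate(weight, distance, min_charge, intercept1, rate1, intercept2, rate2):
    """Max{Min Charge, Min{Intercept1 + Rate1*W*D, Intercept2 + Rate2*W*D}}."""
    tier1 = intercept1 + rate1 * weight * distance
    tier2 = intercept2 + rate2 * weight * distance
    return max(min_charge, min(tier1, tier2))


def sum_squared_errors(shipments, params):
    """Score a parameter tuple against (weight, distance, actual_charge) rows."""
    return sum(
        (ltl_rate(weight, distance, *params) - charge) ** 2
        for weight, distance, charge in shipments
    )


# Hypothetical parameters: (min_charge, intercept1, rate1, intercept2, rate2)
params = (50.0, 0.0, 0.001, 100.0, 0.0005)

small = ltl_rate(100, 100, *params)    # tiers give 10 and 105, so the minimum charge applies
large = ltl_rate(1000, 500, *params)   # tiers give 500 and 350, so the second tier applies

# Hypothetical observed charges for the same two shipments.
shipments = [(100, 100, 55.0), (1000, 500, 340.0)]
print(small, large, sum_squared_errors(shipments, params))
```

Estimating the five parameters then amounts to searching for the tuple that minimizes `sum_squared_errors`; because of the max and min the objective is not smooth, so a derivative-free search is a reasonable first attempt.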
65
65 LTL Rates
66
66 Agenda Assignable-Cause vs Common variation Review of Probability Review of Regression Forecasting
67
67 Forecasting Forecasting is the effort to determine what we can about the future from the past. We will focus on Quantitative Methods –i.e., not opinions, judgments, “markets”, etc.
68
68 General Framework Future = f(Past) + Residual Error The Specifics –What aspects of the past are relevant –What form of f to use The Issues –Accuracy: Is E[Residual Error] ~ 0? –Precision: Is Residual Error small? –Complexity and Cost!
69
69 Examples Autoregressive or time-series –Past = Historical values of the process we are forecasting, e.g., past demand forecasts future demand Causal –Past = Historical values of “leading indicators” like GDP, employment, housing starts, etc. Regression, Maximum Likelihood –Past may include both historical values and leading indicators
70
70 Past & Future We lump data into time periods –Average, total or sample in some period –Reduces data requirements –Averaging and totaling smooth the data (remember the Big Idea) –Actionable What we’re forecasting –y_t = value at time t Past –Autoregressive: y_{t-1}, y_{t-2}, y_{t-3}, … –Leading indicators: x_{i,t-1}, x_{i,t-2}, x_{i,t-3}, …
71
71 Specifics Autoregressive or Time-Series –Moving average Past = past n observations y_{t-1}, y_{t-2}, y_{t-3}, …, y_{t-n}; f(y_{t-1}, y_{t-2}, y_{t-3}, …, y_{t-n}) = (y_{t-1} + y_{t-2} + … + y_{t-n})/n More a tool for understanding the past than for forecasting the future –Exponential Smoothing Past = “all” past observations y_{t-1}, y_{t-2}, y_{t-3}, …; f(y_{t-1}, y_{t-2}, y_{t-3}, …) = αy_{t-1} + α(1−α)y_{t-2} + α(1−α)²y_{t-3} + …, with smoothing weight 0 < α < 1
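Both techniques fit in a few lines. This is a minimal sketch: the function names are illustrative, and starting the exponential smoother at the first observation is one common initialization choice, not necessarily the one used in the course.

```python
def moving_average_forecast(history, n):
    """Forecast the next value as the mean of the last n observations."""
    if n <= 0 or len(history) < n:
        raise ValueError("need n > 0 and at least n observations")
    return sum(history[-n:]) / n


def exponential_smoothing_forecast(history, alpha):
    """Forecast the next value by simple exponential smoothing.

    Assumption: the smoothed value is initialized to the first observation.
    Each later observation y updates it to alpha*y + (1 - alpha)*previous,
    which gives geometrically declining weights on older observations.
    """
    if not history:
        raise ValueError("history must not be empty")
    smoothed = history[0]
    for y in history[1:]:
        smoothed = alpha * y + (1.0 - alpha) * smoothed
    return smoothed


history = [10, 12, 14, 16]
ma = moving_average_forecast(history, n=2)              # (14 + 16) / 2
es = exponential_smoothing_forecast(history, alpha=0.5)
print(ma, es)
```

On this steadily rising series both forecasts (15.0 and 14.25) fall short of the latest observation, which illustrates why a trend term is added in the next refinement.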
72
72 Specifics Exponential Smoothing with Trend –Past = “all” past observations y_{t-1}, y_{t-2}, y_{t-3}, … –Forecast uses exponential smoothing to estimate The “Level”: weighted average of observation and past estimate The “Trend”: weighted average of observation and past estimate Forecast m periods in the future = Level + m*Trend –Details at Engineering Statistics Handbook
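The level and trend updates can be sketched as follows. This is the standard double exponential smoothing (Holt) recursion, shown under stated assumptions: the initial level is the first observation and the initial trend is the first difference, which is only one of several reasonable initializations, and the names `alpha`, `beta` and `holt_forecast` are illustrative.

```python
def holt_forecast(history, alpha, beta, m=1):
    """Exponential smoothing with trend; forecast m periods ahead.

    Assumptions: level starts at history[0] and trend at
    history[1] - history[0]. alpha weights the level update and beta
    weights the trend update, both between 0 and 1.
    """
    if len(history) < 2:
        raise ValueError("need at least two observations")
    level = history[0]
    trend = history[1] - history[0]
    for y in history[1:]:
        previous_level = level
        # Level: weighted average of the observation and the prior projection.
        level = alpha * y + (1.0 - alpha) * (previous_level + trend)
        # Trend: weighted average of the latest level change and the prior trend.
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return level + m * trend


history = [10, 12, 14, 16]
one_ahead = holt_forecast(history, alpha=0.5, beta=0.5, m=1)
three_ahead = holt_forecast(history, alpha=0.5, beta=0.5, m=3)
print(one_ahead, three_ahead)
```

On a perfectly linear series the method tracks the line exactly, so the forecasts continue it (18.0 and 22.0 here).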
73
73 Specifics Exponential Smoothing with Trend & Seasonality –Past = “all” past observations y_{t-1}, y_{t-2}, y_{t-3}, … –Forecast uses exponential smoothing to estimate The “(De-seasonalized) Level”: weighted average of the de-seasonalized observation and past de-seasonalized estimate The “Trend”: weighted average of observation and past estimate The “Seasonal factors”: weighted average of observation and past estimate Forecast m periods in the future = (Level+m*Trend)*Seasonal Factor –Details at Engineering Statistics Handbook
74
74 Specifics Exponential Smoothing with … –You get the idea. –Issues Initialization data Choosing the weights Growing complexity
75
75 Specifics Regression, Maximum Likelihood –Past = past observations y_{t-1}, y_{t-2}, y_{t-3}, … and leading indicators x_{i,t-1}, x_{i,t-2}, … –f(y_{t-1}, y_{t-2}, y_{t-3}, …, x_{i,t-1}, x_{i,t-2}, …) is some function of these past values Examples: –Linear –Non-linear models »Diffusion »Logit »Probit
76
76 Bass Diffusion Postulates a form for sales over the life of a product Three parameters –m: The total potential market –p, q: Shape parameters
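The slide does not reproduce the functional form, so the sketch below uses the standard closed-form Bass model, in which p is usually read as the coefficient of innovation and q as the coefficient of imitation. The function names and the parameter values are illustrative assumptions, not figures from the course.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Fraction of the potential market that has adopted by time t."""
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_period_sales(t, m, p, q):
    """Sales during period (t - 1, t] for total potential market m."""
    return m * (bass_cumulative_fraction(t, p, q) - bass_cumulative_fraction(t - 1, p, q))


# Hypothetical parameters for illustration only.
m, p, q = 1000.0, 0.03, 0.38
sales = [bass_period_sales(t, m, p, q) for t in range(1, 51)]
peak_period = 1 + sales.index(max(sales))
print(f"peak period: {peak_period}, cumulative sales: {sum(sales):.1f}")
```

Period sales rise to a peak and then decline, and cumulative sales approach m, which is the life-cycle shape the model postulates.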
77
77 Questions?
78
78 Top Down vs Bottom Up Often faced with forecasting 100’s of families and 1,000’s of SKUs Different Options: –Top Down: Develop an aggregate forecast and allocate it to more detailed level –Bottom Up: Develop individual detailed forecasts and aggregate them up
79
79 Laws of Forecasting Law 1: Forecasts are wrong Law 2: Forecast Demand not Sales Law 3: It is generally easier to forecast aggregate data than it is to forecast the details. (Big Idea) Law 4: It is generally easier to forecast a short time into the future than to forecast far into the future Law 5: Simpler forecasts are generally better forecasts
80
80 Examples from ABL Sales for SKU 8295 Raw data
81
81 Variability? How do we forecast this? How do we assign a variability to this? Not Actionable!
82
82 Sales by Day We can make ordering decisions on a daily basis:
83
83 Weekly Sales Or on a weekly basis
84
84 Or a Monthly Basis Or on a monthly basis Which is appropriate?
85
85 Compare the Variability.
86
86 The Big Idea Average Daily Sales: 1280.196 Std Dev. In Daily Sales: 1546.472 Average Weekly Sales: 6400.981 Std Dev. In Weekly Sales: 5971.578 Avg Weekly Sales = 5*Average Daily Sales What about the relationship between the variabilities? 5*Std Dev. In Daily Sales = 7732.361 What does the Big Idea say we should expect?
87
87 The Big Idea If sales from day to day are independent, we should see 5*Variance in Daily Sales = Variance in Weekly Sales Sqrt(5)*Std Dev in Daily Sales = Std Dev in Weekly Sales Sqrt(5)*Std Dev in Daily Sales = 3458.017 < 5971.578 So, Sales from Day to Day are (auto) correlated
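The comparison can be reproduced from the reported figures, and the gap can be turned into a rough measure of correlation. This is a minimal sketch, assuming every day has the same variance and every pair of days in a week has the same correlation rho; that simplification is introduced here and is not part of the slides.

```python
import math

daily_sd = 1546.472    # Std Dev in Daily Sales (from the slides)
weekly_sd = 5971.578   # Std Dev in Weekly Sales (from the slides)
days = 5               # selling days per week

# If days were independent, variances would add.
independent_sd = math.sqrt(days) * daily_sd

# If days were perfectly correlated, standard deviations would add.
perfectly_correlated_sd = days * daily_sd

# Assumption: equal pairwise correlation rho between days, so that
# Var[week] = n*s^2 + n*(n - 1)*rho*s^2. Solve for rho.
implied_rho = ((weekly_sd / daily_sd) ** 2 - days) / (days * (days - 1))

print(f"independent: {independent_sd:.3f}")
print(f"perfectly correlated: {perfectly_correlated_sd:.3f}")
print(f"implied average correlation: {implied_rho:.2f}")
```

The observed weekly standard deviation sits between the two bounds, and under the equal-correlation assumption the implied day-to-day correlation is roughly one half.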
88
88 Forecasting Try Simple Techniques: –Moving Average –Exponential Smoothing –Exponential Smoothing with Trend –Exponential Smoothing with Trend & Seasonality –(Auto) Regression –Bass Diffusion
89
89 Moving Average Average Error: 666.29 (Over Estimates) Std Dev of Error: 4,094 (< Std Dev. In Sales)
90
90 Exponential Smoothing Average Error: 786.13 (less accurate) Std.Dev in Errors: 4,308 (less precise)
91
91 Exponential Smoothing w/ Trend Average Error: -167.58 (more accurate) Std.Dev in Errors: 5,065 (less precise)
92
92 Exp. Smoothing w/ Trend & Seasonality Seasonality by: Day of Week Week of Month Week of Year
93
93 Regression Using Previous 2 weeks Previous 3 weeks … Might Use Week of Month too
94
94 Previous 2 Weeks Average Error: 2,096 Std Dev in Error: 4,929
95
95 Previous 3 Weeks Average Error: 1,837 Std Dev in Error: 5,010
96
96 Previous 4 Weeks Average Error: 1664 Std Dev in Error: 5367.29
97
97 Bass Diffusion Average Error: -1536.17 Std Dev in Error: 5324.07
98
98 Conclusion Simple moving average provides best forecast on a weekly basis Exponential smoothing better on a monthly basis Now let’s build a demand distribution given a forecast Idea: Compare Actual Sales to Forecasted Sales through the ratio Accuracy means Average should be 1 Precision means Std Dev should be small
99
99 Demand Distribution We know the forecast is WRONG But it does give us some information What Actual Sales will be is uncertain, but we can develop a distribution for it What are the chances Actual Sales are larger than X? Smaller than Y? …
100
100 Actual to Forecast Ratios μ, the Avg, is 1.1 (What does that mean?) σ, the Std Dev, is 0.87 Ratio < 1: Over forecast Ratio > 1: Under forecast
101
101 Translate Forecast into Demand Distribution Assuming things continue as they have… If the forecast is 100, what do we expect actual sales to be? 110 So, if μ is the Average Actual/Forecast ratio & F is the forecast, Expected demand is μF What is the spread of actual demand about this mean value?
102
102 Translate Forecast into Demand Distribution If things continue as they have, it is natural to assume we will draw from the historical distribution of Actual/Forecast ratios. So, if μ = 1 and the forecast is 100, what should the distribution of Actual Sales be? –The mean should be? 100 –The std dev of actual demand about this mean value should be? 100σ Why?
103
103 Demand Distribution The Actual/Forecast ratio has a distribution with mean μ and std dev σ If μ = 1, the distribution for Actual Sales is just F, the forecast, times the Actual/Forecast ratio so, it has mean F and std dev σF If μ ≠ 1, the distribution for Actual Sales is just F, the forecast, times the Actual/Forecast ratio so, it has mean μF and std dev σF
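Scaling the ratio distribution by the forecast is a one-line calculation, and the historical ratios can also answer the "what are the chances" questions directly. This is a minimal sketch: the function names and the list of historical ratios are hypothetical; only the values 1.1 and 0.87 come from the slides.

```python
def demand_distribution(forecast, ratio_mean, ratio_std):
    """Mean and std dev of Actual Sales = forecast * (Actual/Forecast ratio)."""
    return ratio_mean * forecast, ratio_std * forecast


def prob_actual_exceeds(x, forecast, historical_ratios):
    """Empirical chance that actual sales exceed x.

    Assumption: future Actual/Forecast ratios are drawn from the same
    distribution as the historical ones.
    """
    hits = sum(1 for ratio in historical_ratios if ratio * forecast > x)
    return hits / len(historical_ratios)


mean_demand, std_demand = demand_distribution(100, ratio_mean=1.1, ratio_std=0.87)

# Hypothetical history of Actual/Forecast ratios.
ratios = [0.4, 0.7, 0.9, 1.0, 1.1, 1.3, 1.6, 2.5]
chance_above_120 = prob_actual_exceeds(120, 100, ratios)
print(mean_demand, std_demand, chance_above_120)
```

With a forecast of 100 this gives a mean of about 110 and a standard deviation of about 87, matching the slides; the empirical probability uses the raw ratios instead of assuming a particular distribution shape.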
104
104 Common Cause Variability So, the common cause variability in demand for SKU 8295 that we will need to protect against with safety stock is about 87% of the Forecasted demand! Just working with Raw Sales –Std Dev/Average = 93%
105
105 Common Cause Variability On the other hand, with monthly sales Working with Raw Sales –Std Dev/Average = 74% Working with Exponential Smoothing forecast –Std Dev/Average = 36%