
Slide 1: ISyE 6203 Variability Basics. John H. Vande Vate, Fall 2011

Slide 2: Agenda
– Forecasting
– Assignable-Cause vs Common Variation
– Review of Probability
– Review of Regression
– Forecasting

Slide 3: Forecasting is the effort to determine what we can about the future from the past. We will focus on quantitative methods, i.e., not opinions, judgments, "markets", etc.

Slide 4: Laws of Forecasting
– Law 1: Forecasts are wrong
– Law 2: Forecast demand, not sales
– Law 3: It is generally easier to forecast aggregate data than it is to forecast the details. (Big Idea)
– Law 4: It is generally easier to forecast a short time into the future than to forecast far into the future
– Law 5: Simpler forecasts are generally better forecasts

Slide 5: General Framework
Future = f(Past) + Residual Error
The Specifics
– What aspects of the past are relevant
– What form of f to use
The Issues
– Accuracy: Is E[Residual Error] ≈ 0?
– Precision: Is σ(Residual Error) small?
– Complexity and cost!

Slide 6: Variability
Residual Error is the "noise" or unpredictable variation. Our focus is on managing this. What tools are available for managing unpredictable variation?

Slide 7: Types of Variability
Predictable or Assignable-Cause Variations
– Weekends and holidays
– Major scheduled events, promotions
– Seasonality & growth
– Other causal relationships
Unpredictable or Common-Cause Variations
– Inherent randomness

Slide 8: Types of Variability
Predictable or Assignable-Cause Variations. Unpredictable or Common-Cause Variations. Avoid unnecessary variability. Trade off capturing predictable variation against managing complexity.

Slide 9: Review of Probability
Random Variable: a function that associates a value with each point in a sample space.
Two random variables X and Y are independent if and only if, for any two subsets A and B of their sample spaces, P(X ∈ A and Y ∈ B) = P(X ∈ A)P(Y ∈ B)
Mean: E[X]
Variance: Var(X) = E[(X − E[X])²]

Slide 10: Review of Probability
Covariance: Cov(X,Y) = E[(X − E[X])(Y − E[Y])]
If X and Y are independent, Cov(X,Y) = 0.
The correlation coefficient ρ(X,Y) ranges between −1 and 1. If ρ(X,Y) = 0, X and Y are uncorrelated.
If X and Y are independent, they are uncorrelated. Uncorrelated random variables need not be independent.
More details at http://en.wikipedia.org/wiki/Correlation_and_dependence

Slide 11: Uncorrelated vs Independent
If X and Y are independent, they are uncorrelated. Uncorrelated random variables need not be independent.
Example:
– X is uniformly distributed on [−1, 1]
– Y = X²
E[X] = 0
E[Y] = E[X²] = 1/3

Slide 12: Uncorrelated vs Independent
If X and Y are independent, they are uncorrelated. Uncorrelated random variables need not be independent.
Continuing the example:
Cov(X,Y) = E[XY] − E[X]E[Y] = E[X³] − 0 = 0
So X and Y are uncorrelated even though Y is completely determined by X.
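A quick simulation makes the point concrete (a sketch, not from the slides; the sample size and seed are arbitrary):

```python
# Simulate the slide's example: X uniform on [-1, 1], Y = X^2.
# Y is completely determined by X, yet their covariance is (nearly) 0.
import random

random.seed(0)
n = 100_000
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [x * x for x in xs]

mean_x = sum(xs) / n
mean_y = sum(ys) / n   # should be close to E[X^2] = 1/3
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

print(mean_x, mean_y, cov)   # ~0, ~1/3, ~0
```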

Slide 13: Covariance and Variance
For a constant α:
E[αX] = αE[X]
Var[αX] = σ²(αX) = α²σ²(X)
Stdev[αX] = σ(αX) = |α|σ(X)
E[X + Y] = E[X] + E[Y]
Var[X + Y] = σ²(X+Y) = σ²(X) + σ²(Y) + 2·Cov(X,Y)
Stdev[X + Y] = σ(X+Y) = √(σ²(X) + σ²(Y) + 2·Cov(X,Y))
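These identities can be checked numerically on a small, equally likely joint distribution (the outcome values below are made up for illustration):

```python
# Check the slide's rules: Var[aX] = a^2 Var[X] and
# Var[X+Y] = Var[X] + Var[Y] + 2 Cov(X, Y), on a tiny discrete distribution.
outcomes = [(1, 2), (2, 2), (3, 5), (4, 7), (5, 7), (6, 10)]
n = len(outcomes)

def mean(vals):
    return sum(vals) / n

xs = [x for x, _ in outcomes]
ys = [y for _, y in outcomes]
mx, my = mean(xs), mean(ys)

var_x = mean([(x - mx) ** 2 for x in xs])
var_y = mean([(y - my) ** 2 for y in ys])
cov_xy = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
var_sum = mean([((x + y) - (mx + my)) ** 2 for x, y in zip(xs, ys)])

a = 3.0
var_ax = mean([(a * x - a * mx) ** 2 for x in xs])

print(var_sum, var_x + var_y + 2 * cov_xy)   # these two agree
print(var_ax, a * a * var_x)                 # and so do these
```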

Slide 14: The Big Idea
When combining random variables, some of the "noise" cancels. How much cancels depends on the correlation.

Slide 15: Stdev[X + Y]
Var[X+Y] = σ²(X) + σ²(Y) + 2ρσ(X)σ(Y)
If ρ = 1, then Stdev[X+Y] = σ(X) + σ(Y): no reduction in variation.
If ρ < 1, then Stdev[X+Y] < σ(X) + σ(Y): reductions in variation.

Slide 16: The Big Idea
When combining random variables, some of the "noise" cancels. How much cancels depends on the correlation.

Slide 17: Noise Canceling
X1, X2 independent, identically distributed rvs, each with standard deviation σ(X).
Var[2X1] = 4σ²(X), Stdev[2X1] = 2σ(X)
Var[X1 + X2] = σ²(X) + σ²(X) + 2·Cov(X1,X2) = 2σ²(X)
Stdev[X1 + X2] = √2·σ(X) ≈ 1.41σ(X)
About 30% of the variability canceled.
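The "about 30%" figure generalizes: summing n iid random variables gives stdev √n·σ(X) instead of n·σ(X). A tiny sketch:

```python
# Fraction of variability canceled when pooling n iid random variables:
# Stdev[X1 + ... + Xn] = sqrt(n) * sigma, versus n * sigma for n copies
# of the same variable (perfect correlation).
import math

def fraction_canceled(n):
    """1 - sqrt(n)/n: the relative reduction in stdev from pooling."""
    return 1 - math.sqrt(n) / n

print(round(fraction_canceled(2), 3))    # ~0.293, the slide's "about 30%"
print(round(fraction_canceled(25), 3))   # pooling 25 sources cancels 80%
```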

Slide 18: The Big Idea?
That's a "Big Idea"? So what?! Where's the beef?

Slide 19: Exam Review
Rules of the game:
– I probably made mistakes grading.
– If you think I made mistakes grading your exam, write a brief note indicating what you would like me to review and turn it in. I will review it.

Slide 20: Results
Average ~71, Std Dev ~18

Slide 21: Question 1

Slide 22: Question 1

Slide 23: Question 2
Where? Indianapolis.
Why?
– Co-locating the cross dock and the plant eliminates $1.2 million in cycle inventory of monitors, saving us $222 thousand annually
– Other reasons
What I wanted you to see:
– We have a different model of cost if the cross dock is at a plant
– So check those locations separately

Slide 24: Question 3
Admittedly challenging. To simplify the discussion, assume the context in which the number and locations of cross docks and pools are fixed. We'll address the case where we can close cross docks and pools subsequently.

Slide 25: What we have to do
– Manage flows of components to the cross docks: that's no longer the same in every solution
– Balance the flows of components into and products out of the cross docks
– Manage the flows of finished goods from the cross docks to the pools
– Balance the flows of each product at each pool
– Manage the assignments of stores to pools
– Manage the single sourcing of products to the pools
– Account for trucks between the cross docks and the pools
– Manage the frequency requirements (for each product) to the pools

Slide 26: Flows of Components
We can identify the components and the plants. Let CompFlow(i,j) represent the units of the component made at plant i shipped to cross dock j each week. Non-negative. Multiply by $1 × the distance between plant i and cross dock j × the weight of the product made at plant i, and divide by 30,000 lbs to get the trucking cost to the cross dock.

Slide 27: Balance at Cross Docks
We have to balance the flow of each component i to each cross dock j with the volumes of finished products assembled there. Let Assemble(p,j) be the units of finished product p assembled at cross dock j each week. Non-negative. Let Recipe(p,i) be the number of components from plant i in a unit of finished product p. How do we express the balance?

Slide 28: Balance
Assemble(p,j) × Recipe(p,i) = units of the component made at plant i needed to produce finished product p at cross dock j. Sum over the finished products to get the total units of the component made at plant i needed at cross dock j.
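The balance calculation on this slide can be sketched in a few lines of Python (the names Assemble, Recipe and CompFlow follow the slides; the plants, cross docks and data values are made up):

```python
# Component balance at cross docks: for each plant i and cross dock j, the
# inbound component flow CompFlow(i, j) must equal the components consumed
# by assembly at j, i.e., sum over products p of Assemble(p, j) * Recipe(p, i).

plants = ["GreenBay", "Denver"]       # component plants (illustrative)
cross_docks = ["Indy", "Columbus"]    # cross docks (illustrative)
products = ["P1", "P2"]

# Recipe[p][i]: units of plant i's component per unit of product p
Recipe = {"P1": {"GreenBay": 1, "Denver": 2},
          "P2": {"GreenBay": 3, "Denver": 1}}

# Assemble[p][j]: weekly units of product p assembled at cross dock j
Assemble = {"P1": {"Indy": 100, "Columbus": 50},
            "P2": {"Indy": 40, "Columbus": 0}}

def required_components(i, j):
    """Units of plant i's component needed at cross dock j each week."""
    return sum(Assemble[p][j] * Recipe[p][i] for p in products)

# The balance constraint says CompFlow(i, j) must equal this quantity.
for i in plants:
    for j in cross_docks:
        print(i, j, required_components(i, j))
```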

Slide 29: Flow of Finished Goods
Let's call the units of finished product p shipped from cross dock j to pool k each week ProdToPool(p,j,k). We want all the assembled product p at cross dock j to flow out to pools:
Σk ProdToPool(p,j,k) = Assemble(p,j)

Slide 30: Balance Flow at Pools
Let's use Assign(s,k) to indicate whether or not store s is assigned to pool k. The demand for product p at each store each week is StoreDemand(p). The demand for product p at pool k implied by the assignments is Σs Assign(s,k) × StoreDemand(p). The total units of product p available to meet this demand at pool k is Σj ProdToPool(p,j,k).

Slide 31: Balance Flow at Pools
These should balance for each product at each pool:
Σj ProdToPool(p,j,k) = Σs Assign(s,k) × StoreDemand(p)

Slide 32: Assignments of Stores to Pools
Assign each store s to one and only one pool:
Σk Assign(s,k) = 1 for each store s
So far everything is pretty standard.

Slide 33: Manage Single Sourcing at Pools
Each pool k should get all of each product p from one and only one cross dock. Let Source(p,j,k) indicate whether or not pool k sources product p from cross dock j. Single sourcing:
Σj Source(p,j,k) = 1 for each product p and pool k

Slide 34: But it's a bit more complicated
We don't know the demand at the pool except as a function of the Assign variables, so we need to shut down ProdToPool variables based on the Source decisions. To do this we need an upper bound on ProdToPool. Let TotalDemand(p) be the total demand each week for product p across all stores. For each product p, cross dock j and pool k:
ProdToPool(p,j,k) ≤ TotalDemand(p) × Source(p,j,k)
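The upper-bound ("big-M") constraint on this slide links a continuous flow to a binary sourcing decision. A minimal sketch of how it behaves (slide notation; the numbers are made up):

```python
# Big-M linking of flow to sourcing: if Source(p, j, k) = 0 the constraint
# forces ProdToPool(p, j, k) = 0; if Source(p, j, k) = 1 the flow may go
# up to TotalDemand(p), which no feasible flow can exceed anyway.

def linking_ok(prod_to_pool, source, total_demand):
    """Check ProdToPool <= TotalDemand * Source for every (p, j, k)."""
    return all(prod_to_pool[p, j, k] <= total_demand[p] * source[p, j, k]
               for (p, j, k) in prod_to_pool)

total_demand = {"P1": 500}
source = {("P1", "Indy", "Pool-A"): 1, ("P1", "Columbus", "Pool-A"): 0}
flows_good = {("P1", "Indy", "Pool-A"): 300, ("P1", "Columbus", "Pool-A"): 0}
flows_bad = {("P1", "Indy", "Pool-A"): 300, ("P1", "Columbus", "Pool-A"): 10}

print(linking_ok(flows_good, source, total_demand))  # True
print(linking_ok(flows_bad, source, total_demand))   # False: flow from a non-source
```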

Slide 35: Trucks from Cross Dock to Pool
Since there is a frequency requirement for serving the pools, we must keep track of the trucks we send each week. We know the weight moving from each cross dock j to each pool k each week: Σp ProdToPool(p,j,k) × weight(p). So the number of trucks Trucks(j,k) must be at least this weekly weight divided by the truck capacity.

Slide 36: Frequency Requirement
At least one truck to pool k from each cross dock j that pool k sources SOME product from. For each pool k, product p and cross dock j:
Trucks(j,k) ≥ Source(p,j,k)

Slide 37: If we can shut down cross docks?
Let OpenCD(j) indicate whether or not we will open cross dock j. Clearly we want, for each product p, pool k and cross dock j:
Source(p,j,k) ≤ OpenCD(j)

Slide 38: Alternate & Better Model
Integrate the sourcing decisions into a binary decision variable Path(prod, cd, pool, store) = 1 if store receives prod from pool and pool receives prod from cd. Keep Assign(store, pool) = 1 if store receives supplies from pool.

Slide 39: Alternate & Better Model
Single sourcing at store s. Keep Assign(store, pool) = 1 if store receives supplies from pool. Ensure the Path choices are consistent with the assignments, i.e., we get to the store via its assigned pool for all products.

Slide 40: Alternate & Better Model
Ensure single sourcing at the pool. Introduce Source(prod, cd, pool) = 1 if pool receives prod from cd. Ensure consistency with the Path decisions.

Slide 41: Alternate & Better Model
Volumes at the pool as before, based on Assign. Volumes at the cross dock, based on Path, give the requirement for the component at the cross dock …

Slide 42: Examples
Adding across customers or geographies: demand for a single SKU at a single DC.

Slide 43: Daily Sales: Single SKU
Sum of stdevs of all DCs: 1,957. Stdev of demand over all DCs: 1,553 (21% less).

Slide 44: Correlated?
If daily sales were uncorrelated across all DCs, then variances would add: the stdev across all DCs would equal Sqrt(Var[DC-1] + Var[DC-2] + …) = 1,150. Actual stdev is 1,553. Conclusion?
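The effect on these slides can be reproduced with a quick simulation (the DC sales series below are synthetic; only the qualitative pattern matches the slides):

```python
# Risk pooling across DCs: pooled stdev sits below the sum of the stdevs,
# but positive correlation keeps it above the independence prediction
# sqrt(Var[DC-1] + Var[DC-2] + ...), as on the slide.
import math
import random
import statistics

random.seed(42)
n_days = 20_000

# A shared market factor makes the three synthetic DC series positively
# correlated; each DC also has its own independent noise.
market = [random.gauss(0, 30) for _ in range(n_days)]
dcs = [[500 + m + random.gauss(0, 40) for m in market] for _ in range(3)]

total = [sum(day) for day in zip(*dcs)]
sum_of_stdevs = sum(statistics.pstdev(dc) for dc in dcs)
indep_stdev = math.sqrt(sum(statistics.pvariance(dc) for dc in dcs))
pooled_stdev = statistics.pstdev(total)

print(indep_stdev, pooled_stdev, sum_of_stdevs)
```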

Slide 45: Adding across Products
Sum of stdevs in weekly sales across all SKUs for the vendor at a DC: 1,043. By far the largest SKU has stdev 999. Stdev in total weekly sales for the DC of all SKUs from the vendor: 996. Explain.

Slide 46: Adding across Time
Stdev in daily sales of the SKU: 999.49. Stdev in weekly sales of the SKU: 3,882.63. Much lower than 5 × the daily stdev, but higher than √5 × the daily stdev. Conclusion?

Slide 47: How can we:
– Add across customers?
– Add across products?
– Add across time?
When do these conflict?

Slide 48: Questions?

Slide 49: Agenda
– Assignable-Cause vs Common Variation
– Review of Probability
– Review of Regression: a tool for capturing predictable variation
– Forecasting

Slide 50: Correlation Example
Truckload shipments from Green Bay and Denver to Indianapolis. Assemble products in Indianapolis and distribute by full truckload from there to stores. What will happen to costs compared to direct full-truckload shipments?
– Transportation
– Pipeline
– At plants
– At the Indianapolis warehouse/cross dock
– At stores

Slide 51: Regression
Explain or model the relationship between the dependent variable (e.g., tree height) and the independent variables (e.g., trunk diameter).
Linear regression model: y = β0 + β1x + ε
The term β0 + β1x captures assignable-cause variation; the error ε is common-cause variation.

Slide 52: Regression

Slide 53: Regression
The model y = 4.5413x − 1.3147 is the "best fit" linear model for the relationship. It is not based on physical laws or causality (e.g., thin trees don't have negative height). It does "explain" about 78% of the variability in the data: R² = 0.7854.

Slide 54: Explained vs Unexplained Variation
For linear regression, the Coefficient of Determination is
R² = Explained Sum of Squares / Total Sum of Squares = 1 − Residual Sum of Squares / Total Sum of Squares

Slide 55: Coefficient of Determination
0 ≤ R² ≤ 1. Note in our example the Coefficient of Determination R² is equal to the square of the correlation coefficient r: R² = 0.7854 = r² = 0.886². This is generally the case for simple linear regression (one independent variable).
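The R² = r² identity for simple linear regression can be verified from first principles (the data set below is made up):

```python
# For simple linear regression (one independent variable), the coefficient
# of determination R^2 equals the squared correlation coefficient r^2.
import math

x = [2.0, 3.0, 5.0, 7.0, 9.0, 11.0]
y = [7.5, 12.0, 21.0, 30.5, 41.0, 46.5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
syy = sum((yi - my) ** 2 for yi in y)

b1 = sxy / sxx                  # slope (beta_1)
b0 = my - b1 * mx               # intercept (beta_0)

ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_res / syy    # coefficient of determination
r = sxy / math.sqrt(sxx * syy)  # correlation coefficient

print(r_squared, r ** 2)        # the two agree
```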

Slide 56: Excel Output
– R²
– β0, β1
– Explained Sum of Squares
– Residual Sum of Squares
– Residual Sum of Squares normalized by the degrees of freedom
– Standard Error: the square root of the normalized Residual Sum of Squares

Slide 57: Standard Error
A sort of standard deviation about the regression line: how widely dispersed are observations about the regression line?

Slide 58: P-values
A p-value indicates how likely it is to see this value if the true value of the coefficient is 0 and there's as much noise as we see in the data. A small p-value on the slope is strong evidence there's a relationship; a larger p-value on the intercept is weak evidence the intercept isn't 0.

Slide 59: Multiple Linear Regression
With more than one independent variable, e.g., Sales = β0 + β1·GDP + β2·Unemployment Rate + …
Need to watch out for:
– Non-linearity: the relationship might not be linear, e.g., weight of the tree vs trunk diameter.
– Multicollinearity: one independent variable is a linear function of another (eliminate one).
– Over-specified model: adding more independent variables increases R², but reduces the degrees of freedom in the fit. Adjusted R² attempts to account for this.

Slide 60: Static Regression
Salaries. Independent data.

Slide 61: Excel's LINEST Array Function
Linest(Y-array, Array of X's, [const], [stat]) outputs the β's, one column for each β. Remember array functions are entered with Ctrl-Shift-Enter. Allows you to perform running regressions. Coefficients come out in reverse order (go figure).

Slide 62: LinEst Regression.xls

Slide 63: Questions?

Slide 64: LTL Rates
Complicated to work with rate engines. Alternative: model the rates. Challenge:
– Model for rates = Max{Min Charge, Min{Intercept1 + Rate1 × Weight × Distance, Intercept2 + Rate2 × Weight × Distance}}
Come up with estimates for Min Charge, the intercepts and the rates. Winner: minimum sum of squared errors.
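The rate model on this slide is easy to sketch once the parameters are estimated (all parameter values below are illustrative, not fitted):

```python
# LTL rate model from the slide: the charge is the larger of a minimum
# charge and the cheaper of two linear (weight x distance) schedules.

def ltl_rate(weight, distance,
             min_charge=75.0,
             intercept1=40.0, rate1=0.00012,
             intercept2=120.0, rate2=0.00007):
    lane = weight * distance
    return max(min_charge,
               min(intercept1 + rate1 * lane,
                   intercept2 + rate2 * lane))

print(ltl_rate(100, 500))    # light shipment: the minimum charge binds
print(ltl_rate(5000, 800))   # heavier shipment: a linear schedule binds
```

Fitting Min Charge, the intercepts and the rates to observed charges by minimizing the sum of squared errors is then a (non-linear) least-squares problem over these five parameters.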

Slide 65: LTL Rates

Slide 66: Agenda
– Assignable-Cause vs Common Variation
– Review of Probability
– Review of Regression
– Forecasting

Slide 67: Forecasting
Forecasting is the effort to determine what we can about the future from the past. We will focus on quantitative methods, i.e., not opinions, judgments, "markets", etc.

Slide 68: General Framework
Future = f(Past) + Residual Error
The Specifics
– What aspects of the past are relevant
– What form of f to use
The Issues
– Accuracy: Is E[Residual Error] ≈ 0?
– Precision: Is σ(Residual Error) small?
– Complexity and cost!

Slide 69: Examples
– Autoregressive or time-series: Past = historical values of the process we are forecasting, e.g., past demand forecasts future demand
– Causal: Past = historical values of "leading indicators" like GDP, employment, housing starts, etc.
– Regression, maximum likelihood: Past may include both historical values and leading indicators

Slide 70: Past & Future
We lump data into time periods:
– Average, total or sample in some period
– Reduces data requirements
– Averaging and totaling smooth the data (remember the Big Idea)
– Actionable
What we're forecasting: y(t) = value at time t.
Past:
– Autoregressive: y(t−1), y(t−2), y(t−3), …
– Leading indicators: x_i(t−1), x_i(t−2), x_i(t−3), …

Slide 71: Specifics
Autoregressive or time-series:
– Moving average: Past = the past n observations y(t−1), …, y(t−n), and f(y(t−1), …, y(t−n)) = (y(t−1) + … + y(t−n))/n. More a tool for understanding the past than for forecasting the future.
– Exponential smoothing: Past = "all" past observations y(t−1), y(t−2), y(t−3), …, combined as a weighted average with geometrically declining weights: forecast = α·y(t−1) + (1−α)·(previous forecast).
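The two schemes on this slide can be sketched in a few lines (a sketch; the smoothing weight α and the initialization are choices, not from the slides):

```python
# Two one-step-ahead forecasts from the slide.

def moving_average_forecast(history, n):
    """Forecast for the next period: mean of the last n observations."""
    return sum(history[-n:]) / n

def exponential_smoothing_forecast(history, alpha, initial=None):
    """f(t) = alpha * y(t-1) + (1 - alpha) * f(t-1), run over the history.
    Initialization defaults to the first observation (an assumption)."""
    forecast = history[0] if initial is None else initial
    for y in history:
        forecast = alpha * y + (1 - alpha) * forecast
    return forecast

sales = [100, 120, 110, 130, 125]
print(moving_average_forecast(sales, 3))            # mean of 110, 130, 125
print(exponential_smoothing_forecast(sales, 0.5))
```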

Slide 72: Specifics
Exponential smoothing with trend:
– Past = "all" past observations y(t−1), y(t−2), y(t−3), …
– Forecast uses exponential smoothing to estimate
  The "Level": weighted average of the observation and the past estimate
  The "Trend": weighted average of the observation and the past estimate
  Forecast m periods in the future = Level + m × Trend
– Details in the Engineering Statistics Handbook
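The level-and-trend recipe on this slide (Holt's method) can be sketched as follows; the update equations are the standard Holt form, and the initialization from the first two observations is an assumption:

```python
# Holt's exponential smoothing with trend: smooth a level and a trend,
# then forecast Level + m * Trend, as on the slide.

def holt_forecast(history, alpha, beta, m):
    """Forecast m periods past the end of history."""
    level = history[0]
    trend = history[1] - history[0]
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + m * trend

# On perfectly linear data the method locks onto the trend exactly.
linear = [10, 20, 30, 40, 50]
print(holt_forecast(linear, 0.5, 0.5, 1))   # next value: 60
print(holt_forecast(linear, 0.5, 0.5, 3))   # three ahead: 80
```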

Slide 73: Specifics
Exponential smoothing with trend & seasonality:
– Past = "all" past observations y(t−1), y(t−2), y(t−3), …
– Forecast uses exponential smoothing to estimate
  The "(de-seasonalized) Level": weighted average of the de-seasonalized observation and the past de-seasonalized estimate
  The "Trend": weighted average of the observation and the past estimate
  The "Seasonal factors": weighted average of the observation and the past estimate
  Forecast m periods in the future = (Level + m × Trend) × Seasonal Factor
– Details in the Engineering Statistics Handbook

Slide 74: Specifics
Exponential smoothing with … you get the idea. Issues:
– Initialization data
– Choosing the weights
– Growing complexity

Slide 75: Specifics
Regression, maximum likelihood:
– Past = past observations y(t−1), y(t−2), y(t−3), … and leading indicators x_i(t−1), x_i(t−2), …
– f(y(t−1), y(t−2), y(t−3), …, x_i(t−1), x_i(t−2), …) is some function of these past values. Examples:
  Linear
  Non-linear models: diffusion, logit, probit

Slide 76: Bass Diffusion
Postulates a form for sales over the life of a product. Three parameters:
– m: the total potential market
– p, q: shape parameters
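A minimal sketch of the Bass model's curve: the cumulative-adoption formula below is the standard closed form, with p the innovation and q the imitation parameter; the parameter values are illustrative, not from the slides:

```python
# Bass diffusion: F(t) is the cumulative fraction of the market m that has
# adopted by time t; per-period sales are the new adopters in that period.
import math

def bass_cumulative(t, p, q):
    """Cumulative adoption fraction F(t) under the Bass model."""
    e = math.exp(-(p + q) * t)
    return (1 - e) / (1 + (q / p) * e)

def bass_sales(t, m, p, q):
    """Sales in period t: adopters between t-1 and t, scaled by market m."""
    return m * (bass_cumulative(t, p, q) - bass_cumulative(t - 1, p, q))

m, p, q = 100_000, 0.03, 0.38   # illustrative parameter values
curve = [bass_sales(t, m, p, q) for t in range(1, 31)]
peak = max(range(30), key=lambda i: curve[i]) + 1
print(peak)   # sales peak partway through the life cycle, then decline
```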

Slide 77: Questions?

Slide 78: Top Down vs Bottom Up
Often faced with forecasting 100s of families and 1,000s of SKUs. Different options:
– Top down: develop an aggregate forecast and allocate it to a more detailed level
– Bottom up: develop individual detailed forecasts and aggregate them up

Slide 79: Laws of Forecasting
– Law 1: Forecasts are wrong
– Law 2: Forecast demand, not sales
– Law 3: It is generally easier to forecast aggregate data than it is to forecast the details. (Big Idea)
– Law 4: It is generally easier to forecast a short time into the future than to forecast far into the future
– Law 5: Simpler forecasts are generally better forecasts

Slide 80: Examples from ABL
Sales for SKU 8295. Raw data.

Slide 81: Variability?
How do we forecast this? How do we assign a variability to this? Not actionable!

Slide 82: Sales by Day
We can make ordering decisions on a daily basis.

Slide 83: Weekly Sales
Or on a weekly basis.

Slide 84: Or a Monthly Basis
Or on a monthly basis. Which is appropriate?

Slide 85: Compare the Variability.

Slide 86: The Big Idea
Average daily sales: 1,280.196. Std dev in daily sales: 1,546.472. Average weekly sales: 6,400.981. Std dev in weekly sales: 5,971.578. Avg weekly sales = 5 × average daily sales. What about the relationship between the variabilities? 5 × std dev in daily sales = 7,732.361. What does the Big Idea say we should expect?

Slide 87: The Big Idea
If sales from day to day are independent, we should see 5 × Variance in daily sales = Variance in weekly sales, i.e., Sqrt(5) × std dev in daily sales = std dev in weekly sales. But Sqrt(5) × std dev in daily sales = 3,458.017 < 5,971.578. So sales from day to day are (auto)correlated.
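The slide's independence test can be checked on synthetic data (the daily series below is generated, not the ABL data; only the √5 scaling under independence is the point):

```python
# Under day-to-day independence, the weekly stdev should be sqrt(5) times
# the daily stdev (5 selling days per week). The ABL ratio of
# 5971.578 / 1546.472 ~ 3.9 is well above sqrt(5) ~ 2.24, signaling
# positive day-to-day (auto)correlation.
import math
import random
import statistics

random.seed(7)
n_weeks = 20_000
daily = [random.gauss(1280, 1546) for _ in range(5 * n_weeks)]
weekly = [sum(daily[5 * w:5 * w + 5]) for w in range(n_weeks)]

ratio = statistics.pstdev(weekly) / statistics.pstdev(daily)
print(ratio, math.sqrt(5))   # close, because these days are independent
```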

Slide 88: Forecasting
Try simple techniques:
– Moving average
– Exponential smoothing
– Exponential smoothing with trend
– Exponential smoothing with trend & seasonality
– (Auto)regression
– Bass diffusion

Slide 89: Moving Average
Average error: 666.29 (overestimates). Std dev of error: 4,094 (< std dev in sales).

Slide 90: Exponential Smoothing
Average error: 786.13 (less accurate). Std dev in errors: 4,308 (less precise).

Slide 91: Exponential Smoothing w/ Trend
Average error: −167.58 (more accurate). Std dev in errors: 5,065 (less precise).

Slide 92: Exp. Smoothing w/ Trend & Seasonality
Seasonality by: day of week, week of month, week of year.

Slide 93: Regression
Using the previous 2 weeks, previous 3 weeks, … Might use week of month too.

Slide 94: Previous 2 Weeks
Average error: 2,096. Std dev in error: 4,929.

Slide 95: Previous 3 Weeks
Average error: 1,837. Std dev in error: 5,010.

Slide 96: Previous 4 Weeks
Average error: 1,664. Std dev in error: 5,367.29.

Slide 97: Bass Diffusion
Average error: −1,536.17. Std dev in error: 5,324.07.

Slide 98: Conclusion
The simple moving average provides the best forecast on a weekly basis; exponential smoothing is better on a monthly basis. Now let's build a demand distribution given a forecast. Idea: compare Actual Sales to Forecasted Sales through the ratio Actual/Forecast. Accuracy means the average ratio should be 1; precision means its std dev should be small.

Slide 99: Demand Distribution
We know the forecast is WRONG, but it does give us some information. What Actual Sales will be is uncertain, but we can develop a distribution for it. What are the chances Actual Sales are larger than X? Smaller than Y? …

Slide 100: Actual to Forecast Ratios
μ, the average ratio, is 1.1 (what does that mean?). σ, the std dev, is 0.87. Ratio < 1: over-forecast. Ratio > 1: under-forecast.

Slide 101: Translate Forecast into Demand Distribution
Assuming things continue as they have… if the forecast is 100, what do we expect actual sales to be? 110. So, if μ is the average Actual/Forecast ratio and F is the forecast, expected demand is μF. What is the spread of actual demand about this mean value?

Slide 102: Translate Forecast into Demand Distribution
If things continue as they have, it is natural to assume we will draw from the historical distribution of Actual/Forecast ratios. So, if μ = 1 and the forecast is 100, what should the distribution of Actual Sales be? The mean should be 100, and the std dev of actual demand about this mean should be 100σ. Why?

Slide 103: Demand Distribution
The Actual/Forecast ratio has a distribution with mean μ and std dev σ. If μ = 1, the distribution for Actual Sales is just F, the forecast, times the Actual/Forecast ratio, so it has mean F and std dev σF. If μ ≠ 1, the distribution for Actual Sales is still F times the Actual/Forecast ratio, so it has mean μF and std dev σF.
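The translation on these slides is a one-liner once the ratio history is in hand (the ratio history below is made up; its mean happens to be 1.1 like the slides', but its std dev is not 0.87):

```python
# Turn a forecast F into a demand distribution by scaling F with the
# historical Actual/Forecast ratio distribution (mean mu, std dev sigma).
import statistics

ratios = [0.4, 1.9, 0.7, 1.3, 0.2, 2.1, 1.0, 1.2]   # historical Actual/Forecast
mu = sum(ratios) / len(ratios)
sigma = statistics.pstdev(ratios)

F = 100.0                   # this week's forecast
expected_demand = mu * F    # mean of the demand distribution
demand_stdev = sigma * F    # spread about that mean

print(expected_demand, demand_stdev)
```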

Slide 104: Common Cause Variability
So the common-cause variability in demand for SKU 8295 that we will need to protect against with safety stock is about 87% of the forecasted demand! Just working with raw sales: Std Dev/Average = 93%.

Slide 105: Common Cause Variability
On the other hand, with monthly sales: working with raw sales, Std Dev/Average = 74%; working with the exponential smoothing forecast, Std Dev/Average = 36%.

