Slide 1: Markov Decision Models for Order Acceptance/Rejection Problems
Florian Defregger and Heinrich Kuhn
Catholic University of Eichstätt-Ingolstadt
Fifth International Conference on "Analysis of Manufacturing Systems - Production Management"
Zakynthos, May 24, 2005
Slide 2: Structure
1. Introduction
2. Decision Problem
3. Markov Decision Model
4. Solution Procedure
5. Numerical Results
Slide 3: Introduction
Revenue Management (RM)
– Service industries (air transportation, hotels, car rental, etc.)
– Manufacturing industries (steel, paper, aluminum, etc.), see Kniker/Burman (2001)
– Implementations of RM systems have increased profits by 2–10%.
Slide 4: Introduction
Which kind of manufacturing company could potentially use revenue management to increase the bottom line? One where:
a) fixed costs are high
b) a short-term increase of capacity to meet demand peaks is very expensive or even impossible
c) demand fluctuates over time
d) customers are willing to pay different prices for essentially the same product
Slide 5: Steps of an RM System
1. Customer segmentation: customers are segmented into customer classes, where each customer class has its own data:
– lead time specified by the customers of this class
– price (profit margin) per order of these customers
– processing time per order of these customers
– probability that an order of this class arrives in a given time period (to be estimated)
2. Capacity optimization:
– assignment of capacity booking limits to each customer class
– rejection of customers with lower profit margins when certain capacity utilization levels are reached
Slide 6: Decision Problem – Assumptions
– One single bottleneck in the manufacturing process
– Orders have a specific price, volume, and lead time (due date)
– At most one order arrives in a given time period; arrivals are independent of one another
– Products can be made to stock
– Limited inventory capacity
– Infinite planning horizon
Slide 7: Decision Problem
1. Accept the order? (yes/no)
2. If yes: how much inventory should be used?
Slide 8: Notation – Orders
N order classes, n ∈ {1, ..., N}; each arriving order is assigned to exactly one order class.
Parameters for orders of class n:
– m_n: profit margin
– u_n: capacity usage
– l_n: lead time
– p_n: probability of arrival in a given period
The dummy order class 0 represents periods in which no order arrives.
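For illustration, the per-class parameters above can be collected in a small record type; a minimal Python sketch (names are ours; the numbers reuse the three-class example from slide 30, while the arrival probabilities are hypothetical values chosen in the 60:30:10 ratio of that example's traffic intensities):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OrderClass:
    m: float  # profit margin per order
    u: int    # capacity usage (periods of machine time)
    l: int    # lead time (periods until the order is due)
    p: float  # probability that an order of this class arrives in a period

# Hypothetical data; class 0 is the dummy class "no order arrives this period".
classes = {
    0: OrderClass(m=0.0,   u=0, l=0,  p=0.50),
    1: OrderClass(m=20.0,  u=4, l=10, p=0.30),
    2: OrderClass(m=60.0,  u=4, l=4,  p=0.15),
    3: OrderClass(m=100.0, u=4, l=2,  p=0.05),
}
```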
Slide 9: Notation – Inventory
– I_max: maximum inventory level
– i: inventory level, i ∈ {0, 1, ..., I_max}
– h: inventory holding cost per unit of inventory per period
The inventory level i is expressed in the number of periods the machine needed to produce that inventory.
Slide 10: Notation – States
States (n, c, i) ∈ S (state space):
– n: order class of the order that arrived at the beginning of the current period
– c: number of periods the machine is reserved for already accepted but not yet finished orders, c ∈ {0, 1, ..., H}
– i: current inventory level
– H − c: available capacity within the considered horizon H
Problem size: |S| grows with the product of the three state dimensions n, c, and i.
Slide 11: Sequence of Decisions
D[(n, c, i)] is the set of feasible decisions in state (n, c, i), where n is the order class, c the machine usage, and i the inventory level:
– D1 := reject and do not raise the inventory level (always feasible)
– D2 := reject and raise the inventory level, feasible if c = 0 ∧ i < I_max
– D3(r) := accept, do not raise the inventory level, and satisfy the order with r units from inventory, feasible if n > 0 ∧ c + u_n − l_n ≤ min(i, u_n), with r ∈ {r_min, ..., r_max} where r_min = max(0, c + u_n − l_n) and r_max = min(i, u_n)
– D4 := accept, satisfy the order completely from inventory, and raise the inventory level, feasible if n > 0 ∧ c = 0 ∧ u_n ≤ i
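A sketch of these feasibility conditions as we read them from the slide (and from the r-range on slide 14), using the `OrderClass` record from before:

```python
def feasible_decisions(n, c, i, cls, I_max):
    """Enumerate the feasible decisions in state (n, c, i).

    D1 is always feasible; D2 needs an idle machine and free inventory
    space; D3(r)/D4 need a real order (n > 0) that can meet its due date.
    """
    D = [("D1", None)]                      # reject, keep inventory
    if c == 0 and i < I_max:
        D.append(("D2", None))              # reject, produce one unit to stock
    if n > 0:
        u, l = cls[n].u, cls[n].l
        r_min = max(0, c + u - l)           # stock needed to meet the lead time
        r_max = min(i, u)                   # stock that can be used at all
        if r_min <= r_max:
            for r in range(r_min, r_max + 1):
                D.append(("D3", r))         # accept, take r units from stock
        if c == 0 and u <= i:
            D.append(("D4", None))          # accept fully from stock, then restock
    return D
```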
Slide 12: Rewards
– R_D1 = R_D2 = −h · i
– R_D3(r) = m_n − h · (i − r)
– R_D4 = m_n − h · (i − u_n)
(D1: reject, do not raise inventory; D2: reject, raise inventory; D3: accept, do not raise inventory; D4: accept, raise inventory)
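The reward formulas translate directly into code; a minimal sketch matching the decision encoding above:

```python
def reward(decision, n, c, i, cls, h):
    """One-period reward of a decision in state (n, c, i), per slide 12."""
    kind, r = decision
    if kind in ("D1", "D2"):
        return -h * i                       # holding cost only
    if kind == "D3":
        return cls[n].m - h * (i - r)       # margin minus cost on remaining stock
    if kind == "D4":
        return cls[n].m - h * (i - cls[n].u)
    raise ValueError(kind)
```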
Slide 13: Time-Discrete Markov Decision Process
Objective: find the best action for every state in order to maximize the long-term average reward per period.
|D| = number of decision possibilities.
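For reference, this objective corresponds to the standard average-reward optimality equation, which the slides do not spell out; in textbook form, with gain g and bias values v(s):

```latex
g + v(s) \;=\; \max_{d \in D[s]} \Big\{ R_d(s) + \sum_{s' \in S} P_d[s, s']\, v(s') \Big\},
\qquad s = (n, c, i) \in S .
```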
Slide 14: Transition Probabilities (c > 0)
P_D1[(n, c, i), (m, c − 1, i)] = p_m for (n, c, i) ∈ {S : c ≠ 0} and m ∈ {0, ..., N}; 0 otherwise.
P_D3(r)[(n, c, i), (m, c + u_n − r − 1, i − r)] = p_m for (n, c, i) ∈ S, m ∈ {0, ..., N}, and r ∈ {min(max(0, c + u_n − l_n), min(i, u_n)), ..., min(i, u_n)}; 0 otherwise.
(D1: reject, do not raise inventory; D3: accept, do not raise inventory; n, m: order classes, c: machine usage, i: inventory level)
Slide 15: Transition Probabilities (c = 0)
P_D1[(n, 0, i), (m, 0, i)] = p_m for n, m ∈ {0, ..., N} and i ∈ {0, ..., I_max}; 0 otherwise.
P_D2[(n, 0, i), (m, 0, i + 1)] = p_m for n, m ∈ {0, ..., N} and i ∈ {0, ..., I_max − 1}; 0 otherwise.
P_D3(r)[(n, 0, i), (m, max(0, u_n − r − 1), i − r)] = …
P_D4[(n, 0, i), (m, 0, i − u_n + 1)] = p_m for (n, 0, i) ∈ S and m ∈ {0, ..., N}; 0 otherwise.
(D1: reject, do not raise inventory; D2: reject, raise inventory; D3: accept, do not raise inventory; D4: accept, raise inventory)
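The transitions above separate into a deterministic bookkeeping part (machine reservation and inventory) and the random next arrival. A minimal simulator sketch under that reading (function names are ours):

```python
import random

def next_state(state, decision, m, cls):
    """Successor state when `decision` is taken in `state` and an order of
    class m arrives next period (m = 0 means no arrival)."""
    n, c, i = state
    kind, r = decision
    if kind == "D1":
        return (m, max(0, c - 1), i)                    # machine works off its backlog
    if kind == "D2":
        return (m, 0, i + 1)                            # idle machine produces to stock
    if kind == "D3":
        return (m, max(0, c + cls[n].u - r - 1), i - r)
    if kind == "D4":
        return (m, 0, i - cls[n].u + 1)
    raise ValueError(kind)

def sample_step(state, decision, cls, rng=random):
    """Draw the next arrival with probabilities p_m and advance one period."""
    m = rng.choices(list(cls), weights=[cls[k].p for k in cls])[0]
    return next_state(state, decision, m, cls)
```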
Slide 16: Solution Procedure
This Markov decision process can be solved with standard methods, e.g. linear programming, policy iteration, or value iteration. For large problem instances, however, the computation times are too long (see the numerical results below).
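As one concrete instance of these standard methods, here is a relative value iteration sketch for the average-reward criterion, reusing the helpers from the earlier sketches; it assumes S enumerates every reachable state, and is illustrative rather than the authors' implementation:

```python
def relative_value_iteration(S, cls, I_max, h, eps=1e-6, max_iter=10_000):
    """Relative value iteration for the average-reward MDP defined above."""
    v = {s: 0.0 for s in S}
    ref = S[0]                                  # reference state for normalization
    gain = 0.0
    for _ in range(max_iter):
        v_new = {}
        for s in S:
            n, c, i = s
            best = float("-inf")
            for d in feasible_decisions(n, c, i, cls, I_max):
                # expected continuation value over the next arrival m
                exp_v = sum(cls[m].p * v[next_state(s, d, m, cls)] for m in cls)
                best = max(best, reward(d, n, c, i, cls, h) + exp_v)
            v_new[s] = best
        gain = v_new[ref]                       # approximates the average reward g
        v_new = {s: val - gain for s, val in v_new.items()}
        if max(abs(v_new[s] - v[s]) for s in S) < eps:
            break
        v = v_new
    return gain, v
```

The per-iteration cost is proportional to |S| times the number of decisions per state, which is why runtimes explode for the larger problem classes reported later.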
Slide 17: Solution Procedure – Heuristic
Objective: find good policies within acceptable runtimes.
Idea: reject "bad" order classes and accept "good" order classes.
"Goodness" of an order class n: its relative profit margin m_n / u_n [profit per unit of capacity usage].
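With the three-class example from slide 30, this ranking is straightforward to compute (reusing the hypothetical `classes` dict from the slide-8 sketch):

```python
# Rank real order classes (n >= 1) by relative profit margin m_n / u_n, best first.
ranked = sorted((n for n in classes if n > 0),
                key=lambda n: classes[n].m / classes[n].u,
                reverse=True)
# With the slide-30 data: [3, 2, 1]  (25.00 > 15.00 > 5.00 profit per capacity period)
```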
Slide 18: Solution Procedure
Consider an "accept if possible" order class, e.g. n = 4 or n = 5: acceptance levels increase with lower machine usage or higher inventory levels.
Slide 19: Solution Procedure
Consider an "accept in favorable situations" order class, e.g. n = 2 or n = 3: acceptance levels again increase with lower machine usage or higher inventory levels.
Slide 20: Solution Procedure
Policies can be approximated by an N-dimensional vector A^T = (a_1, a_2, ..., a_N), where element a_n specifies at what inventory level i orders of class n can be accepted when the machine usage is 0, with a_n ∈ {max(0, u_n − l_n), ..., I_max}.
Example: a_n = 5.
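Such a vector reduces the accept/reject decision at an idle machine to a threshold test; a minimal sketch of our reading (states with c > 0 would additionally need the feasibility check from slide 11):

```python
def accept(n, c, i, a):
    """Threshold policy A = (a_1, ..., a_N): accept an order of class n
    arriving at an idle machine (c = 0) iff inventory has reached a[n]."""
    return n > 0 and c == 0 and i >= a[n]
```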
Slide 21: Solution Procedure
The result is a combinatorial optimization problem in N dimensions.
Idea for the heuristic: evaluate the average reward of candidate policies A^T = (a_1, a_2, ..., a_N) via simulation and find good policies through simulation comparisons.
Example: N = 5.
Slide 22: Solution Procedure
Simulation comparison of two policies: each policy corresponds to a Markov reward process. Both Markov chains are simulated, and at the end of each replication the average reward of each policy is estimated. If the difference of the average rewards is greater than 0 at a certain confidence level, the comparison stops; otherwise another replication is made.
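The slides do not specify the statistical test; one plausible realization is a sequential comparison using a normal-approximation confidence interval on the per-replication reward difference:

```python
from statistics import NormalDist, mean, stdev

def compare(sim_a, sim_b, alpha=0.05, min_reps=10, max_reps=1000):
    """Sequentially compare two policies by simulated average reward.

    sim_a()/sim_b() run one replication each and return its average reward
    per period. Stops once a (1 - alpha) confidence interval for the mean
    difference excludes 0; returns +1 if A looks better, -1 if B does,
    0 if still undecided after max_reps replications.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    diffs = []
    for _ in range(max_reps):
        diffs.append(sim_a() - sim_b())
        if len(diffs) >= min_reps:
            half = z * stdev(diffs) / len(diffs) ** 0.5
            d = mean(diffs)
            if d - half > 0:
                return +1
            if d + half < 0:
                return -1
    return 0
```

Common random numbers across the two simulators would sharpen the comparison; the slides leave this choice open.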
Slide 23: Solution Procedure
Policy i:
– order classes n ∈ {0, 1, ..., i} are completely rejected
– order classes n ∈ {i + 1, ..., N} are completely accepted
R(i): average reward of policy i
Slide 24: Solution Procedure
Procedure:
1. Sort the order classes in ascending order of their relative profit margins.
2. Close the order classes successively, n = 1, 2, ..., until the average reward reaches its maximum.
3. The cutoff whose policy attains the maximum average reward R* is called n*.
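A sketch of this successive-closing search, assuming the classes are already sorted ascending by relative profit margin and that the average reward is unimodal in the cutoff (both implied by the slide); `avg_reward(k)` would be estimated with the simulation comparison above:

```python
def close_successively(N, avg_reward):
    """Close order classes 1, 2, ... until the average reward stops improving.

    avg_reward(k): estimated average reward of policy k, which rejects
    classes 1..k and accepts classes k+1..N. Returns the best cutoff n*
    and its reward R*.
    """
    best_k, best_r = 0, avg_reward(0)       # start by closing nothing
    for k in range(1, N + 1):
        r = avg_reward(k)
        if r < best_r:                      # reward maximum has been passed
            break
        best_k, best_r = k, r
    return best_k, best_r
```

The refinement on the next slide would then perturb the acceptance levels of classes n* and n* + 1 around this cutoff.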
Slide 25: Solution Procedure
Further improvement of the policy:
– close half of the order class to the right of n*, i.e. n = n* + 1
– open half of order class n*
– determine which of these policies yields the maximum average reward
Slide 26: Numerical Results – Problem Classes

problem class          | 1         | 2         | 3         | 4         | 5
number of states       | 10,000    | 50,000    | 100,000   | 500,000   | 1,000,000
number of instances    | 100       | 100       | 100       | 100       | 100
order classes          | [5,20]    | [5,20]    | [10,30]   | [20,50]   | [20,50]
maximum inventory      | 10        | 15        | 20        | 50        | 100
relative profit margin | [1,3]     | [1,3]     | [1,3]     | [1,3]     | [1,3]
maximum lead time      | 151       | 520       | 423       | 466       | 471
inventory cost         | 0.01      | 0.01      | 0.01      | 0.01      | 0.01
traffic intensity      | [1.5,2.5] | [1.5,2.5] | [1.5,2.5] | [1.5,2.5] | [1.5,2.5]
Slide 27: Numerical Results – Average Reward per Period: FCFS Policy vs. Value Iteration Algorithm

problem class                  | 1    | 2     | 3      | 4      | 5
proportion optimum [%]         | 99   | 93    | 94     | 0      | 0
runtime value iteration [sec.] | 82.3 | 880.9 | 1584.1 | 3681.3 | 3741.1
average [%]                    | 4.4  | 3.8   | 4.0    | 2.4    | -8.5
minimum [%]                    | 0.0  | 0.0   | 0.0    | -3.0   | -69.9
maximum [%]                    | 18.3 | 33.9  | 34.2   | 22.2   | 8.6
standard deviation [%]         | 4.7  | 6.2   | 6.0    | 3.9    | 13.6
Slide 28: Numerical Results – Average Reward per Period: Heuristic Procedure vs. Value Iteration Algorithm

problem class                       | 1    | 2     | 3
proportion optimum [%]              | 99   | 93    | 94
running time heuristic [sec.]       | 42.8 | 92.8  | 115.3
running time value iteration [sec.] | 82.3 | 880.9 | 1584.1
average [%]                         | 1.7  | 1.8   | 1.5
minimum [%]                         | 0.0  | 0.0   | 0.0
maximum [%]                         | 17.9 | 33.9  | 23.1
standard deviation [%]              | 2.9  | 4.8   | 3.1
Slide 29: Numerical Results – Average Reward per Period: FCFS Policy vs. Heuristic Procedure

problem class            | 1    | 2    | 3     | 4     | 5
runtime FCFS [sec.]      | 15.0 | 62.8 | 115.3 | 70.5  | 143.2
runtime heuristic [sec.] | 42.8 | 92.8 | 58.3  | 254.8 | 206.9
average [%]              | 2.7  | 2.1  | 2.5   | 2.0   | 1.7
minimum [%]              | 0.0  | 0.0  | 0.0   | 0.0   | 0.0
maximum [%]              | 16.6 | 19.2 | 32.1  | 18.4  | 11.7
standard deviation [%]   | 3.8  | 4.1  | 5.1   | 2.8   | 2.5
Slide 30: Numerical Results – Example with Three Order Classes

order class                | 1       | 2       | 3
lead time                  | 10      | 4       | 2
profit margin              | 20.00 € | 60.00 € | 100.00 €
capacity usage             | 4       | 4       | 4
relative profit margin     | 5.00    | 15.00   | 25.00
relative traffic intensity | 60%     | 30%     | 10%
Slides 31–33: Numerical Results
[Charts: average reward per period, heuristic procedure vs. value iteration algorithm]
Slide 34
Thank you for your attention.