Markov Decision Models for Order Acceptance/Rejection Problems - Florian Defregger and Heinrich Kuhn, Catholic University of Eichstätt-Ingolstadt


Markov Decision Models for Order Acceptance/Rejection Problems
Florian Defregger and Heinrich Kuhn
Catholic University of Eichstätt-Ingolstadt
Fifth International Conference on "Analysis of Manufacturing Systems - Production Management"
Zakynthos, May 24th, 2005

Structure
1. Introduction
2. Decision Problem
3. Markov Decision Model
4. Solution Procedure
5. Numerical Results

Introduction
Revenue Management (RM)
- Service industries (air transportation, hotels, car rental, etc.)
- Manufacturing industries (steel, paper, aluminum, etc.), see Kniker/Burman (2001)
- Implementations of RM systems have increased profits by 2-10%.

Introduction
Which kind of manufacturing company could potentially use revenue management to increase the bottom line? One with:
a) high fixed costs
b) capacity that is very expensive or even impossible to increase at short notice to meet demand peaks
c) demand that fluctuates over time
d) customers who are willing to pay different prices for essentially the same product

Steps of an RM system
1. Customer segmentation: customers are segmented into customer classes, where each customer class has its own data:
- lead time specified by the customers of this class
- price (profit margin) per order of these customers
- processing time per order of these customers
- probability of arrival for an order of the customer class in a given time period (to be estimated)
2. Capacity optimization:
- assignment of capacity booking limits to each customer class
- rejection of customers with lower profit margins when certain capacity utilization levels are reached

Decision problem
Assumptions:
- one single bottleneck in the manufacturing process
- orders have a specific price, volume, and lead time (due date)
- at most one order arrives in a given time period
- arrivals are independent of one another
- products can be made to stock
- limited inventory capacity
- infinite planning horizon

Decision problem
1. Accept the order? yes/no
2. If yes: how much inventory should be used to satisfy it?

Notation
Orders: N order classes, n ∈ {1, ..., N}; each arriving order belongs to exactly one order class. A dummy order class 0 represents periods in which no order arrives.
Parameters for orders of class n:
- m_n: profit margin
- u_n: capacity usage
- l_n: lead time
- p_n: probability of arriving

Notation
Inventory:
- I_max: maximum inventory level
- i: inventory level, i ∈ {0, 1, ..., I_max}
- h: inventory holding cost per unit of inventory per period
The inventory level i is expressed in the number of periods the machine needed to produce that inventory.

Notation
States (n, c, i) ∈ S (state space):
- n: order class of the order that arrived at the beginning of the current period
- c: number of periods the machine is reserved for orders that have been accepted but not yet finished, c ∈ {0, 1, ..., H}
- i: current inventory level
H − c: available capacity within the considered horizon H
Problem size: one state per combination of n, c, and i, so the number of states grows with the product of N, H, and I_max.
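A minimal sketch of the model data in Python, assuming the notation above; the names (OrderClass, State, num_states) are illustrative, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OrderClass:
    m: float  # profit margin per order
    u: int    # capacity usage: periods needed on the bottleneck machine
    l: int    # lead time: periods until the order is due
    p: float  # probability that an order of this class arrives in a period

@dataclass(frozen=True)
class State:
    n: int  # class of the order that just arrived (0 = no arrival)
    c: int  # periods the machine is reserved for accepted, unfinished orders
    i: int  # inventory level, in machine-periods of output

def num_states(N: int, H: int, I_max: int) -> int:
    # one state per combination of n, c and i
    return (N + 1) * (H + 1) * (I_max + 1)

print(num_states(N=5, H=40, I_max=10))  # 2706
```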

Sequence of Decisions
In state (n, c, i) (n: order class, c: machine usage, i: inventory level) the set of feasible decisions D[(n, c, i)] is:
- D1: reject and do not raise the inventory level (always feasible)
- D2: reject and raise the inventory level, if c = 0 and i < I_max
- D3(r): accept, do not raise the inventory level, and satisfy the order with r units from inventory, if n > 0 and c + u_n ≤ l_n + min(i, u_n); r ∈ {r_min, ..., r_max} with r_min = max(0, c + u_n − l_n) and r_max = min(i, u_n)
- D4: accept, satisfy the order completely from inventory, and raise the inventory level, if n > 0, c = 0 and u_n ≤ i
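A hedged sketch of this decision set in Python; the tuple encoding of decisions and the r_min/r_max names are mine, not the authors':

```python
def feasible_decisions(n, c, i, classes, I_max):
    """Enumerate D[(n, c, i)].  classes maps a class index to (m, u, l, p)."""
    D = [("D1",)]                            # reject, keep inventory level
    if c == 0 and i < I_max:
        D.append(("D2",))                    # reject, produce one unit to stock
    if n > 0:
        _, u, l, _ = classes[n]
        r_min = max(0, c + u - l)            # stock needed to meet the due date
        r_max = min(i, u)                    # bounded by stock and order size
        for r in range(r_min, r_max + 1):
            D.append(("D3", r))              # accept, use r units of stock
        if c == 0 and u <= i:
            D.append(("D4",))                # accept from stock, then restock
    return D

# Example: class 1 with margin 20, usage 4, lead time 10, arrival prob. 0.3
print(feasible_decisions(n=1, c=0, i=2, classes={1: (20, 4, 10, 0.3)}, I_max=5))
# [('D1',), ('D2',), ('D3', 0), ('D3', 1), ('D3', 2)]
```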

Rewards
R_D1 = R_D2 = −h · i
R_D3(r) = m_n − h · (i − r)
R_D4 = m_n − h · (i − u_n)
D1: reject and do not raise the inventory level
D2: reject and raise the inventory level
D3: accept and do not raise the inventory level
D4: accept and raise the inventory level
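The corresponding reward function, under the same tuple encoding as in the sketch above:

```python
def reward(decision, n, i, classes, h):
    """Immediate reward of a decision in state (n, c, i); h = holding cost."""
    if decision[0] in ("D1", "D2"):
        return -h * i                        # only holding costs accrue
    m, u, _, _ = classes[n]
    if decision[0] == "D3":
        r = decision[1]
        return m - h * (i - r)               # margin minus cost of remaining stock
    return m - h * (i - u)                   # D4: order served entirely from stock
```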

Time-discrete Markov Decision Process
Objective: find the best action for every state in order to maximize the long-term average reward per period.
|D| = number of decision possibilities
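In standard average-reward notation the objective can be written as below; the gain symbol g and policy symbol π are conventional, not taken from the slides:

```latex
g^{\pi} \;=\; \lim_{T \to \infty} \frac{1}{T}\,
\mathbb{E}^{\pi}\!\left[\sum_{t=1}^{T} R_t\right],
\qquad \text{maximize } g^{\pi} \text{ over all policies } \pi .
```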

May 24, p m,  (n, c, i)  {S : c  0}, m  {0,..., N} 0,else P D1 [(n, c, i), (m, c – 1, i)] = n, m: order class c: machine usage i: inventory level Transition Probabilities = p m,  (n, c, i)  S, m  {0,..., N}, r  {min(max(0, c + u n – l n ), min(i, u n ),..., min(i, u n )} 0,else P D3(r) [(n, c, i), (m, c + u n – r – 1, i – r )] = D1: reject and do not raise inventory level D3: accept and do not raise inventory level

May 24, P D2 [(n, 0, i), (m, 0, i + 1)] = p m,  n, m  {0,..., N}, i  {0,..., I max – 1} 0, else p m,  (n, c, i)  S, m  {0,..., N} 0,else P D4 [(n, 0, i), (m, 0, i – u n + 1)] = n, m: order class c: machine usage i: inventory level Transition Probabilities p m,  n, m  {0,..., N}, i  {0,..., I max } 0,else P D1 [(n, 0, i), (m, 0, i)] = P D3(r) [(n, 0, i), (m, max(0,u n – r – 1), i – r )] = … D1: reject and do not raise inventory level D2: reject and raise inventory level D3: accept and do not raise inventory level D4: accept and raise inventory level

Solution Procedure
This Markov decision process can be solved with standard methods, e.g. linear programming, policy iteration, or value iteration. For large problem instances, however, the computation times are too long (see Numerical Results).
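As a concrete instance of one such standard method, here is a hedged sketch of relative value iteration for the average-reward criterion on a generic finite MDP; this is the textbook scheme, not the authors' implementation:

```python
def relative_value_iteration(states, actions, P, R, eps=1e-8, max_iter=10**5):
    """P[s][a] -> list of (prob, next_state); R[s][a] -> immediate reward.
    Assumes a unichain, aperiodic model; pins a reference state's value to 0."""
    h = {s: 0.0 for s in states}
    ref = states[0]
    g = 0.0
    for _ in range(max_iter):
        h_new = {s: max(R[s][a] + sum(q * h[t] for q, t in P[s][a])
                        for a in actions[s])
                 for s in states}
        g = h_new[ref]                   # current estimate of the average reward
        h_new = {s: v - g for s, v in h_new.items()}
        if max(abs(h_new[s] - h[s]) for s in states) < eps:
            break
        h = h_new
    return g, h_new
```

For periodic chains, a standard aperiodicity transformation (mixing each transition with a small self-loop probability) is usually applied first.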

Solution Procedure
Heuristic
Objective: find good policies within acceptable runtimes.
Idea: reject "bad" order classes and accept "good" order classes.
"Goodness" of an order class: its relative profit margin m_n / u_n [profit per unit of capacity usage].
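For illustration, ranking the three order classes of the example shown later in the talk by this criterion:

```python
# n -> (m_n, u_n): margins 20/60/100, capacity usage 4 each
classes = {1: (20.0, 4), 2: (60.0, 4), 3: (100.0, 4)}
ranked = sorted(classes, key=lambda n: classes[n][0] / classes[n][1])
print(ranked)                                            # [1, 2, 3]
print([classes[n][0] / classes[n][1] for n in ranked])   # [5.0, 15.0, 25.0]
```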

Solution Procedure
Consider an "accept if possible" order class, e.g. n = 4 or n = 5: acceptance levels increase with lower machine usage or higher inventory levels.

Solution Procedure
Consider an "accept in favorable situations" order class, e.g. n = 2 or n = 3: acceptance levels increase with lower machine usage or higher inventory levels.

Solution Procedure
Policies can be approximated by an N-dimensional vector A^T = (a_1, a_2, ..., a_N), where element a_n specifies at what inventory level i orders of class n can be accepted if the machine usage is 0; a_n ∈ {max(0, u_n − l_n), ..., I_max}.
Example: a_n = 5
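A sketch of the resulting decision rule; the slide defines a_n only for machine usage c = 0, so the due-date feasibility check for c > 0 is an assumption of mine:

```python
def accept_order(n, c, i, a, classes):
    """a[n] is the inventory threshold for class n (defined on the slide for
    c == 0); the feasibility check for c > 0 is an added assumption."""
    if n == 0:
        return False                          # no order arrived this period
    _, u, l, _ = classes[n]
    feasible = max(0, c + u - l) <= min(i, u) # order can still meet its due date
    return feasible and i >= a[n]
```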

Solution Procedure
The result is a combinatorial optimization problem in N dimensions.
Idea for the heuristic: evaluate the average reward of candidate policies A^T = (a_1, a_2, ..., a_N) via simulation and find good policies by simulation comparisons.
Example: N = 5

Solution Procedure
Simulation comparison of two policies: each policy corresponds to a Markov reward process. Both Markov chains are simulated, and at the end of each replication the average reward of each policy is estimated. If the difference of the average rewards is greater than 0 at a certain confidence level, the simulation comparison stops; otherwise another replication is made.
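A sketch of this stopping rule, assuming a hypothetical simulate(policy, seed) function that returns one replication's average reward per period; the paired seeds (common random numbers) and the normal-approximation test are my simplifications, not necessarily the authors' exact procedure:

```python
import math
import statistics

def compare_policies(simulate, policy_a, policy_b,
                     z=1.96, min_reps=10, max_reps=1000):
    """Paired replications: a positive result means policy_a looks better."""
    diffs = []
    for rep in range(max_reps):
        # common random numbers: both policies see the same arrival stream
        diffs.append(simulate(policy_a, seed=rep) - simulate(policy_b, seed=rep))
        if len(diffs) >= min_reps:
            mean = statistics.mean(diffs)
            half_width = z * statistics.stdev(diffs) / math.sqrt(len(diffs))
            if abs(mean) > half_width:        # difference significant: stop early
                return mean
    return statistics.mean(diffs)             # undecided after max_reps
```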

Solution Procedure
Policy i:
- order classes n ∈ {0, 1, ..., i} are completely rejected
- order classes n ∈ {i + 1, ..., N} are completely accepted
R(i): average reward of policy i

Solution Procedure
Procedure:
1. Sort the order classes in ascending order of their relative profit margins.
2. Close order classes successively, n = 1, 2, ..., until the maximum of the average reward is reached.
3. The index n at which the average reward attains its maximum R* is called n*.
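The outer loop of the heuristic might look as follows; avg_reward is assumed to be estimated by the simulation comparison above, and the early exit assumes the average reward is unimodal in the number of closed classes, as the slides suggest:

```python
def close_classes(classes, avg_reward):
    """classes: n -> (m, u, l, p); avg_reward(closed) is assumed to be
    estimated via the simulation comparison sketched above."""
    by_margin = sorted(classes, key=lambda n: classes[n][0] / classes[n][1])
    closed = set()
    best, best_reward = frozenset(), avg_reward(frozenset())
    for n in by_margin:                  # close the worst remaining class
        closed.add(n)
        r = avg_reward(frozenset(closed))
        if r < best_reward:              # average reward stopped improving
            break
        best, best_reward = frozenset(closed), r
    return best, best_reward             # best closes exactly classes 1..n*
```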

Solution Procedure
Further improvement of the policy:
- close half of the order class to the right of n*, i.e. n = n* + 1
- open half of order class n*
- determine which of these policies yields the maximum average reward

Numerical Results
Problem classes:

problem class           1 / 2 / 3 / 4 / 5
number of states        10,000 / 50,000 / 100,000 / 500,000 / 1,000,000
number of instances     100
order classes           [5, 20], [10, 30], [20, 50] (remaining values lost)
maximum inventory       (values lost)
relative profit margin  [1, 3]
maximum lead time       (values lost)
inventory cost          0.01
traffic intensity       [1.5, 2.5]

Numerical Results
Average reward per period: FCFS policy vs. value iteration algorithm.
[Table values lost in the transcript; rows: proportion optimum [%], runtime value iteration [sec.], average [%], minimum [%], maximum [%], standard deviation [%]; columns: problem classes 1-5.]

Numerical Results
Average reward per period: heuristic procedure vs. value iteration algorithm.
[Table values lost in the transcript, except minimum [%] = 0.0; rows: proportion optimum [%], running time heuristic [sec.], running time value iteration [sec.], average [%], minimum [%], maximum [%], standard deviation [%]; columns: problem classes 1-3.]

Numerical Results
Average reward per period: FCFS policy vs. heuristic procedure.
[Table values lost in the transcript, except minimum [%] = 0.0; rows: runtime FCFS [sec.], runtime heuristic [sec.], average [%], minimum [%], maximum [%], standard deviation [%]; columns: problem classes 1-5.]

Numerical Results
Example with three order classes:

order class                 1          2          3
lead time                   10         4          2
profit margin               20.00 €    60.00 €    100.00 €
capacity usage              4          4          4
relative profit margin      5.00       15.00      25.00
relative traffic intensity  60%        30%        10%

(The relative profit margin is m_n / u_n, e.g. 20.00 / 4 = 5.00 for class 1.)

Numerical Results
Average reward per period: heuristic procedure vs. value iteration algorithm.
[Three charts, not preserved in the transcript.]

May 24, Thank you for your attention.