A Markov Decision Model for Determining Optimal Outpatient Scheduling
Jonathan Patrick, Telfer School of Management, University of Ottawa

Motivation
• The unwarranted skeptic and the uncritical enthusiast
• Outpatient clinics in Canada are receiving strong encouragement to switch to open access
• Basic operations research would claim that there is a cost to providing same-day access
• Does the benefit outweigh the cost?

Trade-off
• Any schedule needs to balance system-related benefits/costs (revenue, overtime, idle time, …) against patient-related benefits (access, continuity of care, …)
• The available levers are the decision of how many new requests to serve today and how many requests to book in advance into each future day

Scheduling Decisions [diagram: new demand arriving today is either served today or booked into Day 1 through Day 5]

Literature
• Plenty of evidence that overbooking is advantageous in the presence of no-shows (work by Lawley et al. and by Lawrence et al.)
• Also evidence that a two-day booking window outperforms open access (work by Liu et al. and by Lawrence and Chen)
• The old trade-off between model tractability and model complexity remains

Model Aims
• To create a model that:
  - Incorporates a show rate that depends on the appointment lead time
  - Gives managers the ability to determine the number of new requests to serve today and the number of requests to book into each future day (called the Advanced Booking Policy, ABP)
  - Allows the policy to depend on the current booking slate and demand

Markov Decision Process Model
• Decision Epochs: once a day, after today's demand has arrived but before any appointments take place
• State: the current ABP (w), the queue size (x), and today's demand (y)
• Actions: how many of today's demand to serve today (b); whether to change the current ABP (a)

Markov Decision Process Model
• Transitions:
  - The stochastic element is new demand, represented by the random variable D
  - The new queue size equals the current queue size (x) minus today's slate (x ∧ w, the lesser of the queue size and today's ABP allotment) plus any new demand not serviced today (y − b)
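A minimal sketch of this queue update, assuming x ∧ w denotes the minimum of x and w and (purely for illustration) that new demand D is Poisson; the function and parameter names are hypothetical, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def next_queue_size(x: int, w: int, y: int, b: int) -> int:
    """One-day queue transition: remove today's slate, add unserved new demand.

    x: current queue size (requests already booked in advance)
    w: daily allotment under the current Advanced Booking Policy (ABP)
    y: today's demand
    b: number of today's requests served today (0 <= b <= y)
    """
    todays_slate = min(x, w)            # x ∧ w: what the clinic works through today
    return x - todays_slate + (y - b)   # unserved new demand joins the queue

# Hypothetical example: queue of 15, ABP allotment of 11, 10 new requests, 4 served today
x_next = next_queue_size(x=15, w=11, y=10, b=4)   # -> 10
d_next = rng.poisson(lam=10)                      # tomorrow's demand (assumed Poisson)
```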

Markov Decision Process Model
• Costs/Rewards:
  - System-related: revenue, overtime, idle time
  - Patient-related: lead time
  - For switching the ABP
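The slide names the cost components without giving their functional form. The sketch below is one hedged guess at how a single-day reward might combine them; every name and coefficient (revenue_per_patient, c_overtime, c_idle, c_lead, c_switch) is an illustrative assumption, not the paper's actual cost function:

```python
def daily_reward(patients_seen: int, capacity: int, total_lead_days: int,
                 abp_changed: bool,
                 revenue_per_patient: float = 50.0,
                 c_overtime: float = 75.0, c_idle: float = 25.0,
                 c_lead: float = 5.0, c_switch: float = 10.0) -> float:
    """Hypothetical single-day reward: revenue minus overtime, idle-time,
    lead-time, and ABP-switching costs (all coefficients illustrative)."""
    overtime = max(patients_seen - capacity, 0)   # patients seen beyond regular capacity
    idle = max(capacity - patients_seen, 0)       # unused regular-hour slots
    return (revenue_per_patient * patients_seen
            - c_overtime * overtime
            - c_idle * idle
            - c_lead * total_lead_days            # patient-related: total appointment lead time
            - c_switch * (1 if abp_changed else 0))
```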

Bellman Equation
• Used a discounted (discount factor of 0.99), infinite-horizon model to avoid arbitrary terminal rewards
• Can be solved to optimality
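As a sketch (notation assumed, not taken from the paper), with state (w, x, y), action (a, b), one-day reward r, discount factor λ = 0.99, and new-demand distribution D, the Bellman equation would read roughly:

```latex
V(w, x, y) \;=\; \max_{a,\,b}\Big\{\, r\big((w,x,y),(a,b)\big)
   \;+\; \lambda \sum_{d \ge 0} \Pr(D = d)\,
   V\big(a,\; x - (x \wedge w) + (y - b),\; d\big) \Big\}
```

Because the horizon is infinite and λ < 1, this fixed point exists and can be computed to optimality, e.g. by value or policy iteration.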

Assumptions/Limitations
• Advance bookings are done on an FCFS basis
• Today's demand arrives before any booking decisions need to be made
• Service times are deterministic
• Show rate depends on the size of the queue at the time of service instead of at the time of booking
• Immediate changes to the ABP may mean that previous bookings need to be shifted
• Does not account for the fact that some bookings have to be made in advance

Clinic Types Considered

Six Scenarios for each Clinic Type
1. Base scenario: demand equal to capacity; show rate based on research by Gallucci; all requests can be serviced the same day
2. Demand > capacity
3. Demand < capacity
4. Some requests must be booked in advance
5. Same-day bookings given a show probability of 1
6. Show probability with a steeper decline

Performance Results
• Clinics #1, 2, 3: OA and the MDP policy result in almost identical profits; same-day access ranges from 89% to 100% (max lead time 1 day)
• Clinics #4, 5, 6: MDP slightly outperforms OA (by less than 2%); same-day access ranges from 84% to 100% (max lead time 2 days)
• Clinics #7, 8, 9: MDP vastly outperforms OA in all scenarios (by as much as 70%); same-day access ranges from 28% to 98% (max lead time 4 days)
• For all clinics, MDP provides a significant reduction in throughput variation and peak workload

Optimal Policy (base scenario, w = 11, x = 0) [chart: number of same-day bookings by day; values not legible in this transcript]

Performance Trends
• MDP performed best when demand was high (e.g., when demand > capacity and when the same-day show rate was guaranteed)
• MDP approaches OA as the lead-time cost increases
• The presence of revenue makes OA much more attractive
• The maximum booking window in any scenario tested was 4 days
• Even when revenue is present, MDP manages to perform as well as OA by sacrificing some throughput in order to reduce overtime and idle-time costs

Conclusion
• The model provides a booking policy that takes no-shows into account and reacts to congestion in the system
• Simulation results suggest that it achieves better results (same or higher objective, more predictable throughput) than open access, with minimal cost to the patient in terms of lead times
• Enhancements to the model are certainly possible, including stochastic service times, a transition to a continuous-time setting, and the possibility of a multi-doctor clinic
• Currently in discussion with a local clinic to build and test an enhanced model

Thank You!

Optimal Policy (base scenario, w = 11, x = 0) [chart: number of new requests given same-day service as a function of demand y = 0 … 11; values not legible in this transcript]

[Appendix tables: for each scenario (Increased Demand (Demand = 12), Show Rate with Same Day = 100%, Base Case, Show Rate with Steep Decline, Advanced Bookings, Decreased Demand (Demand = 8)) and each policy (OA, and MDP under three lead-time cost settings), the tables report average daily throughput (TH), overtime (OT), idle time (IT), cost/profit with its percent difference from OA, and the distribution of appointment lead times over 0–4 days; the individual numeric entries are not legible in this transcript.]