Previously
- Optimization
- Probability review
- Inventory models
- Markov decision processes
Agenda
- Hwk
- Projects
- Markov decision processes
- Queues
Markov Decision Processes (MDP)
- States i = 1, …, n
- Possible actions k in each state
- Reward R(i,k) for doing action k in state i
- Law of motion: P(j | i,k) = probability of moving from state i to state j after doing action k
MDP
- f(i) = largest expected current + future profit if currently in state i
- f(i,k) = largest expected current + future profit if currently in state i and action k is taken next
- f(i) = max_k f(i,k)
- f(i,k) = R(i,k) + Σ_j P(j|i,k) f(j)
- f(i) = max_k [ R(i,k) + Σ_j P(j|i,k) f(j) ]
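The Bellman recursion above can be solved by repeatedly applying it until f stops changing (value iteration). A minimal sketch, with made-up rewards and transition probabilities and a discount factor beta < 1 (not on the slide) added so the iteration converges:

```python
# Value iteration for f(i) = max_k [ R(i,k) + beta * sum_j P(j|i,k) f(j) ].
# R, P, and beta below are illustrative assumptions, not from the slides.
import numpy as np

beta = 0.9                                # discount factor (assumption)
R = np.array([[1.0, 2.0],                 # R[i, k]
              [0.5, 0.0]])
P = np.array([[[0.8, 0.2], [0.3, 0.7]],   # P[k, i, j] = P(j | i, k)
              [[0.1, 0.9], [0.6, 0.4]]])

f = np.zeros(2)
for _ in range(1000):
    # Q[k, i] = R(i,k) + beta * sum_j P(j|i,k) f(j)
    Q = R.T + beta * P @ f
    f_new = Q.max(axis=0)                 # best action in each state
    if np.max(np.abs(f_new - f)) < 1e-10:
        break
    f = f_new
print(f)                                  # approximate optimal values f(i)
```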
MDP as LP
- f(i) = max_k [ R(i,k) + Σ_j P(j|i,k) f(j) ]
- Idea: treat the f(i) as decision variables; a max of linear functions is piecewise linear
- min Σ_i f(i)
  s.t. f(i) ≥ R(i,k) + Σ_j P(j|i,k) f(j)  for all i, k
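This LP can be handed to any solver. A small sketch using scipy.optimize.linprog, with made-up data and a discount factor beta < 1 added (the slide's undiscounted LP can be unbounded over an infinite horizon):

```python
# Solve min sum_i f(i)  s.t.  f(i) >= R(i,k) + beta * sum_j P(j|i,k) f(j).
# Rewritten for linprog's A_ub x <= b_ub form:
#   -f(i) + beta * sum_j P(j|i,k) f(j) <= -R(i,k)
# R, P, and beta are illustrative assumptions, not from the slides.
import numpy as np
from scipy.optimize import linprog

n, m = 2, 2          # number of states, actions
beta = 0.9           # discount factor (assumption)
R = np.array([[1.0, 2.0],     # R[i, k]
              [0.5, 0.0]])
P = [np.array([[0.8, 0.2], [0.3, 0.7]]),   # P[k][i, j] = P(j | i, k)
     np.array([[0.1, 0.9], [0.6, 0.4]])]

c = np.ones(n)                # objective: min sum_i f(i)
A_ub, b_ub = [], []
for k in range(m):
    for i in range(n):
        A_ub.append(-np.eye(n)[i] + beta * P[k][i])
        b_ub.append(-R[i, k])
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n)
f = res.x                     # optimal value function
# Recover a policy: the action whose constraint is tight in each state.
policy = [max(range(m), key=lambda k: R[i, k] + beta * P[k][i] @ f)
          for i in range(n)]
print(f, policy)
```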
MDP Examples
- Breast cancer screening
- Stock options
- Airline ticket pricing
- Elevator scheduling
- Reservoir management
Queues (Ch 14)
- Queue = waiting line
- [Image: the "system"]
Examples
- Airport security
- Customer service line
- Checkout
- Doctor's office / ER
- Canada: scheduling operations
- Elevators
Performance Measures
- T = time in system
- T_q = waiting time (time in queue)
- N = # customers in system
- N_q = # customers in queue
- W = E[T], W_q = E[T_q]
- L = E[N], L_q = E[N_q]
- Utilization = fraction of time servers are busy
[Diagram: arrivals → queue → servers → departures]
Randomness is Key
- Arrivals every 15 min (deterministic, not random)
- Processing times random with mean of 13 min (exponential random variable)
- [Plot: waiting time vs. customer #]
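The slide's experiment can be reproduced with a short simulation: even though the server keeps up on average (13 < 15), randomness in service times alone creates waiting. A sketch using the standard Lindley recursion W[n+1] = max(0, W[n] + S[n] - A), where A is the fixed interarrival time and S[n] the random service time (the recursion itself is an assumption; the slide only shows the resulting plot):

```python
# Customers arrive exactly every 15 min; service times are exponential
# with mean 13 min. Track each customer's waiting time in queue.
import random

random.seed(42)
A = 15.0                 # deterministic interarrival time (min)
mean_service = 13.0      # mean exponential service time (min)

w = 0.0                  # waiting time of the current customer
waits = []
for n in range(10_000):
    waits.append(w)
    s = random.expovariate(1.0 / mean_service)   # draw a service time
    w = max(0.0, w + s - A)                      # Lindley recursion

print(f"average wait: {sum(waits) / len(waits):.1f} min, "
      f"max wait: {max(waits):.1f} min")
```

Plotting `waits` against customer number reproduces the slide's picture: long stretches near zero punctuated by bursts of congestion.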