Chapter 19 Markov Decision Processes
© 2015 McGraw-Hill Education. All rights reserved.
Introduction
Stochastic processes
–Evolve over time in a probabilistic manner
Markov chain
–A special type of stochastic process
–Special property: how the process will evolve in the future depends only on the current state, independent of past events
–May be continuous-time or discrete-time
Introduction
Transition matrix
–Gives the probabilities for what the state will be next time
Many important systems can be modeled as a discrete-time or continuous-time Markov chain
This chapter focuses on:
–How to design a discrete-time Markov chain for optimal performance
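The role of a transition matrix can be illustrated with a small sketch (the matrix below is hypothetical, not taken from this chapter's example):

```python
import numpy as np

# A transition matrix for a 3-state discrete-time Markov chain.
# Entry P[i, j] is the probability that the process moves from
# state i to state j in one step, so each row sums to 1.
# (These values are hypothetical.)
P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.0, 0.4, 0.6],
])

# Markovian property: the distribution of the next state depends
# only on the current state. If pi_t is the current state
# distribution (a row vector), the distribution one step later
# is pi_t @ P.
pi_0 = np.array([1.0, 0.0, 0.0])   # start in state 0 with certainty
pi_1 = pi_0 @ P                    # -> [0.7, 0.2, 0.1]
pi_2 = pi_1 @ P                    # approximately [0.55, 0.28, 0.17]
print(pi_1, pi_2)
```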
19.1 A Prototype Example
A manufacturer has one key machine
–The machine deteriorates rapidly in quality and output
–An end-of-week inspection classifies the machine's state
A Prototype Example
Transition matrix
–Created by analyzing historical data
–Shows the relative frequency of transitions from the state in one week to the state in the following week
A Prototype Example
State 3 is an absorbing state
–Once the machine becomes inoperable, it remains inoperable until it is replaced
Replacement process
–Takes one week to complete
–Lost profit of $2,000 during that week
–Cost of replacing the machine is $4,000
A Prototype Example
Costs incurred per week by the machine in states other than state 3
A Prototype Example
Transition matrix describing the state of the system
A Prototype Example
Expected average cost per unit time
–A widely used performance measure for Markov chains
–Calculating it requires the steady-state probabilities
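Steady-state probabilities solve pi P = pi together with the normalization that the probabilities sum to 1. A minimal sketch, using the transition probabilities that the standard version of this example assigns to the policy of replacing only when the machine is inoperable (these probabilities are assumptions here and should be checked against the chapter's own table):

```python
import numpy as np

# One-week transition matrix when the machine is replaced only once
# it reaches state 3 (inoperable). Probabilities are assumed from
# the standard version of this example.
P = np.array([
    [0, 7/8, 1/16, 1/16],   # state 0: good as new
    [0, 3/4, 1/8,  1/8 ],   # state 1: minor deterioration
    [0, 0,   1/2,  1/2 ],   # state 2: major deterioration
    [1, 0,   0,    0   ],   # state 3: inoperable -> replaced, back to 0
])

# Solve pi @ P = pi with sum(pi) = 1: transpose to (P.T - I) pi = 0,
# drop one redundant balance equation, and append the normalization.
n = P.shape[0]
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
print(pi)   # approximately [2/13, 7/13, 2/13, 2/13]
```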
A Prototype Example
Potential maintenance policies
–Do nothing
–Overhaul
–Replace
Do-nothing maintenance policy
–Expected average cost per week
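Given the steady-state probabilities, the expected average cost per week is the cost vector weighted by those probabilities. A sketch for the policy of replacing only when the machine is inoperable, with the steady-state distribution and weekly costs assumed from the standard version of this example:

```python
import numpy as np

# Steady-state probabilities for states 0..3 under the policy of
# replacing only when the machine is inoperable (assumed figures).
pi = np.array([2/13, 7/13, 2/13, 2/13])

# Assumed weekly costs: $0 (good as new), $1,000 (minor
# deterioration), $3,000 (major deterioration), and $6,000 in the
# replacement week ($4,000 replacement + $2,000 lost profit).
costs = np.array([0, 1000, 3000, 6000])

# Expected average cost per week = sum_i pi_i * C_i.
avg_cost = pi @ costs
print(round(avg_cost, 2))   # -> 1923.08, i.e., 25000/13
```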
A Prototype Example
19.2 A Model for Markov Decision Processes
A Model for Markov Decision Processes
A Model for Markov Decision Processes
A policy is stationary if
–Whenever the system is in state i, the rule for making the decision is always the same, regardless of the time period
A policy is deterministic if
–Whenever the system is in state i, the rule definitely chooses one particular decision
A Model for Markov Decision Processes
Solving the prototype example by exhaustive enumeration
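Exhaustive enumeration can be sketched as: list every feasible deterministic policy, compute each one's steady-state distribution, and keep the policy with the smallest expected average cost. The transition rows, feasible decisions, and costs below are assumed from the standard version of this example and should be verified against the chapter:

```python
import itertools
import numpy as np

# Decisions: 1 = do nothing, 2 = overhaul (machine returns to
# state 1), 3 = replace (machine returns to state 0). All numbers
# below are assumed from the standard version of this example.
DO_NOTHING = {0: [0, 7/8, 1/16, 1/16],
              1: [0, 3/4, 1/8,  1/8],
              2: [0, 0,   1/2,  1/2]}
ALLOWED = {0: [1], 1: [1, 3], 2: [1, 2, 3], 3: [3]}

def transition_row(state, decision):
    if decision == 1:
        return DO_NOTHING[state]
    return [0, 1, 0, 0] if decision == 2 else [1, 0, 0, 0]

def weekly_cost(state, decision):
    if decision == 1:
        return {0: 0, 1: 1000, 2: 3000}[state]
    return 4000 if decision == 2 else 6000   # overhaul / replace

def steady_state(P):
    # Solve pi @ P = pi with sum(pi) = 1.
    n = P.shape[0]
    A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

best_policy, best_cost = None, float("inf")
for policy in itertools.product(*(ALLOWED[i] for i in range(4))):
    P = np.array([transition_row(i, k) for i, k in enumerate(policy)])
    pi = steady_state(P)
    cost = sum(pi[i] * weekly_cost(i, k) for i, k in enumerate(policy))
    if cost < best_cost:
        best_policy, best_cost = policy, cost

# Best: do nothing in states 0 and 1, overhaul in 2, replace in 3.
print(best_policy, round(best_cost, 2))   # -> (1, 1, 2, 3) 1666.67
```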
19.3 Linear Programming and Optimal Policies
A policy R can also be characterized by a matrix of values D_ik, each set to either zero or one
Linear Programming and Optimal Policies
The optimal policy R_b
Linear programming formulation issue
–The D_ik values are integers
–Linear programming requires continuous variables
Linear Programming and Optimal Policies
Solution: redefine the D_ik as probabilities
–The resulting policy, which involves probability distributions over decisions, is called a randomized policy
Example of a randomized policy
Linear Programming and Optimal Policies
A linear programming formulation
Linear Programming and Optimal Policies
Constraints on the y_ik
Solve using the simplex method
–Once the y_ik are found, each D_ik is found from D_ik = y_ik / Σ_k y_ik
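The whole formulation can be sketched with `scipy.optimize.linprog`: minimize the total cost Σ C_ik y_ik subject to the steady-state balance constraints and Σ y_ik = 1, then recover each D_ik as y_ik / Σ_k y_ik. The costs and transition probabilities below are assumed from the standard version of this example:

```python
import numpy as np
from scipy.optimize import linprog

# Decisions: 1 = do nothing, 2 = overhaul (-> state 1),
# 3 = replace (-> state 0). Numbers assumed from the standard
# version of this example.
DO_NOTHING = {0: [0, 7/8, 1/16, 1/16],
              1: [0, 3/4, 1/8,  1/8],
              2: [0, 0,   1/2,  1/2]}
ALLOWED = {0: [1], 1: [1, 3], 2: [1, 2, 3], 3: [3]}

def p_row(i, k):
    if k == 1:
        return DO_NOTHING[i]
    return [0, 1, 0, 0] if k == 2 else [1, 0, 0, 0]

def cost(i, k):
    if k == 1:
        return {0: 0, 1: 1000, 2: 3000}[i]
    return 4000 if k == 2 else 6000

pairs = [(i, k) for i in range(4) for k in ALLOWED[i]]
c = [cost(i, k) for i, k in pairs]

# Balance constraints: sum_k y_jk - sum_{i,k} p_ij(k) y_ik = 0 for
# each state j (one is redundant and dropped), plus sum y_ik = 1.
A_eq = [[(1.0 if i == j else 0.0) - p_row(i, k)[j] for i, k in pairs]
        for j in range(1, 4)]
b_eq = [0.0, 0.0, 0.0]
A_eq.append([1.0] * len(pairs))
b_eq.append(1.0)

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(pairs))
y = dict(zip(pairs, res.x))

# D_ik = y_ik / sum_k y_ik; the optimum is deterministic, so for
# each state one decision carries all the probability mass.
policy = {i: max(ALLOWED[i], key=lambda k: y[(i, k)]) for i in range(4)}
print(policy, round(res.fun, 2))   # -> {0: 1, 1: 1, 2: 2, 3: 3} 1666.67
```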
Linear Programming and Optimal Policies
Key conclusion
–The optimal policy found by the simplex method is deterministic rather than randomized
Solving the prototype example by linear programming
–See the model given on the next slide
Linear Programming and Optimal Policies
Linear Programming and Optimal Policies
Applying the simplex method, the resulting optimal solution is:
–Leave the machine as is if it is in state 0 or 1
–Overhaul the machine if it is in state 2
–Replace the machine if it is in state 3
19.4 Conclusions
Markov decision process
–A powerful tool for optimizing the performance of discrete-time Markov chain processes
Common objective
–Find a policy, specifying a decision for each state of the system, that minimizes the expected average cost per unit time
Solution methods
–Exhaustive enumeration and linear programming