Download presentation
Presentation is loading. Please wait.
Published byRachel Tucker Modified over 9 years ago
1
Planning and Execution with Phase Transitions Håkan L. S. Younes Carnegie Mellon University Follow-up paper to Younes & Simmons’ “Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions” (AAAI’04)
2
YounesPlanning and Execution with Phase Transitions2 Planning with Time and Probability Temporal uncertainty When to schedule service? WorkingFailedServiced Service F 1 Return F 2 Fail F 3 Memoryless distributions MDP
3
YounesPlanning and Execution with Phase Transitions3 The Role of Phases Memoryless property of exponential distribution makes MDPs tractable Many phenomena are not memoryless Lifetime of product or computer process Phases introduce memory into state space Modeling tool—not part of real world (phases are hidden variables)
4
YounesPlanning and Execution with Phase Transitions4 Phase-Type Distributions Time to absorption in Markov chain with n transient states and one absorbing state 1 Exponential 21 p 1 (1 – p) 1 2 Two-phase Coxian n21 n -phase Erlang
5
YounesPlanning and Execution with Phase Transitions5 Phase-Type Approximation F(x)F(x) 1 x 24681012 0 0 Weibull(8,4.5) Erlang(7.3/8,8) Erlang(7.3/4,4) Exponential(1/7.3)
6
YounesPlanning and Execution with Phase Transitions6 Solving MDPs with Phase Transitions Solve MDP as if phases were observable Maintain belief distribution over phase configurations during execution AAAI’04 paper: Simulate phase transitions Use Q MDP value method to select actions OK, because there are typically no actions that are useful for observing phases
7
YounesPlanning and Execution with Phase Transitions7 Factored Event Model Asynchronous events and actions Each event/action has: Enabling condition e (Phase-type) distribution G e governing the time e must remain enabled before it triggers A distribution p e (s′; s) determining the probability that the next state is s′ if e triggers in state s
8
YounesPlanning and Execution with Phase Transitions8 Event Example Computer cluster with crash event and reboot action for each computer Crash event for computer i : i = up i ; p(s′; s) = 1 iff s up i and s′ up i Reboot action for computer i: i = up i ; p(s′; s) = 1 iff s up i and s′ up i
9
YounesPlanning and Execution with Phase Transitions9 Event Example Cluster with two computers t = 0t = 0.6t = 0.9 crash 2 up 1 up 2 up 1 up 2 up 1 up 2 crash 1
10
YounesPlanning and Execution with Phase Transitions10 Exploiting Structure Value iteration with ADDs: Factored state representation means that some variable assignments may be invalid Phase is one for disabled events Binary encoding of phases with n ≠ 2 k Value iteration with state filter f :
11
YounesPlanning and Execution with Phase Transitions11 Phase Tracking Infer phase configurations from observable features of the environment Physical state of process Current time Transient analysis for Markov chains The probability of being in state s at time t
12
YounesPlanning and Execution with Phase Transitions12 Phase Tracking Belief distribution changes continuously Use fixed update interval Independent phase tracking for each event Q matrix is n n for n -phase distribution Phase tracking is O(n) for Erlang distribution
13
YounesPlanning and Execution with Phase Transitions13 The Foreman’s Dilemma When to enable “Service” action in “Working” state? Working c = 1 Failed c = 0 Serviced c = –0.1 Service Exp(10) Return Exp(1) Fail G
14
YounesPlanning and Execution with Phase Transitions14 Foreman’s Dilemma: Policy Performance n = 8, = 0.5 n = 4, = 0.5 n = 8, simulate n = 4, simulate SMDP x Percent of optimal 0510152025303540 50 60 70 80 90 100 Failure-time distribution: W(1.6x,4.5)
15
YounesPlanning and Execution with Phase Transitions15 Foreman’s Dilemma: Enabling Time for Action optimal n = 8, = 0.5 n = 8, simulate SMDP x 0510152025303540 Service enabling time 20 0 10 30
16
YounesPlanning and Execution with Phase Transitions16 System Administration Network of n machines Reward rate c(s) = k in states where k machines are up One crash event and one reboot action per machine At most one action enabled at any time (single agent)
17
YounesPlanning and Execution with Phase Transitions17 System Administration: Policy Performance n = 4, = n = 2, = n = 4, simulate n = 2, simulate n = 1 3 Expected discounted reward 40 245678910 Number of machines 20 25 30 35 45 Reboot-time distribution: U(0,1)
18
YounesPlanning and Execution with Phase Transitions18 System Administration: Planning Time no filter, n = 4 pre-filter, n = 5 pre-filter, n = 4 filter, n = 5 filter, n = 4 no filter, n = 5 3245678910 Number of machines Planning time 10 0 10 –1 10 1 10 2 10 3 10 4
19
YounesPlanning and Execution with Phase Transitions19 Discussion Phase-type approximations add state variables, but structure can be exploited Use approximate solution methods (e.g., Guestrin et al., JAIR 19) Improved phase tracking using transient analysis instead of phase simulation Recent CTBN work with phase-type dist.: modeling—not planning
20
YounesPlanning and Execution with Phase Transitions20 Tempastic-DTP A tool for GSMDP planning: http://sweden.autonomy.ri.cmu.edu/tempastic/
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.