A Hierarchical Approach to Model-based Reactive Planning in Large State Spaces Artificial Intelligence & Space Systems Laboratories Massachusetts Institute of Technology Brian C. Williams Joint with Seung H. Chung
Outline
– Model-based programming
– A simple model-based executive (Livingstone)
– The need for model-based reactive planning
– The Burton model-based reactive planner
Objective: embedded languages that reason from hardware models (Reactive Model-based Programming).
Mars Mission Failures, 2000: Climate Orbiter, Polar Lander.
Polar Lander, leading diagnosis:
– Legs deployed during descent.
– Noise spike on leg sensors latched by software monitors.
– Laser altimeter registers 50 ft.
– Begins polling leg monitors to determine touchdown.
– Latched noise spike read as touchdown.
– Engine shutdown at ~50 ft.
Model-based Programs Interact Directly with State
Embedded programs interact with plant sensors and actuators: read sensors, set actuators.
Model-based programs interact with plant state: read state, write state.
Problem: the programmer must map between state and sensors/actuators.
Solution: a model-based executive maps between state and sensors/actuators.
[Figure: an embedded program exchanges observations (Obs) and control (Cntrl) with the plant state S; a model-based embedded program instead reads and writes a state S’ maintained by the model-based executive, which handles Obs and Cntrl.]
Orbital Insertion Example: turn the camera off and the engine on.
[Figure: spacecraft with Engine A, Engine B, and Science Camera, shown twice.]
Model-based Program Evolves Hidden State
Programmer specifies abstract state evolutions:
    OrbitInsert()::
    (do-watching ((EngineA = Firing) OR (EngineB = Firing))
      (parallel
        (EngineA = Standby)
        (EngineB = Standby)
        (Camera = Off)
        (do-watching (EngineA = Failed)
          (when-donext ((EngineA = Standby) AND (Camera = Off))
            (EngineA = Firing)))
        (when-donext ((EngineA = Failed) AND (EngineB = Standby) AND (Camera = Off))
          (EngineB = Firing))))
Reactive Model-based Programming Language: asserts state, queries state, executes conditionally, preempts, iterates, executes concurrently.
Programmer specifies plant model. The model specifies mode transitions and mode behavior.
[Figure: valve model with modes Closed, Open, Stuck-open, and Stuck-closed; Open and Close commands; constraint inflow = outflow = 0.]
[Figure: executive architecture. The model-based program and temporal planner pass state goals (Thrust Goals, Attitude Point(a), Engine Off, Delta_V(direction=b, magnitude=200), Power) to the model-based executive, which uses the model, receives observations, and sends command goals to the flight system control / RT control layer.]
Model-based Executive Reasons from Plant Model
[Figure: same executive architecture; inside the executive, Estimate & Diagnose turns observations into state estimates, and Reconfigure & Repair turns state goals and state estimates into commands.]
Example 1. Goal: achieve thrust. Current state: engine off. Response: open four valves.
Example 2. Goal: achieve thrust. Diagnose: a valve fails stuck closed. Response: switch to backup.
A simple model-based executive (Livingstone) commanded NASA’s Deep Space One probe (image courtesy NASA JPL). Started: January 1996. Launch: October 15th, 1998. Remote Agent Experiment: May, 1999.
Livingstone [Williams & Nayak, AAAI96]
[Figure: Mode Estimation takes observations and the model and produces a state estimate (estimate current likely modes). Mode Reconfiguration, also labeled Mode Selection, takes state goals such as Thrust plus the state estimate and produces commands (reconfigure modes to meet goals). Both sit above the flight system control / RT control layer.]
Mode Estimation: select a most likely set of component mode transitions that is consistent with the model and observations:
$\arg\max_{m'} P_t(m')$ s.t. $M(m') \wedge O(m')$ is consistent
Mode Selection: select a least-cost set of allowed component modes that entails the current goal and is consistent:
$\arg\min_{m'} C_t(m')$ s.t. $M(m')$ entails $G(m')$ and $M(m')$ is consistent
Both problems are instances of OpSat:
$\arg\min_x f(x)$ s.t. $C(x)$ is satisfiable and $D(x)$ is unsatisfiable
Mode Estimation: $\arg\max_{m'} P_t(m')$ s.t. $M(m') \wedge O(m')$ is satisfiable
Mode Selection: $\arg\min_{m'} C_t(m')$ s.t. $M(m')$ entails $G(m')$ and $M(m')$ is satisfiable
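As a rough illustration of the mode-estimation objective, the sketch below enumerates every mode assignment and keeps the most probable one that passes a consistency check. It is exhaustive enumeration, not the OpSat procedure itself, and the two-valve plant, its probabilities, and the names `mode_estimate` and `no_flow_observed` are assumptions made for the example.

```python
from itertools import product

def mode_estimate(mode_probs, consistent):
    """Return (assignment, probability) of the likeliest consistent mode assignment."""
    names = sorted(mode_probs)
    best, best_p = None, 0.0
    for combo in product(*(mode_probs[n].items() for n in names)):
        assignment = {n: mode for n, (mode, _) in zip(names, combo)}
        p = 1.0
        for _, prob in combo:
            p *= prob  # components are assumed to fail independently
        if p > best_p and consistent(assignment):
            best, best_p = assignment, p
    return best, best_p

# Hypothetical plant: two valves in series, both commanded open.
MODE_PROBS = {
    "v1": {"open": 0.98, "stuck_closed": 0.02},
    "v2": {"open": 0.99, "stuck_closed": 0.01},
}

def no_flow_observed(assignment):
    """Model plus observation: no flow is seen, so some valve must be stuck closed."""
    return any(mode == "stuck_closed" for mode in assignment.values())

ESTIMATE, PROBABILITY = mode_estimate(MODE_PROBS, no_flow_observed)
```

Here the estimate blames `v1`, because a single `v1` fault (0.02 × 0.99) is more likely than a single `v2` fault (0.98 × 0.01).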
DS1 Attitude Control System
[Figure: flight computers and bus controllers (BC) send commands and data over the 1553 bus to the PDE, PDU, GDE, SRU, PASM, DSEU, and PEPE units; N2H4 and He tanks feed the z-facing and x-facing thrusters.]
Livingstone reconfigured modes using one-step commands. But how does the flight computer really open a valve? It requires turning on device drivers, repairing bus controllers, sending commands, powering down devices, ...
How do we reconfigure a valve?
[Figure: Computer, Bus Control, Remote Terminals, Driver, Valve.]
– Device modes are changed through indirect commanding.
– Communication paths are established by reconfiguring other devices.
– The task of reconfiguring devices in the proper order generalizes state-space planning to handle indirect effects.
– To achieve reactivity, all possible plans for all possible goal states should be pre-compiled (a generalization of universal plans).
– To achieve compactness, we decompose these universal plans according to a goal/sub-goal hierarchy.
Model-based Execution & Reactive Planning: Burton [Williams & Nayak, IJCAI97]
[Figure: the Livingstone architecture with a Reactive Planner added between Mode Selection and the commands sent to the flight system control / RT control layer; Mode Estimation still turns observations into the state estimate.]
Example: Driver Valve Command Sequence
[Figure: commands enter the valve driver (dr) on dcmdin; the driver commands the valve (vlv) on vcmdin.]
Goal: No thrust. (ME = Mode Estimation, MS = Mode Selection, MRP = model-based reactive planner.)
ME:  dr = off, vlv = open
MS:  dr = off, vlv = closed
MRP: dcmdin = on
ME:  dr = on, vlv = open
MRP: dcmdin = close
ME:  dr = reset failure, vlv = open
MRP: dcmdin = reset
ME:  dr = on, vlv = open
MRP: dcmdin = off
ME:  dr = off, vlv = open
Model-based Reactive Planning & Execution
– Limitation of configuration management
– Reactive Planning: model compilation, reversible planning, constructing hierarchical policies
– Execution
To achieve reactivity we eliminate all forms of search.
Model-based Reactive Planning is achieved by:
1. Eliminating indirect control, through compilation.
2. Eliminating search for goal ordering, through reversibility and serialization.
3. Eliminating search to find suitable transitions, by constructing hierarchical policies.
To Handle Indirect Control, Compile Out Constraints
[Figure, before compilation. Driver: modes off, on, failed, resettable; transitions labeled dcmd in = on, dcmd in = off, dcmd in = reset; constraint dcmd in = dcmd out. Valve: modes closed, open, stuck closed, stuck open; transitions labeled vcmd in = open and vcmd in = close; constraint inflow = outflow. Connection: dcmd out = vcmd in.]
[Figure, after compilation. The driver is unchanged. The valve transitions are now guarded by driver = on together with dcmd in = open or dcmd in = close; the intermediate variables vcmd in and dcmd out and the flow constraints no longer appear.]
To Compile Out Constraints
– Eliminate intermediate variables.
– Transitions are conditioned on mode and control variables.
– Generate transitions as prime implicates: $\Phi_i \Rightarrow \mathrm{next}(y_i = e_i)$, where $\Phi_i$ is a conjunction of mode and control variable assignments.
– Prime implicates for transitions are enumerated using OpSat: 40 seconds on a SPARC 20 for a 12,000-clause spacecraft model.
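The effect of compilation on the valve can be sketched by hand-substituting the intermediate variable. This is only an illustration: the slides enumerate prime implicates with OpSat, whereas the hypothetical `compile_out` below simply rewrites each guard using a supplied definition of `vcmd_in` in terms of `driver` and `dcmd_in`.

```python
def compile_out(transitions, definitions):
    """Rewrite guards so that intermediate variables are replaced by their definitions."""
    compiled = []
    for source, guard, target in transitions:
        new_guard = {}
        for var, val in guard.items():
            if var in definitions:
                # Replace the intermediate variable by mode/control assignments.
                new_guard.update(definitions[var](val))
            else:
                new_guard[var] = val
        compiled.append((source, new_guard, target))
    return compiled

# Valve transitions before compilation, guarded by the intermediate variable vcmd_in.
VALVE = [
    ("closed", {"vcmd_in": "open"}, "open"),
    ("open", {"vcmd_in": "close"}, "closed"),
]

# Assumed definition: vcmd_in takes value x exactly when driver = on and dcmd_in = x.
DEFINITIONS = {"vcmd_in": lambda val: {"driver": "on", "dcmd_in": val}}

COMPILED_VALVE = compile_out(VALVE, DEFINITIONS)
```

After the rewrite, every valve guard mentions only a mode variable (`driver`) and a control variable (`dcmd_in`), which is the form the planner relies on.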
Model-based Reactive Planning is achieved by:
1. Eliminating indirect effects, through compilation.
2. Eliminating search for goal ordering, through reversibility and serialization.
3. Eliminating search to find suitable transitions, by constructing hierarchical policies.
Why Search is Needed
1) An achieved goal can be clobbered by a subsequent goal.
Example (Valve, Driver, command):
– Current state: driver = on, valve = closed
– Goal state: driver = off, valve = open
– Achieving (driver = off) and then (valve = open) clobbers (driver = off).
Achieve the Valve goal before the Driver goal.
Note: component schematics tend not to have loops.
[Figure: Computer, Bus Control, Remote Terminals, Driver, Valve.]
– Work conjunctive goals upstream, from outputs to inputs.
– Define: the causal graph G of a compiled transition system S. Vertices are state variables. There is an edge from v_i to v_j if v_j’s transition is conditioned on v_i. Example: dcmd in → Driver → Valve.
– Requirement: the causal graph is acyclic.
Solution
The only variables used to set some variable (y_7) are its ancestors, so y_7 can be changed without affecting its descendants.
[Figure: causal graph with the nodes used to set y_7 marked Affected and the remaining nodes marked Unaffected.]
It is safe to achieve goals in an upstream order. Simple check:
– Number the causal graph depth first.
– Achieve goals in order of increasing depth-first number.
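One way to realize the numbering is a post-order depth-first traversal that follows edges from cause to effect, so a variable is numbered only after everything downstream of it. The sketch below assumes that reading (it gives Valve the lowest number, matching the advice to achieve the Valve goal before the Driver goal); the function name `depth_first_number` and the variable names are illustrative.

```python
def depth_first_number(causal_edges):
    """Post-order DFS numbering; raises ValueError if the causal graph has a cycle."""
    children = {}
    for cause, effect in causal_edges:
        children.setdefault(cause, []).append(effect)
        children.setdefault(effect, [])
    number, state = {}, {}  # state: 1 = on the DFS stack, 2 = finished

    def visit(v):
        if state.get(v) == 1:
            raise ValueError(f"causal graph has a cycle through {v!r}")
        if state.get(v) == 2:
            return
        state[v] = 1
        for child in children[v]:
            visit(child)
        state[v] = 2
        number[v] = len(number) + 1  # numbered after all descendants

    for v in sorted(children):
        visit(v)
    return number

# Edge (a, b) means b's transitions are conditioned on a.
EDGES = [("dcmd_in", "driver"), ("driver", "valve")]
NUMBER = depth_first_number(EDGES)

# Achieve conjunctive goals in order of increasing depth-first number.
GOAL_ORDER = sorted(["driver", "valve"], key=NUMBER.get)
```

The same traversal doubles as the acyclicity check: a back edge to a variable still on the stack means the requirement on the causal graph is violated.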
Why Search is Needed
2) Two goals can compete for the same variable in their subgoals.
Example: Latch1 and Latch2 compete for the position of Switch if achieved concurrently.
[Figure: data feeds Switch, whose positions 1 and 2 lead to Latch1 and Latch2.]
Sibling goals (7, 4) may both need shared ancestors, but the ancestors are no longer needed once goal (7) is satisfied.
[Figure: causal graph with nodes marked Shared, Not Shared, and Unaffected.]
Solution: solve one goal before starting the next sibling (serialization).
Feature: generates the first control action of the plan first!
Why Search is Needed
3) A state transition of a subgoal variable has an irreversible effect.
Example: assume Switch can be used only once; then Latch1 must be latched before Latch2.
[Figure: data feeds Switch, whose positions 1 and 2 lead to Latch1 and Latch2.]
But irreversible effects aren’t desirable for reactive planners: don’t allow irreversible actions, except to repair failure modes.
Solution: Mark Allowed Transitions/Assignments
1. Mark all control variable assignments allowed.
2. For each mode variable v, in decreasing order of DF number:
   – Select each transition of v whose guard has only allowed assignments.
   – Given the current assignment v = I, find the strongly connected component (SCC) of the selected transitions that contains I.
   – Mark the assignments and transitions in that SCC allowed.
Example (causal graph dcmd in → Driver → Valve):
– Driver: the transitions between off and on (dcmd in = on, dcmd in = off) are kept; the failed and resettable modes and the dcmd in = reset transition are dropped.
– Valve: the transitions between closed and open (driver = on with dcmd in = open or dcmd in = close) are kept; stuck closed and stuck open are dropped.
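The marking procedure can be sketched as follows. The automata are a best-effort transcription of the slide figures (treat the exact transition set as an assumption), the SCC containing the current mode is computed as the intersection of forward and backward reachability, and the exception that permits repair transitions out of failure modes is deliberately left out.

```python
def reachable(start, adjacency):
    """Set of nodes reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def mark_allowed(automata, order, current, control_assignments):
    """order lists mode variables by decreasing depth-first number."""
    allowed = set(control_assignments)
    allowed_transitions = {}
    for var in order:
        # Keep transitions whose guard uses only allowed assignments.
        selected = [t for t in automata[var] if all(a in allowed for a in t[1])]
        fwd, bwd = {}, {}
        for src, _, dst in selected:
            fwd.setdefault(src, []).append(dst)
            bwd.setdefault(dst, []).append(src)
        # SCC containing the current mode: reachable both to and from it.
        scc = reachable(current[var], fwd) & reachable(current[var], bwd)
        allowed |= {(var, mode) for mode in scc}
        allowed_transitions[var] = [t for t in selected if t[0] in scc and t[2] in scc]
    return allowed, allowed_transitions

# Transitions are (source mode, guard, target mode); guards are tuples of assignments.
AUTOMATA = {
    "driver": [
        ("off", (("dcmd_in", "on"),), "on"),
        ("on", (("dcmd_in", "off"),), "off"),
        ("resettable", (("dcmd_in", "reset"),), "on"),
        ("resettable", (("dcmd_in", "off"),), "off"),
    ],
    "valve": [
        ("closed", (("driver", "on"), ("dcmd_in", "open")), "open"),
        ("open", (("driver", "on"), ("dcmd_in", "close")), "closed"),
    ],
}
CONTROL = {("dcmd_in", c) for c in ("on", "off", "reset", "open", "close")}
ALLOWED, ALLOWED_TRANSITIONS = mark_allowed(
    AUTOMATA, ["driver", "valve"], {"driver": "on", "valve": "closed"}, CONTROL
)
```

Processing the driver before the valve matters: the valve guards mention `driver = on`, which becomes allowed only once the driver's SCC has been marked.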
Model-based Reactive Planning is achieved by:
1. Eliminating indirect effects, through compilation.
2. Eliminating search for goal ordering, through reversibility and serialization.
3. Eliminating search to find suitable transitions, by constructing hierarchical policies.
Solution: convert automata into hierarchical policies, one per automaton.
[Figure: valve automaton with closed → open on driver = on, cmd = open and open → closed on driver = on, cmd = close.]
Valve policy (rows: current mode; columns: goal mode):
Current \ Goal | Open                    | Closed
Open           | idle                    | driver = on, cmd = close
Closed         | driver = on, cmd = open | idle
Stuck          | fail                    | fail
– The policy selects the first transition towards achieving each automaton goal state, given the current state.
– The policy maps goals to subgoals and commands, in the proper order.
– It ensures that only reversible transitions are taken, by using only transitions marked allowed.
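A policy table of this kind can be derived from an automaton's allowed transitions by searching backward from each goal mode and recording, for every mode that can reach the goal, the guard of the first transition on a shortest path. The sketch below is one such construction under that assumption, not necessarily the one used in Burton; `build_policy` and the mode names are illustrative.

```python
from collections import deque

def build_policy(modes, transitions):
    """Map (current, goal) to 'idle', 'fail', or the guard of the first transition."""
    policy = {}
    for goal in modes:
        first_step = {goal: "idle"}
        queue = deque([goal])
        while queue:  # breadth-first search backward from the goal
            node = queue.popleft()
            for src, guard, dst in transitions:
                if dst == node and src not in first_step:
                    first_step[src] = guard
                    queue.append(src)
        for mode in modes:
            policy[(mode, goal)] = first_step.get(mode, "fail")
    return policy

# Allowed valve transitions after compilation (guards name sub-goals and commands).
VALVE_TRANSITIONS = [
    ("closed", {"driver": "on", "cmd": "open"}, "open"),
    ("open", {"driver": "on", "cmd": "close"}, "closed"),
]
VALVE_POLICY = build_policy(["open", "closed", "stuck"], VALVE_TRANSITIONS)
```

Because the table is built ahead of time, choosing the next step at run time is a single lookup on the pair (current mode, goal mode).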
Plan by passing sub-goals up the causal graph
Goal: Driver = off, Valve = closed. Goals are worked in depth-first order: Valve (1), then Driver (2).
Valve policy (rows: current mode; columns: goal mode):
Current \ Goal | Open                    | Closed
Open           | idle                    | driver = on, cmd = close
Closed         | driver = on, cmd = open | idle
Stuck          | fail                    | fail
Driver policy (rows: current mode; columns: goal mode):
Current \ Goal | On          | Off
On             | idle        | cmd = off
Off            | cmd = on    | idle
Resettable     | cmd = reset | cmd = off
Trace:
1. Current: Driver = off, Valve = open. The valve goal needs driver = on. Send: cmd = on.
2. Current: Driver = resettable, Valve = open (the driver has entered a failure mode). Send: cmd = reset.
3. Current: Driver = on, Valve = open. Send: cmd = close.
4. Current: Driver = on, Valve = closed. The valve goal is met; work the driver goal. Send: cmd = off.
5. Current: Driver = off, Valve = closed. Success.
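The trace above can be reproduced with a few lines of table lookup. In this sketch the policy entries are transcribed from the slides, each entry pairs the sub-goals on upstream variables with the command to send, and the observed states are replayed from the slides rather than simulated; `next_command` and `resolve` are hypothetical names.

```python
# (current, goal) -> (sub-goals on upstream variables, command), or "fail".
POLICIES = {
    "valve": {
        ("open", "closed"): ([("driver", "on")], "close"),
        ("closed", "open"): ([("driver", "on")], "open"),
        ("stuck", "open"): "fail",
        ("stuck", "closed"): "fail",
    },
    "driver": {
        ("off", "on"): ([], "on"),
        ("on", "off"): ([], "off"),
        ("resettable", "on"): ([], "reset"),
        ("resettable", "off"): ([], "off"),
    },
}
ORDER = ["valve", "driver"]  # increasing depth-first number

def resolve(var, goal, state):
    """Return the command that makes progress on var = goal, recursing on sub-goals."""
    step = POLICIES[var][(state[var], goal)]
    if step == "fail":
        raise RuntimeError(f"no allowed path to {var} = {goal}")
    subgoals, command = step
    for sub_var, sub_goal in subgoals:
        if state[sub_var] != sub_goal:
            return resolve(sub_var, sub_goal, state)  # pass the sub-goal upstream
    return command

def next_command(state, goals):
    """Serialize goals: work the first unmet goal in depth-first order."""
    for var in ORDER:
        if var in goals and state[var] != goals[var]:
            return resolve(var, goals[var], state)
    return None  # every goal is satisfied

GOALS = {"driver": "off", "valve": "closed"}
OBSERVED = [  # state estimates replayed from the slides
    {"driver": "off", "valve": "open"},
    {"driver": "resettable", "valve": "open"},
    {"driver": "on", "valve": "open"},
    {"driver": "on", "valve": "closed"},
    {"driver": "off", "valve": "closed"},
]
COMMANDS = [next_command(state, GOALS) for state in OBSERVED]
```

Only the next command is computed at each step, so an unexpected state estimate (such as the jump to resettable) is handled by the very next lookup with no replanning.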
Hierarchical, Model-based Reactive Planning
Compile-time analysis:
– Compile out interactions.
– Confirm schematics are loop free.
– Depth-first number the variables.
Periodic, run-time analysis:
– Given the initial state, identify allowed transitions and assignments.
– Given an autonomous jump to a failure state, identify allowed transitions and assignments.
Run-time plan execution:
– Work conjunctive goals from outputs to inputs.
– Achieve goals serially.
– Only perform reversible transitions.
– Look up control actions and sub-goals in policies.
Complexity of Reactive Planning
Worst case per action: depth × sub-goal branching factor.
Average cost per action: sub-goal branching factor.
[Figure: goals Valve 1 = open, Valve 2 = open, Driver 1 = off, Driver 2 = off, with sub-goals Driver 1 = on, CU = on and Driver 2 = on, CU = on.]
What If Plan is Not Serializable?
What if the causal graph G contains cycles?
[Figure: Computer, Bus Control, Antenna, and two Amplifier / K-band Transmitter pairs.]
Solution:
– Isolate the cyclic components (compute SCCs).
– Compose each cycle into a single component.
The new causal graph G’ is acyclic, and the goals of G’ are serializable.
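Isolating the cyclic components amounts to computing the strongly connected components of the causal graph. A minimal sketch follows, using forward and backward reachability for clarity instead of a linear-time algorithm; the edge list is a hypothetical causal graph loosely based on the figure, not taken from the slides.

```python
def strongly_connected_components(nodes, edges):
    """Return a list of frozensets, one per SCC, in first-seen order."""
    fwd, bwd = {}, {}
    for a, b in edges:
        fwd.setdefault(a, []).append(b)
        bwd.setdefault(b, []).append(a)

    def reach(start, adjacency):
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in adjacency.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    components, assigned = [], set()
    for node in nodes:
        if node in assigned:
            continue
        # Nodes that both reach and are reached by this node share its SCC.
        component = reach(node, fwd) & reach(node, bwd)
        components.append(frozenset(component))
        assigned |= component
    return components

# Hypothetical causal graph: the amplifier and transmitter condition each other.
NODES = ["computer", "amplifier", "transmitter", "antenna"]
EDGES = [
    ("computer", "amplifier"),
    ("computer", "transmitter"),
    ("amplifier", "transmitter"),
    ("transmitter", "amplifier"),
    ("transmitter", "antenna"),
]
COMPONENTS = strongly_connected_components(NODES, EDGES)
```

Any component with more than one member is a cycle to be composed into a single variable; collapsing each such component yields the acyclic graph G’.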
Composing Cyclic Components
[Figure: Transmitter automaton with modes off and on, transitions cmd T = on and cmd T = off, and guard A = off. Amplifier automaton with modes off and on, transitions cmd A = on and cmd A = off, and guard T = on. Composed automaton with states (on T, on A), (on T, off A), (off T, off A), (off T, on A) and transitions labeled cmd T = off, cmd T = on, cmd A = off, cmd A = on, cmd A = off.]
Policy for Composed Components
[Figure: the composed Transmitter/Amplifier automaton.]
[Table: policy over the composed states (rows: current; columns: goal) for On T/On A, On T/Off A, Off T/Off A, and Off T/On A. Entries are idle, fail, or one of cmd T = on, cmd T = off, cmd A = on, cmd A = off.]
Problem: the composition grows exponentially in space usage.
Solution: use a BDD encoding (in progress).
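Composition itself is a product construction: each joint state is a tuple of member modes, and a member's transition is enabled when its guard on the other members holds. The sketch below assumes guards that are only partly legible in the slides (both transmitter transitions require A = off, and turning the amplifier on requires T = on), so treat the automata as an assumption; `compose` is an illustrative name.

```python
from itertools import product

def compose(components):
    """components: {name: (modes, [(src, command, guard, dst)])} -> joint transitions."""
    names = sorted(components)
    joint = {}
    for combo in product(*(components[n][0] for n in names)):
        state = dict(zip(names, combo))
        for name in names:
            for src, command, guard, dst in components[name][1]:
                enabled = state[name] == src and all(
                    state[var] == val for var, val in guard.items()
                )
                if enabled:
                    nxt = dict(state, **{name: dst})  # only this member changes mode
                    target = tuple(nxt[n] for n in names)
                    joint.setdefault(combo, []).append((command, target))
    return names, joint

COMPONENTS = {
    "transmitter": (
        ["off", "on"],
        [
            ("off", "cmd_T=on", {"amplifier": "off"}, "on"),
            ("on", "cmd_T=off", {"amplifier": "off"}, "off"),
        ],
    ),
    "amplifier": (
        ["off", "on"],
        [
            ("off", "cmd_A=on", {"transmitter": "on"}, "on"),
            ("on", "cmd_A=off", {}, "off"),
        ],
    ),
}
NAMES, JOINT = compose(COMPONENTS)  # joint states are (amplifier, transmitter)
TARGETS = {target for moves in JOINT.values() for _, target in moves}
```

The joint state space is the product of the member mode sets, which is the exponential growth noted above; under the assumed guards, the state with the amplifier on and the transmitter off is never a target, consistent with the fail entries in the composed policy.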
Model-based Reactive Planning
1. Compile away constraints from the model.
2. Compile away cyclic components.
3. Plan serially, pursuing the causal graph upstream.
4. Generate actions using hierarchical policies.
– Only performs reversible actions.
– Responds to failure at each step.
– Average cost per step = sub-goal branching factor.
Current Demonstration Testbeds: Air Force Tech Sat 21 flight; NASA NMP ST-7 Phase A; NASA Mercury Messenger on ground; MIT Spheres on Space Station; NASA Robonaut, X-37, ISPP; Multi-Rover Testbed; Simulated Air Vehicles.
Model-based Programming of Embedded Systems
– To survive for decades, embedded systems orchestrate complex regulatory and immune systems.
– Future systems will be programmed with models that describe themselves and their environments.
– Runtime kernels will be agile, deducing and planning by solving optimization problems with propositional constraints.
– Model-based reactive planners respond quickly to failure, while using compile-time analysis of structure to respond quickly and concisely to indirect effects.