9/14: Belief Search Heuristics
Today: Planning graph heuristics for belief search
Wed: MDPs

Heuristics for Belief-Space Planning

Evaluating Search/Planning: Theoretical
“Worst-case”
- Look at the complexity.
- Worst-case complexity of most search/planning problems is NP-complete or higher.
- What would it tell us other than “find something else easier (if less interesting) to do”?
- Consider formal restrictions on domains under which complexity may be lower.
- These restrictions may not be natural.
“Average-case”
- Average-case complexity would be better,
- but it is much harder to analyze.
- What distribution of problems to use?
- Similar issues arise in empirical analyses.

Evaluating Search/Planning: Empirical
- Look at actual performance on problems. WHICH PROBLEMS?
Random problems
- Randomly generated problems
- Which distribution? (The hardest problems may live in small phase-transition regions, as in SAT.)
- Find the phase-transition regions and generate random problems there.
- But who said such problems are at all related to problems that actually occur?
“Real” or “Benchmark” problems
- Use “real world” problems.
  - Fine as far as the customer of that problem (or the boss) is concerned, but it is not clear whether the claims will carry over to any other problems.
  - May have to do analysis to figure out what it is about that domain that makes certain approaches work well.
- Develop many “benchmark” domains inspired by various real-world problems and use them to evaluate the coverage of a planner.
  - It is easy to abstract away the critical characteristics when developing benchmarks.
  - See Cushing’s analysis of temporal planning domains.

Heuristics for Conformant Planning
- First idea: Notice that “classical planning” (which assumes full observability) is a “relaxation” of conformant planning.
  - So, the length of the classical planning solution is a lower bound (admissible heuristic) for conformant planning.
  - Further, the heuristics for classical planning are also heuristics for conformant planning (albeit probably not very informed).
- Next idea: Let us get a feel for how estimating distances between belief states differs from estimating those between states.

Three issues:
- How many states are there?
- How far is each of the states from the goal?
- How much interaction is there between states?
  - For example, if the length of the plan for taking S1 to the goal is 10, and for S2 to the goal is 10, the length of the plan for taking both to the goal could be anywhere between 10 and infinity, depending on the interactions. [Notice that we talk about “state” interactions here just as we talked about “goal interactions” in classical planning.]
- Need to estimate the length of the “combined plan” for taking all states to the goal.
- In addition to interactions between literals, as in classical planning, we also have interactions between states (belief-space planning).
[Slide aside: World’s funniest joke (in USA)]

Belief-state cardinality alone won’t be enough…
- Early work on conformant planning concentrated exclusively on heuristics that look at the cardinality of the belief state.
  - The larger the cardinality of the belief state, the higher its uncertainty, and the worse it is (for progression).
  - Notice that in regression we have the opposite heuristic: the larger the cardinality, the higher the flexibility (we are satisfied with any one of a larger set of states), and so the better it is.
- From our example in the previous slide, cardinality is only one of the three components that go into actual distance estimation.
  - For example, there may be an action that reduces the cardinality (e.g. bomb the place), but the new belief state with low uncertainty will be at infinite distance from the goal.
- We will look at planning graph-based heuristics for considering all three components.
  - (Actually, unless we look at cross-world mutexes, we won’t be considering the interaction part…)

Planning Graph Heuristic Computation
- Heuristics
  - BFS
  - Cardinality
  - Max, Sum, Level, Relaxed Plans
- Planning Graph Structures
  - Single, unioned planning graph (SG)
  - Multiple, independent planning graphs (MG)
  - Single, labeled planning graph (LUG)
    - [Bryce et al., 2004], AAAI MDP workshop
Note that in classical planning, progression didn’t quite need negative interaction analysis, because each search node was already a complete state. In belief-space planning, negative interaction analysis is likely to be more important, since the states in a belief state may interact.

Regression Search Example
Actions:
  A1: M & P => K
  A2: M & Q => K
  A3: M & R => L
  A4: K => G
  A5: L => G
Initially: (P V Q V R) & (~P V ~Q) & (~P V ~R) & (~Q V ~R) & M
Goal State: G
Clausal states generated by regression, starting from the goal:
  G
  (G V K)                          regress over A4
  (G V K V L)                      regress over A5
  (G V K V L V P) & M              regress over A1
  (G V K V L V P V Q) & M          regress over A2
  (G V K V L V P V Q V R) & M      regress over A3
Figure annotations:
- (G V K): G or K must be true before A4 for G to be true after A4.
- M: enabling precondition; it must be true before A1 was applied.
- Each clause of the final state is satisfied by a clause in the initial clausal state. Done! (5 actions)
Clausal states compactly represent disjunction over sets of uncertain literals. Yet we still need heuristics for the search.
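The 5-action answer on this slide can be cross-checked with a small brute-force search. The sketch below is not the regression procedure from the slide: it does breadth-first progression over belief states, and it assumes that each action is a conditional effect (the effect is added only in worlds where the condition holds, and the action is otherwise a no-op). The names `ACTIONS`, `progress` and `conformant_bfs` are made up for illustration.

```python
from collections import deque

# Assumed reading of the slide: action = (condition, added literal).
ACTIONS = {
    "A1": ({"M", "P"}, "K"),
    "A2": ({"M", "Q"}, "K"),
    "A3": ({"M", "R"}, "L"),
    "A4": ({"K"}, "G"),
    "A5": ({"L"}, "G"),
}

def apply_action(state, name):
    """Apply one action in one world; a no-op if the condition fails."""
    cond, effect = ACTIONS[name]
    return state | {effect} if cond <= state else state

def progress(belief, name):
    """Apply the same action in every world of the belief state."""
    return frozenset(apply_action(s, name) for s in belief)

def conformant_bfs(initial_belief, goal):
    """Shortest action sequence that achieves `goal` in every world."""
    start = frozenset(initial_belief)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        belief, plan = frontier.popleft()
        if all(goal in s for s in belief):
            return plan
        for name in ACTIONS:
            nxt = progress(belief, name)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [name]))
    return None

# Exactly one of P, Q, R holds, and M holds in every world.
INITIAL = [frozenset({"P", "M"}), frozenset({"Q", "M"}), frozenset({"R", "M"})]
plan = conformant_bfs(INITIAL, "G")
print(plan, len(plan))
```

Blind search is fine for three worlds and five actions; the heuristics on the following slides are what make larger belief spaces tractable.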

Using a Single, Unioned Graph (SG)
- Union literals from all initial states into a conjunctive initial graph level
- Minimal implementation
- Not effective: loses world-specific support information
[Figure: a single planning graph whose initial level is the union {P, Q, R, M} of the initial states {P, M}, {Q, M}, {R, M}; A1, A2, A3 add K and L, then A4 and A5 add G. Heuristic estimate = 2]
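As a rough illustration of why the unioned graph reports 2, the sketch below grows a relaxed planning graph (no delete effects, no mutexes) from the unioned initial literals of the regression example and returns the first level at which G appears. The action encoding and the function name `level_of` are assumptions made for this sketch.

```python
# Assumed encoding of the example domain: action = (preconditions, added literal).
ACTIONS = {
    "A1": ({"M", "P"}, "K"),
    "A2": ({"M", "Q"}, "K"),
    "A3": ({"M", "R"}, "L"),
    "A4": ({"K"}, "G"),
    "A5": ({"L"}, "G"),
}

def level_of(goal, initial_literals):
    """First relaxed-graph level containing `goal`, or None if unreachable."""
    literals = set(initial_literals)
    level = 0
    while goal not in literals:
        new = {eff for pre, eff in ACTIONS.values() if pre <= literals}
        if new <= literals:
            return None  # graph levelled off without reaching the goal
        literals |= new
        level += 1
    return level

worlds = [{"P", "M"}, {"Q", "M"}, {"R", "M"}]
unioned = set().union(*worlds)   # {P, Q, R, M} treated as one conjunctive state
h_sg = level_of("G", unioned)
print(h_sg)
```

The estimate is low because the union pretends that P, Q and R all hold at once, so the graph cannot tell that three different worlds each need their own support for G.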

Using Multiple Graphs (MG)
- Same-world mutexes
- Memory intensive
- Heuristic computation can be costly
- Unioning these graphs a priori would give much savings …
[Figure: one planning graph per initial state. From {P, M}: A1 gives K, then A4 gives G. From {Q, M}: A2 gives K, then A4 gives G. From {R, M}: A3 gives L, then A5 gives G.]
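One way to picture the multiple-graph idea is to extract a relaxed plan separately in each world and then aggregate. The sketch below does this for the regression example and reports three aggregates: the max and sum of the per-world plan lengths and the size of the union of the per-world plans. The extraction is a simple first-achiever scheme chosen for brevity, and `relaxed_plan` and the action encoding are assumptions, not the procedure from the cited work.

```python
# Assumed encoding of the example domain: action = (preconditions, added literal).
ACTIONS = {
    "A1": ({"M", "P"}, "K"),
    "A2": ({"M", "Q"}, "K"),
    "A3": ({"M", "R"}, "L"),
    "A4": ({"K"}, "G"),
    "A5": ({"L"}, "G"),
}

def relaxed_plan(initial, goal):
    """Set of actions in a relaxed plan for `goal` from one world, or None."""
    literals = set(initial)
    achiever = {}            # literal -> first action found that adds it
    changed = True
    while goal not in literals and changed:
        changed = False
        current = set(literals)          # snapshot = one graph level
        for name, (pre, eff) in ACTIONS.items():
            if pre <= current and eff not in literals:
                literals.add(eff)
                achiever[eff] = name
                changed = True
    if goal not in literals:
        return None
    plan, stack = set(), [goal]
    while stack:                          # backward sweep over achievers
        lit = stack.pop()
        act = achiever.get(lit)
        if act is not None and act not in plan:
            plan.add(act)
            stack.extend(ACTIONS[act][0])
    return plan

worlds = [{"P", "M"}, {"Q", "M"}, {"R", "M"}]
plans = [relaxed_plan(w, "G") for w in worlds]
h_max = max(len(p) for p in plans)
h_sum = sum(len(p) for p in plans)
h_union = len(set().union(*plans))
print(plans, h_max, h_sum, h_union)
```

Here the union counts A4 once even though two worlds use it, which is the kind of positive interaction between states that max and sum both miss.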

Using a Single, Labeled Graph (LUG) (joint work with David E. Smith)
- Labels signify the possible worlds under which a literal holds
- Action labels: conjunction of the labels of supporting literals
- Literal labels: disjunction of the labels of supporting actions
- Memory efficient
- Cheap heuristics
- Scalable
- Extensible
- Benefits from BDDs
[Figure: one planning graph over P, Q, R, M, K, L, G with actions A1 to A5, each node annotated with a label. Heuristic value = 5]
Label key (labels appearing in the figure):
  ~Q & ~R
  ~P & ~R
  ~P & ~Q
  (~P & ~R) V (~Q & ~R)
  (~P & ~R) V (~Q & ~R) V (~P & ~Q)
  True
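The label rules on this slide can be sketched with plain sets standing in for the label formulas: each label is the set of initial worlds in which a literal or action is reachable, so conjunction becomes intersection and disjunction becomes union. This is only the label propagation step; it does not perform the relaxed-plan extraction behind the heuristic value of 5, and it uses Python sets where the slide suggests BDDs. The world names `wP`, `wQ`, `wR` and the function `lug_levels` are invented for this sketch.

```python
# Assumed encoding of the example domain: action = (preconditions, added literal).
ACTIONS = {
    "A1": ({"M", "P"}, "K"),
    "A2": ({"M", "Q"}, "K"),
    "A3": ({"M", "R"}, "L"),
    "A4": ({"K"}, "G"),
    "A5": ({"L"}, "G"),
}
WORLDS = {"wP": {"P", "M"}, "wQ": {"Q", "M"}, "wR": {"R", "M"}}
ALL = frozenset(WORLDS)

def lug_levels(goal):
    """Propagate labels until `goal` is labeled with every world.

    Returns (level, labels), or (None, labels) if the graph levels off first.
    """
    label = {}
    for world, lits in WORLDS.items():
        for lit in lits:
            label[lit] = label.get(lit, frozenset()) | {world}
    level = 0
    while label.get(goal, frozenset()) != ALL:
        new = dict(label)
        for pre, eff in ACTIONS.values():
            # Action label: conjunction (intersection) of precondition labels.
            act = ALL
            for p in pre:
                act &= label.get(p, frozenset())
            # Literal label: disjunction (union) of supporting action labels.
            if act:
                new[eff] = new.get(eff, frozenset()) | act
        if new == label:
            return None, label
        label, level = new, level + 1
    return level, label

level, label = lug_levels("G")
print(level, sorted(label["K"]), sorted(label["L"]), sorted(label["G"]))
```

In this run K ends up labeled with the worlds where P or Q holds and L with the world where R holds, which corresponds to the disjunctive labels in the key above.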

What about mutexes?
- In the previous slide, we considered only relaxed plans (thus ignoring any mutexes).
- We could have considered mutexes in the individual world graphs to get better estimates of the plans in the individual worlds (call these same-world mutexes).
- We could also have considered the impact of having an action in one world on the other world.
  - Consider a patient who may or may not be suffering from disease D. There is a medicine M which, if given in the world where he has D, will cure the patient. But if it is given in the world where the patient doesn’t have disease D, it will kill him. Since giving the medicine M will have an impact in both worlds, we now have a mutex between “being alive” in world 1 and “being cured” in world 2!
  - Notice that cross-world mutexes will take into account the state interactions that we mentioned as one of the three components making up the distance estimate.
- We could compute a subset of same-world and cross-world mutexes to improve the accuracy of the heuristics…
  - …but it is not clear whether the accuracy comes at too much additional cost to have a reasonable impact on efficiency. [See Bryce et al., JAIR submission]

Connection to CGP
- CGP, the “conformant Graphplan”, builds multiple planning graphs, but also does backward search directly on the graphs to find a solution (as against using them to give heuristic estimates).
  - It has to mark same-world and cross-world mutexes to ensure soundness.

Heuristics for sensing
- We need to compare the cumulative distance of B1 and B2 to the goal with that of B3 to the goal.
- Notice that planning cost is related to plan size, while plan execution cost is related to the length of the deepest branch (or the expected length of a branch).
- If we use the conformant belief-state distance (as discussed last class), then we will be overestimating the distance (since sensing may allow us to take a shorter branch).
- Bryce [ICAPS 05, submitted] starts with the conformant relaxed plan and introduces sensory actions into the plan to estimate the cost more accurately.
[Figure: belief states B1, B2, B3]

A good presentation just on BDDs from the inventors: http://www.cs.cmu.edu/~bryant/presentations/arw00.ppt

Symbolic FSM Analysis Example
- K. McMillan, E. Clarke (CMU), J. Schwalbe (Encore Computer)
- Encore Gigamax Cache System
  - Distributed memory multiprocessor
  - Cache system to improve access time
  - Complex hardware and synchronization protocol
- Verification
  - Create a “simplified” finite state model of the system (10^9 states!)
  - Verify properties about the set of reachable states
- Bug Detected
  - Sequence of 13 bus events leading to deadlock
  - With random simulations, would require 2 years to generate the failing case
  - In the real system, would yield MTBF < 1 day

Symbolic Projection
- A set of states is a logical formula
- A transition function is also a logical formula
- Projection is a logical operation

Symbolic Manipulation with OBDDs
- Strategy
  - Represent data as a set of OBDDs
    - Identical variable orderings
  - Express the solution method as a sequence of symbolic operations
    - Sequence of constructor and query operations
    - Similar style to an on-line algorithm
  - Implement each operation by OBDD manipulation
    - Do all the work in the constructor operations
- Key Algorithmic Properties
  - Arguments are OBDDs with identical variable orderings
  - Result is an OBDD with the same ordering
  - Each step has polynomial complexity
[From Bryant’s slides]

BDDs for representing States & Transition Function
- Belief state as a BDD
- Transition function as a BDD

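Restriction Execution Example
[Figure: the argument F is a BDD over the variables a, b, c, d with terminals 0 and 1. The restriction F[b=1] removes the b node, leaving nodes a, c, d. The reduced result contains only c and d.]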
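The restriction step can be mimicked with a very small reduced decision diagram. The sketch below is an assumption-laden toy, not Bryant's package: nodes are nested tuples `(var, low, high)` with integer terminals, the function F = (a or b) and (c or d) is a made-up stand-in for the slide's argument F (the transcript does not define it), and `mk`, `build` and `restrict` are invented names. Building by full enumeration is exponential and is used only to keep the example short.

```python
ORDER = ["a", "b", "c", "d"]   # fixed variable ordering shared by all diagrams

def mk(var, low, high):
    """Create a node, skipping it when both branches agree (reduction rule)."""
    if low == high:
        return low
    return (var, low, high)

def build(fn, i=0, env=None):
    """Build a reduced diagram for fn by Shannon expansion in ORDER."""
    env = env or {}
    if i == len(ORDER):
        return int(bool(fn(**env)))
    v = ORDER[i]
    low = build(fn, i + 1, {**env, v: 0})
    high = build(fn, i + 1, {**env, v: 1})
    return mk(v, low, high)

def restrict(node, var, value):
    """Return the diagram for node with `var` fixed to `value`."""
    if isinstance(node, int):          # terminal 0 or 1
        return node
    v, low, high = node
    if v == var:
        return high if value else low
    return mk(v, restrict(low, var, value), restrict(high, var, value))

# Hypothetical argument F; chosen so that fixing b=1 makes a redundant.
F = build(lambda a, b, c, d: (a or b) and (c or d))
F_b1 = restrict(F, "b", 1)
print(F)
print(F_b1)
```

With this choice of F, fixing b=1 leaves an a node whose two branches are identical, so the reduction rule drops it and only c and d remain, which matches the shape of the reduced result on the slide.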

Don’t look beyond this point

Sensing: More things under the mat
- Sensing extends the notion of goals too.
  - Check if Rao is awake vs. wake up Rao
  - Presents some tricky issues in terms of goal satisfaction…!
- Handling quantified effects and preconditions in the presence of sensing actions
  - rm * can satisfy the effect forall files remove(file) without KNOWING what the files in the directory are!
- Sensing actions can have preconditions (as well as other causative effects)
- The problem of OVER-SENSING (sort of like the initial driver; also Sphexishness) [XII/Puccini project]
  - Handling over-sensing using local closed-world assumptions
    - Listing a file doesn’t destroy your knowledge about the size of a file, but compressing it does. If you don’t recognize this, you will always be checking the size of the file after each and every action.
- A general action may have both causative effects and sensing effects
  - A sensing effect changes the agent’s knowledge, and not the world
  - A causative effect changes the world (and may give certain knowledge to the agent)
  - A pure sensing action only has sensing effects; a pure causative action only has causative effects.
  - The recent work on conditional planning has considered mostly simplistic sensing actions that have no preconditions and only have pure sensing effects.
- Sensing has cost!

A* vs. AO* Search
- A* search finds a path in an “Or” graph
- AO* search finds an “And” path in an And-Or graph
  - AO* reduces to A* if there are no AND branches
- AO* is typically used for problem reduction search
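The relationship between the two searches shows up in the cost recurrence they optimize. The sketch below is not AO* itself (there is no heuristic guidance or incremental expansion); it only evaluates the best solution cost of a small acyclic And-Or graph, assuming an additive cost model. Each node offers alternative connectors (the OR choice), and a connector lists all children that must be solved together (the AND part). The graph encoding and the name `best_cost` are assumptions for illustration.

```python
def best_cost(node, graph, memo=None):
    """Cheapest cost to solve `node` in an acyclic And-Or graph.

    graph[node] is a list of connectors (cost, [children]); nodes that are
    absent from `graph` are treated as already-solved terminals with cost 0.
    """
    memo = {} if memo is None else memo
    if node not in graph:
        return 0
    if node not in memo:
        memo[node] = min(
            cost + sum(best_cost(child, graph, memo) for child in children)
            for cost, children in graph[node]
        )
    return memo[node]

# With an AND branch: S is solved either through A alone, or through B and C.
and_or = {
    "S": [(1, ["A"]), (2, ["B", "C"])],
    "A": [(5, ["T"])],
    "B": [(1, ["T"])],
    "C": [(1, ["T"])],
}

# Without AND branches every connector has one child, so the recurrence
# is just the shortest-path cost that A* would return.
or_only = {
    "S": [(1, ["A"]), (4, ["B"])],
    "A": [(5, ["T"])],
    "B": [(1, ["T"])],
}

print(best_cost("S", and_or), best_cost("S", or_only))
```

In planning with sensing, an observation plays the role of the AND connector, because the plan has to handle every possible outcome of the observation.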
