Feng Zhiyong Tianjin University Fall planning
What is Planning? Given: – Initial state, goal state, and actions Find: – A plan: a sequence of actions that when applied, beginning with the initial state, transforms the world into a goal state
11.1 The Planning Problem 11.2 Planning with State-Space Search 11.3 Partial-Order Planning 11.4 Planning Graphs 11.5 Planning with Propositional Logic 11.6 Analysis of Planning Approaches 11.7 Summary
Difficulties an ordinary problem-solving agent may face: ◦ The agent can be overwhelmed by irrelevant actions. ◦ It is difficult to find a good heuristic function. ◦ It is inefficient: it cannot take advantage of problem decomposition.
The agent is the sole cause of change in the environment. The world is accessible (i.e. the agent knows all it needs to know about the environment). Closed World Assumption: ◦ The state description lists all that is true ◦ Anything else is assumed false. The planning task is very difficult, even with such a simplified framework!
Dressing ◦ Initial state: socks, shoes ◦ Goal state: socks on, shoes on correct feet, ◦ Actions: PutOnSock(f), PutOnShoe(f) Blocks World ◦ Initial state: some configuration of blocks on a table ◦ Goal State: another configuration (stacked?) ◦ Actions: Pickup(x), Putdown(x), Stack(x,y), Unstack(x,y) Shopping ◦ Initial state: at home, with no items ◦ Goal state: at home, having a list of items ◦ Actions: Go(store), Buy(item), etc…
Facts: ground literals (no variables) ◦ Poor, Unknown, At(Plane, Beijing) Situations: conjunction of facts ◦ At(Plane1, Beijing) ⋀ At(Plane2, Tianjin) ◦ Poor ⋀ Unknown Goal: conjunction of positive literals ◦ Variables allowed, assume all variables are existential ◦ Rich ⋀ Famous, At(Plane1, Xi’an)
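As a minimal sketch of this representation, a state can be held as a set of positive ground literals, with the closed-world assumption making every absent literal false. The string encoding of literals and the helper names `holds` and `satisfies` are assumptions made for illustration; goals that contain variables are not handled here.

```python
# A state is the set of positive ground literals that are true.
# Under the closed-world assumption, anything not listed is false.
state = frozenset({"At(Plane1, Beijing)", "At(Plane2, Tianjin)"})


def holds(literal: str, state: frozenset) -> bool:
    """A literal is true only if it is explicitly listed in the state."""
    return literal in state


def satisfies(state: frozenset, goal: set) -> bool:
    """A ground goal (conjunction of positive literals) is met when all of them hold."""
    return goal <= state


print(holds("At(Plane1, Beijing)", state))
print(satisfies(state, {"At(Plane1, Xi'an)"}))
```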
Actions: ◦ Action name ◦ Preconditions: conjunction of positive literals that defines when the action is legal/applicable ◦ Effects: conjunction of positive literals (called the add list) and negative literals (called the delete list)
Action(Fly(p, from, to),
  PRECOND: At(p, from) ⋀ Plane(p) ⋀ Airport(from) ⋀ Airport(to)
  EFFECT: ¬At(p, from) ⋀ At(p, to))
◦ Delete list: At(p, from); add list: At(p, to) ◦ Assumption: every literal not mentioned in the effect stays the same (this avoids the frame problem)
Result of an action: ◦ The positive literals in the effect are added to the state. ◦ Any negative literals in the effect that match existing positive literals in the state make the positive literals disappear. Exceptions: ◦ Positive literals already in the state are not added again. ◦ Negative literals that match with nothing in the state are ignored.
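The add/delete semantics above can be sketched with ordinary set operations. This is only an illustration under stated assumptions: the `Action` tuple, the functions `applicable` and `result`, and the ground Fly instance are hypothetical names, and literals are encoded as plain strings.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    name: str
    precond: FrozenSet[str]  # positive literals that must hold
    add: FrozenSet[str]      # positive effect literals
    delete: FrozenSet[str]   # literals negated by the effect


def applicable(state: FrozenSet[str], action: Action) -> bool:
    return action.precond <= state


def result(state: FrozenSet[str], action: Action) -> FrozenSet[str]:
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    # Set difference ignores delete literals that match nothing;
    # set union does not duplicate literals that are already present.
    return (state - action.delete) | action.add


fly = Action(
    "Fly(P1, Beijing, Tianjin)",
    precond=frozenset({"At(P1, Beijing)", "Plane(P1)",
                       "Airport(Beijing)", "Airport(Tianjin)"}),
    add=frozenset({"At(P1, Tianjin)"}),
    delete=frozenset({"At(P1, Beijing)"}),
)
state = frozenset({"At(P1, Beijing)", "Plane(P1)",
                   "Airport(Beijing)", "Airport(Tianjin)"})
new_state = result(state, fly)
print(sorted(new_state))
```

Using sets makes the two exceptions on the slide fall out for free, which is why no special cases appear in `result`.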
The planning problem can be seen as a search problem. We can move from one state of the problem to another in both a forward and backward direction because the actions are defined in terms of both preconditions and effects. Forward search: progression planning Backward search: regression planning
Progression: Forward Chaining ◦ Like state-space search except for representation ◦ Inefficient due to large situation space to explore Regression: Backward Chaining ◦ Start from the goal state and solve its sub-goals (preconditions) ◦ More efficient and goal-directed than progression (fewer applicable operators)
[Figure: forward (progression) search vs. backward (regression) search]
The initial state of the search is the initial state from the planning problem. In general, each state will be a set of positive ground literals; literals not appearing are false. The actions that are applicable to a state are all those whose preconditions are satisfied. The successor state resulting from an action is generated by adding the positive effect literals and deleting the negative effect literals. (In the first-order case, we must apply the unifier from the preconditions to the effect literals.) Note that a single successor function works for all planning problems, a consequence of using an explicit action representation. The goal test checks whether the state satisfies the goal of the planning problem. The step cost of each action is typically 1. Although it would be easy to allow different costs for different actions, this is seldom done by STRIPS planners.
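The formulation above (initial state, applicable actions, successor function, goal test, unit step cost) can be sketched as a breadth-first progression planner. The `Action` tuple, the function `forward_search`, and the ground dressing actions below are assumptions made for illustration, not a definitive implementation.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    precond: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def forward_search(init, goal, actions) -> Optional[List[str]]:
    """Breadth-first progression search; every step costs 1."""
    init = frozenset(init)
    frontier = deque([(init, [])])
    visited = {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                    # goal test
            return plan
        for a in actions:
            if a.precond <= state:           # applicable actions only
                nxt = (state - a.delete) | a.add
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None


# Hypothetical ground actions for the dressing example.
actions = []
for foot in ("Left", "Right"):
    actions.append(Action(f"PutOnSock({foot})", frozenset(),
                          frozenset({f"SockOn({foot})"}), frozenset()))
    actions.append(Action(f"PutOnShoe({foot})", frozenset({f"SockOn({foot})"}),
                          frozenset({f"ShoeOn({foot})"}), frozenset()))

plan = forward_search(set(), {"ShoeOn(Left)", "ShoeOn(Right)"}, actions)
print(plan)
```

The same `forward_search` works unchanged for any problem expressed with these action tuples, which mirrors the point that one successor function serves all planning problems.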
Forward planning is equivalent to forward search and is very inefficient. In fact, it suffers from all the caveats of the underlying search algorithm. A better way to solve a planning problem is through backward state-space search, i.e. by starting at the goal and working our way back to the initial state. Advantage: we need only consider moves that achieve part of the goal! In STRIPS, there is no problem in finding the predecessors of a state.
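Example: the goal in our 10-airport air cargo problem. Searching backwards is sometimes called regression planning. An action A can be used to regress a goal G only if it is: ◦ Relevant: it achieves at least one literal of G ◦ Consistent: it must not undo any literal of G. The predecessor of G through A is built as follows: ◦ Any positive effects of A that appear in G are deleted. ◦ Each precondition literal of A is added, unless it already appears. In the first-order case, a substitution must also be applied.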
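The two regression rules above can be sketched as follows for ground actions. The names `relevant`, `consistent`, and `regress`, and the cargo literals in the example, are hypothetical and chosen only for illustration; substitution for the first-order case is not covered.

```python
from typing import FrozenSet, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    precond: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def relevant(goal: FrozenSet[str], action: Action) -> bool:
    """The action achieves at least one goal literal."""
    return bool(action.add & goal)


def consistent(goal: FrozenSet[str], action: Action) -> bool:
    """The action does not undo any goal literal."""
    return not (action.delete & goal)


def regress(goal: FrozenSet[str], action: Action) -> Optional[FrozenSet[str]]:
    if not (relevant(goal, action) and consistent(goal, action)):
        return None
    # Delete achieved literals, then add the preconditions (set union
    # leaves literals that already appear untouched).
    return (goal - action.add) | action.precond


unload = Action(
    "Unload(C1, P1, B)",
    precond=frozenset({"In(C1, P1)", "At(P1, B)"}),
    add=frozenset({"At(C1, B)"}),
    delete=frozenset({"In(C1, P1)"}),
)
goal = frozenset({"At(C1, B)", "At(C2, B)"})
print(sorted(regress(goal, unload)))
```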
State-space search (forward and backward) is not efficient enough. Can we perform A*-style search with an admissible heuristic? Key assumption: sub-goals are independent of each other ◦ Divide and conquer the problem without worrying about other parts of the problem, e.g. when putting on socks the order doesn’t matter: putting on the left sock first doesn’t preclude putting on the right ◦ The cost of the whole plan is the sum of the costs of all sub-plans
The sub-goal independence heuristic is: ◦ Optimistic (admissible) when the sub-plans interact negatively, i.e. an action in one sub-plan deletes a goal achieved by another sub-plan. ◦ Pessimistic (inadmissible) when sub-plans contain redundant actions
This heuristic assumes that all actions have only positive effects. For example, if an action has the effects A and ¬B, the empty-delete list heuristic considers the action as if it only had the effect A. In that way, we assume that no action can delete the literals achieved by another action.
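A small sketch of the empty-delete-list idea: drop every delete list, then measure the length of the shortest plan in the relaxed problem. Here that length is found by brute-force breadth-first search, which is an assumption made to keep the example short and only suits tiny problems. The `Action` tuple and the simplified cake actions (Bake has no precondition) are likewise hypothetical.

```python
from collections import deque
from typing import FrozenSet, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    precond: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def shortest_plan_length(init, goal, actions) -> Optional[int]:
    init = frozenset(init)
    frontier, seen = deque([(init, 0)]), {init}
    while frontier:
        state, depth = frontier.popleft()
        if goal <= state:
            return depth
        for a in actions:
            if a.precond <= state:
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return None


def empty_delete_list_heuristic(state, goal, actions) -> Optional[int]:
    # Treat every action as if it had only its positive effects.
    relaxed = [a._replace(delete=frozenset()) for a in actions]
    return shortest_plan_length(state, goal, relaxed)


eat = Action("Eat(Cake)", frozenset({"Have(Cake)"}),
             frozenset({"Eaten(Cake)"}), frozenset({"Have(Cake)"}))
bake = Action("Bake(Cake)", frozenset(), frozenset({"Have(Cake)"}), frozenset())
init, goal = {"Have(Cake)"}, {"Have(Cake)", "Eaten(Cake)"}

h = empty_delete_list_heuristic(init, goal, [eat, bake])
true_cost = shortest_plan_length(init, goal, [eat, bake])
print(h, true_cost)
```

In this example the relaxed estimate is lower than the true cost because the relaxation ignores that eating the cake deletes Have(Cake).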
Up to now, plans have been totally ordered, i.e. the exact temporal relationships between the actions are known: A_i is after A_(i-1) and before A_(i+1). In partially ordered plans, we don’t have to specify the temporal relationships between all the actions. In practice, this means that we can identify actions that can happen in any order.
Total-order planner (linear): ◦ Maintains a partial solution as a “totally ordered” list of steps found so far ◦ e.g. STRIPS ◦ e.g. Situation-space progression/regression planners Partial-order planner (non-linear): ◦ Only maintains a partial order ◦ Constraints on the ordering of steps in the plan
Principle of Least Commitment: don’t make an ordering choice unless required to do so ◦ Property of partial-order planners (POP) ◦ Not a property of situation-space planners: they commit to an ordering when an operator is applied. Keep the ordering choice as general as possible. Reduces the amount of backtracking needed ◦ Don’t waste time undoing steps
Ordering constraints: ◦ S1 < S2: S1 must occur before S2, but not necessarily immediately before it ◦ Drawn as thin links. Causal constraints (causal links): ◦ S1 --c--> S2: S1 achieves c for S2 ◦ S1 has a literal c in its effect list that is needed to satisfy part of the precondition for S2
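To illustrate what a set of ordering constraints leaves open, the sketch below enumerates every total order (linearization) consistent with a partial order. The step names and the function `linearizations` are assumptions for illustration, and brute-force permutation is used only because the example is tiny.

```python
from itertools import permutations

steps = ["LeftSock", "LeftShoe", "RightSock", "RightShoe"]
# Each pair (a, b) is an ordering constraint a < b.
orderings = {("LeftSock", "LeftShoe"), ("RightSock", "RightShoe")}


def linearizations(steps, orderings):
    """Return every total order of the steps that respects all constraints."""
    valid = []
    for order in permutations(steps):
        position = {step: i for i, step in enumerate(order)}
        if all(position[a] < position[b] for a, b in orderings):
            valid.append(order)
    return valid


plans = linearizations(steps, orderings)
print(len(plans))  # 6
```

A partial-order planner keeps the two constraints and never has to choose among these six total orders while planning.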
An action threatens a causal link when it might delete the condition that the link protects. Example: in the dynamic blocks world, pickup(a) deletes “handempty”, so it threatens the link (putdown(c,b), handempty, pickup(d)). The consequences of adding an action that breaks a causal link into the plan are serious. We have to make sure to remove the threat by demotion (ordering the threatening action before the link) or promotion (ordering it after the link).
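Threat detection and the two repairs can be sketched as below. This is a simplified illustration: a causal link is a `(producer, condition, consumer)` tuple, orderings are a set of `(before, after)` pairs, and the helper names `before`, `threatens`, and `resolutions` are hypothetical.

```python
def before(a, b, orderings):
    """True if the ordering constraints force a before b (transitively)."""
    stack, seen = [a], {a}
    while stack:
        x = stack.pop()
        for u, v in orderings:
            if u == x and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                stack.append(v)
    return False


def threatens(step, deletes, link, orderings):
    producer, condition, consumer = link
    if step in (producer, consumer) or condition not in deletes:
        return False
    # Safe only if the step is already forced outside the link's interval.
    return not (before(step, producer, orderings)
                or before(consumer, step, orderings))


def resolutions(step, link):
    producer, _, consumer = link
    return {"demotion": (step, producer),    # move the step earlier
            "promotion": (consumer, step)}   # move the step later


link = ("putdown(c,b)", "handempty", "pickup(d)")
orderings = {("putdown(c,b)", "pickup(d)")}
print(threatens("pickup(a)", {"handempty"}, link, orderings))

fixed = orderings | {resolutions("pickup(a)", link)["promotion"]}
print(threatens("pickup(a)", {"handempty"}, link, fixed))
```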
An open (i.e. unsatisfied) precondition is one that does not have a causal link to it. How is an open precondition p for step S solved? ◦ Step addition: add a new plan step R that contains p in its Effects list ◦ Simple establishment: find an existing plan step R prior to S that has p in its Effects list ◦ Then add causal and ordering links from R to S. To keep the search focused, the planner only adds steps that achieve an open precondition.
POP is sound and complete. A POP plan is a solution if: ◦ All preconditions are supported (by causal links), i.e., no open conditions. ◦ No threats ◦ Consistent temporal ordering. By construction, the POP algorithm reaches a solution plan.
“Fast Planning Through Planning Graph Analysis,” Artificial Intelligence. Propositionalize actions and situations. Construct a planning graph ◦ Levels (e.g. time steps) with potential action nodes. Include persistence actions (inactions) to deal with the frame problem ◦ Link actions to situation nodes between each level ◦ Indicate which situation descriptions are mutually exclusive with “mutex links”
Planning graphs work only for propositional planning problems
Inconsistent effects: one action negates an effect of the other. For example, Eat(Cake) and the persistence of Have(Cake) have inconsistent effects because they disagree on the effect Have(Cake). Interference: one of the effects of one action is the negation of a precondition of the other. For example, Eat(Cake) interferes with the persistence of Have(Cake) by negating its precondition. Competing needs: one of the preconditions of one action is mutually exclusive with a precondition of the other. For example, Bake(Cake) and Eat(Cake) are mutex because they compete on the value of the Have(Cake) precondition.
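The three mutex conditions can be written as small predicates. In this sketch, negative literals are encoded with a leading `~`, and the `Action` tuple, the helper `neg`, and the exact preconditions and effects given to the cake actions are assumptions made for illustration.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    name: str
    precond: FrozenSet[str]
    effects: FrozenSet[str]


def neg(literal: str) -> str:
    return literal[1:] if literal.startswith("~") else "~" + literal


def inconsistent_effects(a: Action, b: Action) -> bool:
    return any(neg(e) in b.effects for e in a.effects)


def interference(a: Action, b: Action) -> bool:
    return (any(neg(e) in b.precond for e in a.effects)
            or any(neg(e) in a.precond for e in b.effects))


def competing_needs(a: Action, b: Action, literal_mutex) -> bool:
    """literal_mutex holds frozenset pairs of literals that are mutex at the previous level."""
    return any(p == neg(q) or frozenset((p, q)) in literal_mutex
               for p in a.precond for q in b.precond)


eat = Action("Eat(Cake)", frozenset({"Have(Cake)"}),
             frozenset({"~Have(Cake)", "Eaten(Cake)"}))
bake = Action("Bake(Cake)", frozenset({"~Have(Cake)"}), frozenset({"Have(Cake)"}))
persist_have = Action("Persist(Have(Cake))", frozenset({"Have(Cake)"}),
                      frozenset({"Have(Cake)"}))

print(inconsistent_effects(eat, persist_have))
print(interference(eat, persist_have))
print(competing_needs(bake, eat, set()))
```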
Literals increase monotonically: Once a literal appears at a given level, it will appear at all subsequent levels. This is because of the persistence actions; once a literal shows up, persistence actions cause it to stay forever. Actions increase monotonically: Once an action appears at a given level, it will appear at all subsequent levels. This is a consequence of literals' increasing; if the preconditions of an action appear at one level, they will appear at subsequent levels, and thus so will the action. Mutexes decrease monotonically: If two actions are mutex at a given level Ai, then they will also be mutex for all previous levels at which they both appear. The same holds for mutexes between literals. It might not always appear that way in the figures, because the figures have a simplification: they display neither literals that cannot hold at level Si nor actions that cannot be executed at level Ai. We can see that "mutexes decrease monotonically" is true if you consider that these invisible literals and actions are mutex with everything.
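The monotonic growth of literals can be observed with a simplified expansion of the literal levels. The sketch deliberately ignores mutex links (so it only tracks which literals appear, not which can hold together), encodes negative literals with a leading `~`, and uses a hypothetical `Action` tuple and cake actions.

```python
from typing import FrozenSet, List, NamedTuple


class Action(NamedTuple):
    name: str
    precond: FrozenSet[str]
    effects: FrozenSet[str]


def literal_levels(init, actions) -> List[FrozenSet[str]]:
    """Expand literal levels S0, S1, ... until two consecutive levels are equal."""
    levels = [frozenset(init)]
    while True:
        current = levels[-1]
        nxt = set(current)               # persistence actions carry literals forward
        for a in actions:
            if a.precond <= current:     # action appears once its preconditions do
                nxt |= a.effects
        nxt = frozenset(nxt)
        if nxt == current:               # the graph has leveled off
            return levels
        levels.append(nxt)


eat = Action("Eat(Cake)", frozenset({"Have(Cake)"}),
             frozenset({"~Have(Cake)", "Eaten(Cake)"}))
bake = Action("Bake(Cake)", frozenset({"~Have(Cake)"}), frozenset({"Have(Cake)"}))

levels = literal_levels({"Have(Cake)", "~Eaten(Cake)"}, [eat, bake])
print([len(level) for level in levels])
```

Each level is a superset of the one before it, and expansion stops as soon as a level adds nothing new.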
“Planning as Satisfiability” ◦ Initial state ⋀ all possible action descriptions ⋀ goal. Recall that a planning environment can be expressed in situation calculus ◦ Axioms of the form a → b (or rather, ¬a ⋁ b). Recall that plans are considered to be a conjunction of sub-goals: ◦ Start state ⋀ axioms ⋀ goals
The basic idea with SAT-Plan: ◦ Describe the environment in situation calculus ◦ Propositionalize all the axioms (as disjunctions), enumerated for each of an arbitrary number of steps ◦ Conjoin all instantiated rules with the initial state and goal descriptions. This provides us with a propositional logic (PL) formula in CNF, which we can try to solve using hill climbing (HC), simulated annealing (SA), tabu search, genetic algorithms (GAs), etc.
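The steps above can be sketched on a one-step toy problem: a plane starts at SFO and must be at JFK at time 1. Note the substitution: instead of the local search methods named above, the CNF is solved by brute-force enumeration, which is only workable for a handful of symbols. The symbol names (`AtSFO0`, `Fly0`, and so on) and the function `solve` are assumptions for illustration.

```python
from itertools import product

# A clause is a list of (symbol, polarity) pairs; the formula is their conjunction.
clauses = [
    [("AtSFO0", True)],                                    # initial state
    [("AtJFK0", False)],
    [("AtJFK1", True)],                                    # goal
    [("Fly0", False), ("AtSFO0", True)],                   # precondition: Fly0 -> AtSFO0
    # Successor state: AtJFK1 <-> (Fly0 or AtJFK0)
    [("AtJFK1", False), ("Fly0", True), ("AtJFK0", True)],
    [("Fly0", False), ("AtJFK1", True)],
    [("AtJFK0", False), ("AtJFK1", True)],
    # Successor state: AtSFO1 <-> (AtSFO0 and not Fly0)
    [("AtSFO1", False), ("AtSFO0", True)],
    [("AtSFO1", False), ("Fly0", False)],
    [("AtSFO0", False), ("Fly0", True), ("AtSFO1", True)],
]


def solve(clauses):
    """Return the first satisfying assignment found, or None."""
    symbols = sorted({name for clause in clauses for name, _ in clause})
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if all(any(model[name] == polarity for name, polarity in clause)
               for clause in clauses):
            return model
    return None


model = solve(clauses)
plan = [name for name, value in model.items() if name.startswith("Fly") and value]
print(plan)
```

The plan is read off the model as the action symbols assigned true, here the single step `Fly0`.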
Initial state (time T0): some propositions are unknown
Time T1: successor-state axioms
KB: initial state ⋀ successor-state axioms ⋀ goal
Precondition axioms
Action exclusion axioms ◦ ¬(Fly(P2, JFK, SFO)^0 ⋀ Fly(P2, JFK, LAX)^0)
The number of clauses is large. For example, with 10 time steps, 12 planes, and 30 airports, the complete action exclusion axiom has 583 million clauses. The number of action symbols is T × |Planes| × |Airports|^2. With symbol splitting, each action symbol such as Fly(P1, SFO, JFK)^0 is reduced to a set of binary predicates, so only T × |Act| × P × |O| symbols are needed (P is the maximum action arity and |O| the number of objects). Parallel actions: e.g. Fly(P1, SFO, JFK)^0 and Fly(P2, JFK, SFO)^0
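The 583 million figure can be reproduced with a short calculation, assuming one binary exclusion clause for every unordered pair of ground Fly actions at each time step, and counting every (plane, from, to) combination as a ground action. The variable names are illustrative.

```python
from math import comb

T, planes, airports = 10, 12, 30

# Ground Fly(p, from, to) actions available at a single time step.
actions_per_step = planes * airports ** 2        # 10,800

# Action symbols over all time steps: T x |Planes| x |Airports|^2.
action_symbols = T * actions_per_step            # 108,000

# One clause per unordered pair of actions, at every time step.
exclusion_clauses = T * comb(actions_per_step, 2)

print(f"{exclusion_clauses:,}")  # 583,146,000
```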
State-space search (STRIPS) can be directed using logic, but is still incomplete. Partially-ordered planners are complete, but are practically limited in the number of steps they can accurately plan. Planning was sort of a “dead” AI research area for a while.
Since 1992, several new approaches to the planning task have been discovered (e.g. Graph-Plan and SAT-Plan) that can find plans up to thousands of steps long. D. Weld, “Recent advances in AI planning,” AI Magazine, 1999 ◦ Excellent coverage of these new approaches
Planning agents search to find a sequence of actions to achieve a goal using a flexible representation of states, operators, goals, plans ◦ The STRIPS language describes actions in terms of their preconditions and effects. It is not feasible to search through the entire space as was done with search agents ◦ Regression planning focuses the search ◦ STRIPS assumes sub-goals are independent ◦ POP uses the principle of least commitment and declobbering
Partial-Order Planning (POP) is a sound and complete planning algorithm, but can be limited by plan length. Recent advances in AI planning reduce the planning environment to other problems (graphs, SAT formulas) that can be solved using other methods.