Planning
CSE 573
A handful of general search techniques lie at the heart of practically all work in AI. We will encounter the same principles again and again in this course, whether we are talking about games, logical reasoning, or machine learning. These are principles for searching through a space of possible solutions to a problem.
Logistics
  PS1: due Sat
  Reading: Chapters 7, 8 & 11 (up through 11.2)
  Office hours: Wed 4:30, or after most classes, or email me
  Midterm: Tues 11/11
© Daniel S. Weld
573 Topics
  Reinforcement Learning
  Supervised Learning
  Planning
  Knowledge Representation & Inference: Logic-Based, Probabilistic
  Search
  Problem Spaces
  Agency
Ways to Make “Plans”
  Generative Planning
    Reason from first principles (knowledge of actions)
    Requires a formal model of actions
  Case-Based Planning
    Retrieve an old plan that worked on a similar problem
    Revise the retrieved plan for this problem
  Reinforcement Learning
    Act “randomly”, noticing effects
    Learn reward, action models, policy
Generative Planning
  Input
    Description of (initial state of) world (in some KR)
    Description of goal (in some KR)
    Description of available actions (in some KR)
  Output
    Controller
      E.g., sequence of actions
      E.g., plan with loops and conditionals
      E.g., policy = f: states -> actions
Input Representation
  Description of initial state of world
    E.g., a set of propositions:
      ((block a) (block b) (block c) (on-table a) (on-table b) (clear a) (clear b) (clear c) (arm-empty))
  Description of goal: i.e., a set of worlds or ??
    E.g., a logical conjunction; any world satisfying the conjunction is a goal:
      (and (on a b) (on b c))
  Description of available actions
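As a rough illustration of the input representation above, here is a minimal Python sketch. It assumes each proposition is encoded as a tuple of strings and a conjunctive goal as a set of such tuples. The names `initial_state`, `goal`, `satisfies`, and `stacked` are hypothetical and do not come from the slides.

```python
# Assumption: a proposition such as (on-table a) becomes the tuple ("on-table", "a").
# The initial state is copied from the slide's example set of propositions.
initial_state = frozenset({
    ("block", "a"), ("block", "b"), ("block", "c"),
    ("on-table", "a"), ("on-table", "b"),
    ("clear", "a"), ("clear", "b"), ("clear", "c"),
    ("arm-empty",),
})

# The goal (and (on a b) (on b c)) as a set of propositions that must all hold.
goal = frozenset({("on", "a", "b"), ("on", "b", "c")})


def satisfies(state, goal):
    """Return True if every goal proposition is true in the given state."""
    return goal <= state


# A hypothetical world in which the tower a-on-b-on-c has been built.
stacked = frozenset({
    ("on", "a", "b"), ("on", "b", "c"), ("on-table", "c"),
    ("clear", "a"), ("arm-empty",),
})

print(satisfies(initial_state, goal))
print(satisfies(stacked, goal))
```

Because the goal is only a partial description, many different worlds can satisfy it; `stacked` is just one of them.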
Simplifying Assumptions
  Static vs. Dynamic
  Fully Observable vs. Partially Observable
  Deterministic vs. Stochastic
  Instantaneous vs. Durative
  Perfect vs. Noisy
  Full vs. Partial satisfaction
  [Figure: agent and Environment, connected by Percepts and Actions; the agent asks "What action next?"]
Classical Planning
  Environment: Static, Fully Observable, Deterministic, Instantaneous, Perfect, Full
  I = initial state
  G = goal state
  Oi: (prec) (effects)
  [ I ] Oi Oj Ok Om [ G ]
Planning Outline
  The planning problem
  Representation
  As Search
    Forward
    Regression
    Heuristics
  Compilation to SAT
  Graphplan
  Reachability analysis & heuristics
  Planning under uncertainty
How to Represent Actions?
  Simplifying assumptions
    Atomic time
    Agent is omniscient (no sensing necessary)
    Agent is sole cause of change
    Actions have deterministic effects
  STRIPS representation
    World = set of true propositions
    Actions:
      Precondition: (conjunction of literals)
      Effects: (conjunction of literals)
How to Encode STRIPS in Logic?
Time in STRIPS Representation
  Action = function: worldState -> worldState
  Precondition says where the function is defined
  Effects say how to change the set of propositions
  Example: action north11 takes world W0 to world W1
    north11
      precond: (and (agent-at 1 1) (agent-facing north))
      effect: (and (agent-at 1 2) (not (agent-at 1 1)))
  Note: STRIPS doesn’t allow derived effects; you must be complete!
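The view of an action as a partial function on world states can be sketched in Python as follows. This is a minimal sketch under stated assumptions: preconditions are positive literals only, and the effect conjunction is split into an add set and a delete set. The names `Action`, `applicable`, `apply_action`, `w0`, and `w1` are hypothetical; only `north11` and its literals come from the slide.

```python
from typing import FrozenSet, NamedTuple, Tuple

Prop = Tuple[str, ...]


class Action(NamedTuple):
    name: str
    precond: FrozenSet[Prop]  # literals that must hold before execution
    add: FrozenSet[Prop]      # positive effects
    delete: FrozenSet[Prop]   # negated effects, i.e. (not ...)


def applicable(state, action):
    """The precondition says where the function is defined."""
    return action.precond <= state


def apply_action(state, action):
    """The effects say how to change the set of true propositions."""
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not defined in this state")
    return (state - action.delete) | action.add


north11 = Action(
    name="north11",
    precond=frozenset({("agent-at", "1", "1"), ("agent-facing", "north")}),
    add=frozenset({("agent-at", "1", "2")}),
    delete=frozenset({("agent-at", "1", "1")}),
)

w0 = frozenset({("agent-at", "1", "1"), ("agent-facing", "north")})
w1 = apply_action(w0, north11)
print(sorted(w1))
```

Nothing is derived automatically: `(agent-facing north)` survives only because no effect mentions it, and `(agent-at 1 1)` disappears only because the effect lists its negation explicitly.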
Action Schemata
  Instead of defining: pickup-A and pickup-B and …
  Define a schema:
    (:operator pick-up
      :parameters ((block ?ob1))
      :precondition (and (clear ?ob1) (on-table ?ob1) (arm-empty))
      :effect (and (not (clear ?ob1)) (not (on-table ?ob1)) (not (arm-empty)) (holding ?ob1)))
  Note: STRIPS doesn’t allow derived effects; you must be complete!
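One way to picture how a schema stands in for pickup-A, pickup-B, and so on is to instantiate its parameter with each block. The sketch below is illustrative only: the function `ground_pick_up` and the dictionary layout (with effects split into `add` and `delete` sets) are assumptions, not part of the slides.

```python
def ground_pick_up(block):
    """Instantiate the pick-up schema by substituting a block for ?ob1."""
    return {
        "name": f"pick-up-{block}",
        "precond": frozenset({("clear", block), ("on-table", block), ("arm-empty",)}),
        # Positive effect of the schema.
        "add": frozenset({("holding", block)}),
        # Negated effects of the schema: every (not ...) literal.
        "delete": frozenset({("clear", block), ("on-table", block), ("arm-empty",)}),
    }


# One ground action per block replaces writing each definition by hand.
actions = [ground_pick_up(b) for b in ["a", "b", "c"]]

for action in actions:
    print(action["name"], sorted(action["add"]))
```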
Time Arguments in Logic
  Initial conditions: On(a, b, 0), Have(bluePaint, 0), Red(a, 0)
  Goal: On(b, a, ?), Blue(a, ?)
  Closed World Assumption
Planning as Search
  Nodes: world states
  Arcs: action executions
  Initial state: the state satisfying the complete description of the initial conditions
  Goal state: any state satisfying the goal propositions
Forward-Chaining World-Space Search
  [Figure: blocks-world states over blocks A, B, C, searched from the Initial State to the Goal State]
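Forward-chaining search over world states can be sketched as a breadth-first search in which nodes are states and arcs are action executions. This is a minimal sketch, not the course's planner. The helper names `make_action` and `forward_search` are hypothetical, and the `stack` action is an assumed definition added so that the two-block example has a solution; only `pick-up` appears in the slides.

```python
from collections import deque


def make_action(name, precond, add, delete):
    """Bundle a ground STRIPS action as (name, precond, add, delete)."""
    return (name, frozenset(precond), frozenset(add), frozenset(delete))


def forward_search(initial, goal, actions):
    """Breadth-first search over world states; returns action names or None."""
    start, goal = frozenset(initial), frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:  # any state satisfying the goal propositions
            return plan
        for name, precond, add, delete in actions:
            if precond <= state:  # action is executable here
                successor = (state - delete) | add
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, plan + [name]))
    return None


blocks = ["a", "b"]
actions = []
for x in blocks:
    actions.append(make_action(
        f"pick-up-{x}",
        [("clear", x), ("on-table", x), ("arm-empty",)],
        [("holding", x)],
        [("clear", x), ("on-table", x), ("arm-empty",)],
    ))
    for y in blocks:
        if x != y:
            # Assumed definition of stack; not given in the slides.
            actions.append(make_action(
                f"stack-{x}-{y}",
                [("holding", x), ("clear", y)],
                [("on", x, y), ("clear", x), ("arm-empty",)],
                [("holding", x), ("clear", y)],
            ))

initial = [("on-table", "a"), ("on-table", "b"),
           ("clear", "a"), ("clear", "b"), ("arm-empty",)]
plan = forward_search(initial, [("on", "a", "b")], actions)
print(plan)
```

Breadth-first order is chosen here only because it is simple and returns a shortest plan; any other search strategy could be substituted.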
Backward-Chaining Search Through the Space of Partial World-States
  Problem: many possible goal states are equally acceptable. From which one does one search?
  The initial state is completely defined.
  [Figure: blocks-world configurations over blocks A, B, C, D, E, with * marks]
Planning as Search - Backward
  Nodes: a conjunctive goal (“set of goals”)
  Arcs: regression of a goal through an action definition
  Initial state: the goal of the planning problem
  Goal state: a set of goals satisfied by the planning problem’s initial description
Regression
  Regressing a goal, G, through an action, A, yields the weakest precondition G’ such that: if G’ is true before A is executed, G is guaranteed to be true afterwards.
  [Figure: G’, then A (precond, effect), then G; G’ and G each represent a set of world states]
Regression Example
  G: (and (holding C) (on A B))
  A: pick-up
    :parameters ((block ?ob1))
    :precondition (and (clear ?ob1) (on-table ?ob1) (arm-empty))
    :effect (and (not (clear ?ob1)) (not (on-table ?ob1)) (not (arm-empty)) (holding ?ob1))
  G’: (and (clear C) (on-table C) (arm-empty) (on A B))
  Disjunction preconditions
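The regression step in this example can be sketched for the simple STRIPS case. The sketch assumes goals contain only positive literals and actions have no conditional effects or disjunctive preconditions. The function `regress` and its rule that an action must add at least one goal literal are assumptions for illustration.

```python
def regress(goal, precond, add, delete):
    """Regress a conjunctive goal through a ground STRIPS action.

    Returns the regressed goal G', or None if the action cannot be the
    last step toward this goal.
    """
    goal = frozenset(goal)
    precond, add, delete = frozenset(precond), frozenset(add), frozenset(delete)
    if goal & delete:
        return None  # the action would destroy part of the goal
    if not goal & add:
        return None  # assumption: skip actions that achieve nothing in the goal
    # Literals the action achieves are dropped; its preconditions are required.
    return (goal - add) | precond


# Ground pick-up with ?ob1 = C, taken from the schema on the slide.
pick_up_c = {
    "precond": {("clear", "C"), ("on-table", "C"), ("arm-empty",)},
    "add": {("holding", "C")},
    "delete": {("clear", "C"), ("on-table", "C"), ("arm-empty",)},
}

g = {("holding", "C"), ("on", "A", "B")}
g_prime = regress(g, **pick_up_c)
print(sorted(g_prime))
```

The result contains `(on A B)` because pick-up neither achieves nor destroys it, so it must already hold before the action.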
Conditional Effects
Regressing Conditional Effects
  G: (and (at keys home) (at paycheck bank))
  G’: (and (at briefcase bank) (in keys briefcase) (not (in paycheck briefcase)) (at paycheck bank))
  [Figure: action with precond and effect, labeled bank and home]
Heuristics