A: A Unified Brand-name-Free Introduction to Planning
CSE 574: Planning & Learning (which is actually more of the former and less of the latter)
Subbarao Kambhampati

The Indian Standard Time
- Right now it is 3:45 AM in India
  - Where I was for the whole break, and only got back yesterday
- And my body thinks it is still in India
  - I could never stay awake after 3 AM
  - And the greedy Marriott closed the only half-decent coffee shop around here
So... wake me up if you see me dozing off.

Most everything will be on the course homepage.

Logistics
- Office hours: after class (4:30-5:30) and by appointment
- No "official TA"; Romeo Sanchez and Binh Minh Do will kindly provide unofficial TA support
- Caveats: graduate-level class; no textbook (you will read papers); participation is required and essential
- Evaluation (subject to change):
  - Participation (~20%): do readings before class, attend classes, take part in discussions, be a scribe for class discussions
  - Projects/homeworks (~35%): may involve using existing planners and writing new domains
  - Semester project (~20%): either a term paper or a code-based project
  - Mid-term and final (~25%)

Your introductions
- Name
- Standing
- Area(s) of interest
- Reasons, if any, for taking the course
- Do you prefer homeworks/class projects, or a semester-long individual project?

Planning: The big picture
- Synthesizing goal-directed behavior
- Planning involves:
  - Action selection; handling causal dependencies
  - Action sequencing and resource allocation (typically called SCHEDULING)
  - Depending on the problem, plans can be action sequences or "policies" (action trees, state-action mappings, etc.)

The Many Complexities of Planning
[Diagram: an agent interacts with an environment through perception and action, and must decide "What action next?" (the $$$$$$ question). The dimensions of complexity:]
- Environment: static vs. dynamic; observable vs. partially observable
- Perception: perfect vs. imperfect
- Actions: deterministic vs. stochastic; instantaneous vs. durative
- Goals: full vs. partial satisfaction

Classical Planning
[Diagram: the same agent-environment loop, specialized to a static, fully observable environment with perfect perception and deterministic actions. The problem is given by an initial state I and a goal state G; each operator O_i has preconditions and effects, and a plan is a sequence of operators O_i, O_j, O_k, O_m taking [I] to [G].]
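To make the picture above concrete, here is a minimal sketch (not from the slides) of the classical model: states are sets of propositions, operators have preconditions and effects, and a plan is a valid operator sequence from I to G. The one-block blocks-world operator at the end is a hypothetical example.

```python
# A minimal sketch of the classical (STRIPS-style) planning model.
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    prec: frozenset    # propositions that must hold to apply the operator
    add: frozenset     # propositions made true by the operator
    delete: frozenset  # propositions made false by the operator

def apply_op(state: frozenset, op: Operator) -> frozenset:
    """Progress a state through an operator (assumes op is applicable)."""
    return (state - op.delete) | op.add

def valid_plan(initial: frozenset, goal: frozenset, plan: list) -> bool:
    """Check that a plan is executable from I and achieves G."""
    state = initial
    for op in plan:
        if not op.prec <= state:
            return False
        state = apply_op(state, op)
    return goal <= state

# Hypothetical one-block blocks-world example
pickup = Operator("pickup-A",
                  frozenset({"clear-A", "handempty"}),
                  frozenset({"holding-A"}),
                  frozenset({"clear-A", "handempty"}))
I = frozenset({"clear-A", "handempty"})
G = frozenset({"holding-A"})
print(valid_plan(I, G, [pickup]))   # True
```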

"Classical Planning" assumes a static, deterministic, observable, instantaneous, propositional world. Relaxing each assumption calls for different machinery:
- Dynamic worlds: replanning / situated plans
- Durative actions: temporal reasoning; semi-MDP policies
- Continuous/numeric quantities: numeric constraint reasoning (LP/ILP)
- Stochastic actions: contingent/conformant plans, interleaved execution; MDP policies
- Partial observability: contingent/conformant plans, interleaved execution; POMDP policies

Class of 23rd January
- I am less jet-lagged (waking up only at 3 AM)
- I discovered the Side-bar café (near the law library)
- Even started sadism (homework assignments)
- In short: general sweetness and light all around

Applications (Current & Potential)
- Scheduling problems with action choices as well as resource-handling requirements
  - Problems in supply-chain management
  - HSTS (Hubble Space Telescope scheduler)
  - Workflow management
- Autonomous agents
  - RAX/PS (the NASA Deep Space planning agent)
- Software module integrators
  - VICAR (JPL image-enhancing system); CELWARE (CELCorp)
  - Test-case generation (Pittsburgh)
- Interactive decision support
  - Monitoring subgoal interactions (Optimum-AIV system)
- Plan-based interfaces
  - e.g., NLP to database interfaces
  - Plan recognition
- Web-service composition

Lots of activity... So, why the increased interest?
- Significant scale-up in the last 4-5 years
  - Before, we could synthesize plans of about 5-6 actions in minutes
  - Now, we can synthesize 100-action plans in minutes
    - Further scale-up with domain-specific control
- Significant strides in our understanding
  - Rich connections between planning and CSP, SAT, and ILP
    - Vanishing separation between planning and scheduling
  - New ideas for heuristic control of planners
  - Wide array of approaches for customizing planners with domain-specific knowledge
- New people, conferences, workshops, competitions, inter-planetary explorations

Broad Aims & Biases of the First Part
AIM: We will concentrate on planning in deterministic, quasi-static, and fully observable worlds ("neo-classical" planning).
- We will start with "classical" domains, but discuss handling durative actions and numeric constraints, as well as replanning.
BIAS: To the extent possible, we shall shun brand names and concentrate on unifying themes.
- Better understanding of existing planners: normalized comparisons between planners; evaluation of the trade-offs provided by various design choices
- Better understanding of inter-connections: hybrid planners using multiple refinements; explication of the connections between planning, CSP, SAT, and ILP

Overview of the first part
- The planning problem
  - Our focus: modeling, proving correctness
- Refinement planning: formal framework
- Conjunctive refinement planners
- Disjunctive refinement planners
  - Refinement of disjunctive plans
  - Solution extraction from disjunctive plans: direct, or compiled (SAT, CSP, ILP, BDD)
- Heuristics/optimizations
- Customizing planners
  - User-assisted customization
  - Automated customization
- Support for non-classical worlds

Why care about "classical" planning?
- Most of the recent advances occurred in neo-classical planning
- Many stabilized environments satisfy neo-classical assumptions
  - It is possible to handle minor assumption violations through replanning and execution monitoring
  - "This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology" (Boutilier, 2000)
- Techniques developed for neo-classical planning often shed light on effective ways of handling non-classical planning worlds
  - Currently, most of the efficient techniques for handling non-classical scenarios are still based on ideas/advances in classical planning

"...As such, the classical model can be viewed as a way of approximating the solution of the underlying POMDP. [...] This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology."
Also put some of the classification stuff?

The (too) many brands of classical planners
- Planning as search
  - Search in the space of states (progression, regression, MEA): STRIPS, PRODIGY, TOPI, HSP, HSP-R, UNPOP, FF
  - Search in the space of plans (total order, partial order, protections, MTC): Interplan, SNLP, TOCL, UCPOP, TWEAK
  - Search in the space of task networks (reduction of non-primitive tasks): NOAH, NONLIN, O-Plan, SIPE
- Planning as CSP/ILP/SAT/BDD: Graphplan, IPP, STAN, SATPLAN, BlackBox, GP-CSP, BDDPlan
- Planning as theorem proving: Green's planner
- Planning as model checking

A Unifying View
[Diagram organizing the course into three parts:]
- PART 1: Refinement planning (SEARCH: FSS, BSS, PS; candidate-set semantics). What are plans? What are refinements? How are sets of plans represented compactly? How are they refined? How are they searched?
  - Conjunctive refinement planning
  - Disjunctive refinement planning (graph-based, SAT, CSP, ILP, BDD)
- PART 2: CONTROL via heuristics/optimizations: reachability, relevance, relaxing subgoal interactions, directed partial consistency enforcement
- PART 3: Domain customization (hand-coded or learned): HTN schemas, TL formulas, cutting planes, case-based, abstraction-based, failure-based, domain analysis

Modeling Planning Problems: Actions, States, Correctness (PART I.0)

Transition System Perspective
- We can think of the agent-environment dynamics in terms of transition systems
  - A transition system is a 2-tuple <S, A> where
    - S is a set of states
    - A is a set of actions, with each action a being a subset of S x S
  - Transition systems can be seen as graphs, with states corresponding to nodes and actions corresponding to edges
    - If transitions are not deterministic, then the edges will be "hyper-edges", i.e., they will connect sets of states to sets of states
  - The agent may know only that its initial state is some subset S' of S
    - If the environment is not fully observable, then |S'| > 1
    - |S'| can be > 1 even in fully observable domains (if we want to find policies rather than plans)
  - It may consider some subset Sg of S as desirable states
  - Finding a plan is equivalent to finding (shortest) paths in the graph corresponding to the transition system
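A minimal sketch (not from the slides) of the last point: treat the explicit transition system <S, A> as a graph and find a shortest plan by breadth-first search. The 3-state system at the end is a hypothetical example.

```python
# Finding a plan as a shortest path in an explicit transition system <S, A>.
from collections import deque

def find_plan(init, goals, actions):
    """actions: dict mapping action name -> set of (s, s') transition pairs.
    Returns a shortest list of action names from init to any goal state, or None."""
    frontier = deque([(init, [])])
    visited = {init}
    while frontier:
        state, plan = frontier.popleft()
        if state in goals:
            return plan
        for name, transitions in actions.items():
            for (s, s2) in transitions:
                if s == state and s2 not in visited:
                    visited.add(s2)
                    frontier.append((s2, plan + [name]))
    return None

# Hypothetical 3-state system: s0 --a--> s1 --b--> s2
actions = {"a": {("s0", "s1")}, "b": {("s1", "s2")}}
print(find_plan("s0", {"s2"}, actions))   # ['a', 'b']
```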

Transition System Models
A transition system is a 2-tuple <S, A> where
- S is a set of "states"
- A is a set of "transitions"; each transition a is a subset of S x S
  - If a is a (partial) function, then it is a deterministic transition
  - Otherwise, it is a "non-deterministic" transition
  - It is a stochastic transition if there are probabilities associated with each of the states that a takes s to
- Finding plans is equivalent to finding "paths" in the transition system
Transition-system models are called "explicit state-space" models. In general, we would like to represent transition systems more compactly, e.g., with a state-variable representation of states; these are called "factored" models.
Each action in the explicit model can be represented by an incidence matrix. The set of all possible transitions will then simply be the sum of the individual incidence matrices.
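A minimal sketch (not from the slides) of the incidence-matrix view: each action is a 0/1 matrix over states, and the full transition relation is the union ("sum") of the individual matrices. The states and actions below are hypothetical.

```python
# Representing each action of an explicit transition system as an incidence matrix.
import numpy as np

states = ["s0", "s1", "s2"]
idx = {s: i for i, s in enumerate(states)}
n = len(states)

def incidence(transitions):
    """Build an n x n 0/1 matrix M with M[i, j] = 1 iff the action maps state i to state j."""
    M = np.zeros((n, n), dtype=int)
    for s, s2 in transitions:
        M[idx[s], idx[s2]] = 1
    return M

# Hypothetical actions: a moves s0 -> s1, b moves s1 -> s2
A_a = incidence([("s0", "s1")])
A_b = incidence([("s1", "s2")])

# The set of all possible transitions is the union of the individual matrices
# (the element-wise sum, clipped back to 0/1).
T = np.clip(A_a + A_b, 0, 1)
print(T)
```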

Manipulating Transition Systems

MDPs as general cases of transition systems
- An MDP (Markov Decision Process) is a general (deterministic or non-deterministic) transition system where the states have "rewards"
  - In the special case, only a certain set of "goal states" will have high rewards, and everything else will have no reward
  - In the general case, all states can have varying amounts of reward
- Planning, in the context of MDPs, is to find a "policy" (a mapping from states to actions) that has the maximal expected reward
- We will talk about MDPs later in the semester
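As a preview, here is a minimal sketch (not from the slides) of extracting such a policy for a tiny MDP by value iteration; the states, transition probabilities, and rewards are hypothetical.

```python
# Value iteration over a small MDP, returning a state -> action policy.
def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-6):
    """P[s][a] is a list of (prob, next_state); R[s] is the reward of state s."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            q = [sum(p * (R[s2] + gamma * V[s2]) for p, s2 in P[s][a]) for a in actions]
            new_v = max(q)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < eps:
            break
    return {s: max(actions, key=lambda a: sum(p * (R[s2] + gamma * V[s2]) for p, s2 in P[s][a]))
            for s in states}

# Hypothetical 2-state example: 'go' moves s0 to the rewarding state s1.
states, actions = ["s0", "s1"], ["stay", "go"]
P = {"s0": {"stay": [(1.0, "s0")], "go": [(1.0, "s1")]},
     "s1": {"stay": [(1.0, "s1")], "go": [(1.0, "s1")]}}
R = {"s0": 0.0, "s1": 1.0}
print(value_iteration(states, actions, P, R))   # s0 -> 'go'
```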

Problems with transition systems
- Transition systems are a great conceptual tool for understanding the differences between the various planning problems
- ...However, direct manipulation of transition systems tends to be too cumbersome
  - The size of the explicit graph corresponding to a transition system is often very large (see Homework 1, problem 1)
  - The remedy is to provide "compact" representations for transition systems
    - Start by explicating the structure of the "states", e.g., states specified in terms of state variables
    - Represent actions not as incidence matrices but as functions specified directly in terms of the state variables
      - An action will work in any state where some state variables have certain values; when it works, it will change the values of certain (other) state variables (see the sketch below)
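A minimal sketch (not from the slides) of this "factored" remedy: a state is an assignment to state variables, and an action mentions only the few variables it reads and writes, rather than being an incidence matrix over the whole state space. The robot/room variables are a hypothetical example.

```python
# Factored (state-variable) representation of states and actions.
def applicable(state, precond):
    """precond: dict of state-variable -> required value."""
    return all(state.get(var) == val for var, val in precond.items())

def apply_action(state, precond, effects):
    """Return the successor state; only the variables mentioned in effects change."""
    if not applicable(state, precond):
        raise ValueError("action not applicable in this state")
    new_state = dict(state)
    new_state.update(effects)
    return new_state

# Hypothetical example: a robot moving between rooms while holding a package.
state = {"robot-at": "room1", "holding": "pkg1", "door-open": True}
precond = {"robot-at": "room1", "door-open": True}
effects = {"robot-at": "room2"}
print(apply_action(state, precond, effects))
# {'robot-at': 'room2', 'holding': 'pkg1', 'door-open': True}
```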