
Section 10 Mid-term Review II

Topics

Brew the coffee!

Three operators:
1. load(x)
   precond: coffee(x), loaded(none)
   effects: loaded(x), ¬loaded(none)
2. brew(x)
   precond: loaded(x), ¬loaded(none), ¬loaded(waste)
   effects: ¬loaded(x), loaded(waste), pot(x)
3. unload(x)
   precond: loaded(x), ¬loaded(none)
   effects: ¬loaded(x), loaded(none)

Two types of coffee: caf & decaf; plus waste and none.
Initial state: coffee(caf), coffee(decaf), loaded(none)
Goal state: pot(caf), pot(decaf)
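For concreteness, these operators and the problem instance can be written down directly as data. A minimal Python sketch, using a list-of-literals encoding where a leading "-" marks a negated fluent (the encoding itself is illustrative, not from the slides):

```python
# STRIPS-style operator schemas for the coffee domain.
# A literal prefixed with "-" is a negative precondition / delete effect.
OPERATORS = {
    "load": {
        "params": ["x"],
        "precond": ["coffee(x)", "loaded(none)"],
        "effects": ["loaded(x)", "-loaded(none)"],
    },
    "brew": {
        "params": ["x"],
        "precond": ["loaded(x)", "-loaded(none)", "-loaded(waste)"],
        "effects": ["-loaded(x)", "loaded(waste)", "pot(x)"],
    },
    "unload": {
        "params": ["x"],
        "precond": ["loaded(x)", "-loaded(none)"],
        "effects": ["-loaded(x)", "loaded(none)"],
    },
}

INITIAL_STATE = {"coffee(caf)", "coffee(decaf)", "loaded(none)"}
GOAL = {"pot(caf)", "pot(decaf)"}
```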

Graphplan! (Problem 1)

Graphplan works only on propositional planning problems!
Core elements:
– Expand-Graph: grow the planning graph one level at a time, keeping track of mutex actions and mutex propositions
– Extract-Solution: search the graph backward from the goals for a valid plan

Propositionalize the PDDL

Eliminate variables by replacing them with constant symbols.

Example of a propositionalized fluent:
  loadedCaf : loaded(caf)

Example of a propositionalized action:
  brewCaf : brew(caf)
    precond: loaded(caf), ¬loaded(none), ¬loaded(waste)
    effects: ¬loaded(caf), loaded(waste), pot(caf)

Propositionalized initial state: coffeeCaf, coffeeDecaf, loadedNone
Propositionalized goal state: potCaf, potDecaf
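Grounding is mechanical once the schemas are data: substitute every constant for every variable. A sketch building on the OPERATORS dictionary above (names such as CONSTANTS are mine; useless bindings like x = none are generated too and are simply never applicable):

```python
import itertools

CONSTANTS = ["caf", "decaf", "waste", "none"]

def ground(op_name, schema, constants=CONSTANTS):
    """Yield one ground action per way of binding the schema's variables."""
    for binding in itertools.product(constants, repeat=len(schema["params"])):
        subst = dict(zip(schema["params"], binding))
        def bind(literal):
            for var, const in subst.items():
                literal = literal.replace(f"({var})", f"({const})")
            return literal
        yield {
            "name": op_name + "".join(c.capitalize() for c in binding),
            "precond": [bind(l) for l in schema["precond"]],
            "effects": [bind(l) for l in schema["effects"]],
        }

# list(ground("brew", OPERATORS["brew"]))[0] is brewCaf, with the
# propositionalized preconditions and effects shown above.
```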

Expand the Graph

[Planning-graph figure, levels P0 → A1 → P1:]
P0: coffeeCaf, coffeeDecaf, loadedNone
A1: loadCaf, loadDecaf (plus no-ops carrying each P0 proposition forward)
P1: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedNone, ¬loadedNone

Keep Track of the Mutexes

Mutex actions: two actions at the same level are mutex if they are not independent, i.e. if any of the following holds:
– Action A deletes one of action B's preconditions
– Action A deletes one of action B's positive effects
– Any pair of their preconditions is mutex at the previous level

Mutex propositions: two propositions at the same level are mutex if all pairs of producer actions (one producing each proposition) are mutex.
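These rules translate almost line for line into code. A hedged sketch over the ground-action encoding used above (prop_mutex_prev is assumed to be a set of two-element frozensets of mutex propositions at the previous level):

```python
def neg(lit):
    """Negate a literal in the "-p" encoding used above."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def actions_mutex(a, b, prop_mutex_prev):
    """Mutex test for two ground actions at the same action level."""
    for x, y in ((a, b), (b, a)):
        deletes = {neg(e) for e in x["effects"]}
        # x deletes one of y's preconditions, or negates one of y's effects.
        if deletes & set(y["precond"]) or deletes & set(y["effects"]):
            return True
    # Competing needs: some pair of preconditions is mutex one level down.
    return any(frozenset((p, q)) in prop_mutex_prev
               for p in a["precond"] for q in b["precond"])
```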

Mutex Actions and Propositions

Mutex actions in A1:
– (loadCaf, loadDecaf)
– (loadCaf, no-op for loadedNone)
– (loadDecaf, no-op for loadedNone)

Mutex propositions in P1:
– (loadedCaf, loadedDecaf)
– (loadedCaf, loadedNone)
– (loadedDecaf, loadedNone)
– (loadedNone, ¬loadedNone)

Continue Expanding the Graph

[The figure extends the planning graph by one more level:]
P0: coffeeCaf, coffeeDecaf, loadedNone
A1: loadCaf, loadDecaf (plus no-ops)
P1: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedNone, ¬loadedNone
A2: loadCaf, loadDecaf, brewCaf, brewDecaf, unloadCaf, unloadDecaf (plus no-ops)
P2: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedWaste, loadedNone, ¬loadedNone, potCaf, potDecaf

Extract Solution

Graphplan starts to extract a solution iff:
– every goal-state fluent appears in the current proposition level, and
– no pair of goal-state fluents is mutex at that level.

Graphplan gives you a valid plan, but not necessarily an optimal one (one with the minimum number of actions). Multiple actions can take place in one action level!
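The extraction trigger is a one-liner over the same encoding. A minimal sketch (goal is a set of ground literals; prop_mutex holds the mutex pairs of the current proposition level):

```python
def worth_extracting(goal, prop_level, prop_mutex):
    """True iff every goal fluent is present and no goal pair is mutex."""
    if not goal <= prop_level:
        return False
    return not any(frozenset((p, q)) in prop_mutex
                   for p in goal for q in goal if p != q)
```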

Partial-Order Planning (Problem 2)

Causal links:
  Action A:            Action B:
    precond: …           precond: p(y), …
    effects: p(x), …     effects: …
  A—p—B!  (A achieves precondition p of B)

Threats:
  Action C:
    precond: …
    effects: ¬p(z), …
  C is a threat to the A—p—B causal link!

Causal Links and Threats

Causal link example: load(x)—loaded(x)—brew(x)
Threat example: unload(x) could be a threat to the causal link above, since its effect ¬loaded(x) deletes the protected condition loaded(x).

Demotion and Promotion

Given a causal link A—p(x)—B and a threat C:
– Demotion: order C before A, giving C—A—B
– Promotion: order C after B, giving A—B—C

Example: load(x)—loaded(x)—brew(x) is a causal link, and unload(x) is a threat to it.
– Demotion: unload(x1)—load(x2)—brew(x3); possible variable bindings: x1 = waste, x2 = x3 = decaf
– Promotion: load(x1)—brew(x2)—unload(x3); possible variable bindings: x1 = x2 = decaf, x3 = waste
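Seen as operations on the plan's ordering constraints, demotion and promotion each add a single precedence pair. A sketch under the assumption that a partial plan's orderings are a set of (before, after) pairs (this representation is mine, not the slides'):

```python
def resolve_threat(orderings, link, threat):
    """Yield the two ways to protect a causal link from a threatening action.

    link: (producer, condition, consumer) triple, e.g.
          ("load", "loaded(decaf)", "brew").
    """
    producer, _condition, consumer = link
    # Demotion: the threat must finish before the link's producer runs.
    yield orderings | {(threat, producer)}
    # Promotion: the threat may only run after the link's consumer.
    yield orderings | {(consumer, threat)}
```

A planner would then discard any candidate whose ordering relation becomes cyclic and backtrack.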

HTN (Problem 3)

Serve_two_things(t)
  task: serve_coffee_and_cake(t)
  precond: table(t)
  subtasks: serve(coffee, t), serve(cake, t)

Serve_coffee(x, t)
  task: serve(x, t)
  precond: coffee(x), table(t)
  subtasks: make-coffee(x), move(x, t)

Serve_cake(x, t)
  task: serve(x, t)
  precond: cake(x), table(t)
  subtasks: make-cake(x), move(x, t)

HTN (cont'd)

Make-Caf-Coffee(x, b, m)
  task: make-coffee(x)
  precond: bean(b), caf-bean(b), coffee-maker(m), coffee(x)
  subtasks: load(b, m), brew(b, m, x)

Make-Decaf-Coffee(x, b, m)
  task: make-coffee(x)
  precond: bean(b), decaf-bean(b), coffee-maker(m), coffee(x)
  subtasks: load(b, m), brew(b, m, x)

Load(b, m) [primitive task!]
  precond: bean(b), coffee-maker(m), unloaded(m)
  effects: loaded(b, m)

Brew(b, m, x) [primitive task!]
  precond: loaded(b, m), bean(b), coffee-maker(m)
  effects: coffee(x), in(x, m)
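Like the STRIPS operators earlier, the method table is just data. A minimal sketch of its shape (the dictionary encoding is illustrative):

```python
# HTN methods: each names the compound task it decomposes, its
# preconditions, and the ordered subtasks it expands into.
METHODS = {
    "Serve_two_things": {
        "task": "serve_coffee_and_cake(t)",
        "precond": ["table(t)"],
        "subtasks": ["serve(coffee, t)", "serve(cake, t)"],
    },
    "Serve_coffee": {
        "task": "serve(x, t)",
        "precond": ["coffee(x)", "table(t)"],
        "subtasks": ["make-coffee(x)", "move(x, t)"],
    },
    "Make-Caf-Coffee": {
        "task": "make-coffee(x)",
        "precond": ["bean(b)", "caf-bean(b)", "coffee-maker(m)", "coffee(x)"],
        "subtasks": ["load(b, m)", "brew(b, m, x)"],
    },
    # Serve_cake and Make-Decaf-Coffee follow the same pattern.
}
# Primitive tasks (Load, Brew) bottom out the recursion: they carry
# preconditions and effects instead of subtasks.
```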

[Decomposition-tree figure:]
serve_coffee_and_cake(t0)
└─ Serve_two_things(t0)   precond: table(t0)
   ├─ serve(coffee, t0)
   │  └─ Serve_coffee(coffee, t0)   precond: coffee(coffee), table(t0)
   │     ├─ make-coffee(coffee)
   │     │  └─ Make-Caf-Coffee(coffee, caf-bean, machine)
   │     │     or Make-Decaf-Coffee(coffee, decaf-bean, machine)
   │     └─ move(coffee, t0)
   └─ serve(cake, t0)

MDP (Problem 4)

You are making a three-year investment plan. After your research, you find two companies you are interested in investing in: Boston Medicine and San Francisco Chips. Currently the stock price is $10 per share for Boston Medicine and $12 per share for San Francisco Chips. At the beginning of each year, you decide which company to invest in, and once you make the decision, you buy 1000 shares of that company. At the end of each year, you earn or lose money depending on whether the stock price of the company you invested in went up or down.

MDP (Problem 4)

In particular, the stock prices change according to the following transition matrices.

For Boston Medicine:

  Current price | End of year: $5 | End of year: $10 | End of year: $15
  $5            | 40%             | 40%              | 20%
  $10           | 25%             | 50%              | 25%
  $15           | 30%             | 40%              | 30%

For San Francisco Chips:

  Current price | End of year: $10 | End of year: $12 | End of year: $14
  $10           | 20%              | 60%              | 20%
  $12           | 20%              | 70%              | 10%
  $14           | 15%              | 60%              | 25%
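The two matrices are small enough to enter directly. A sketch in Python (the variable names are mine):

```python
import numpy as np

BM_PRICES = [5, 10, 15]    # Boston Medicine price levels, in dollars
SFC_PRICES = [10, 12, 14]  # San Francisco Chips price levels, in dollars

# P[i, j] = probability the price moves from level i to level j in a year.
P_BM = np.array([[0.40, 0.40, 0.20],
                 [0.25, 0.50, 0.25],
                 [0.30, 0.40, 0.30]])
P_SFC = np.array([[0.20, 0.60, 0.20],
                  [0.20, 0.70, 0.10],
                  [0.15, 0.60, 0.25]])

# Each row is a probability distribution and must sum to one.
assert np.allclose(P_BM.sum(axis=1), 1.0)
assert np.allclose(P_SFC.sum(axis=1), 1.0)
```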

MDP (Problem 4)

States? – pairs (BM price, SFC price); 3 × 3 = 9 states in all
Actions? – BM, SFC (which company to invest in this year)
Rewards?
– for action BM: (currPriceBM − prevPriceBM) × 1000
– for action SFC: (currPriceSFC − prevPriceSFC) × 1000

MDP (Problem 4)

Transitions? The two stocks move independently, so each entry of the joint transition matrix is the product of the two per-stock probabilities:

  P((b, s) → (b′, s′)) = P_BM(b → b′) × P_SFC(s → s′)

For example, from state (BM = $5, SFC = $10), the row of the joint matrix is
  0.4×0.2, 0.4×0.6, 0.4×0.2, 0.4×0.2, 0.4×0.6, 0.4×0.2, 0.2×0.2, 0.2×0.6, 0.2×0.2
over the next states (5,10), (5,12), (5,14), (10,10), (10,12), (10,14), (15,10), (15,12), (15,14).
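Putting the pieces together: a hedged sketch of solving the three-year problem by backward induction (finite-horizon value iteration). It reuses BM_PRICES, SFC_PRICES, P_BM, and P_SFC from the sketch above; everything else is an illustrative assumption:

```python
import itertools

# Joint states: (BM price index, SFC price index); 3 x 3 = 9 states.
STATES = list(itertools.product(range(3), range(3)))

def backward_induction(horizon=3):
    """Compute the optimal year-dependent investment policy."""
    V = {s: 0.0 for s in STATES}  # value-to-go after the final year
    policy = {}
    for year in reversed(range(horizon)):
        V_new = {}
        for b, s in STATES:
            q = {}
            for action in ("BM", "SFC"):
                total = 0.0
                for b2, s2 in STATES:
                    p = P_BM[b, b2] * P_SFC[s, s2]  # independent stocks
                    if action == "BM":
                        reward = 1000 * (BM_PRICES[b2] - BM_PRICES[b])
                    else:
                        reward = 1000 * (SFC_PRICES[s2] - SFC_PRICES[s])
                    total += p * (reward + V[(b2, s2)])
                q[action] = total
            best = max(q, key=q.get)
            policy[(year, (b, s))] = best
            V_new[(b, s)] = q[best]
        V = V_new
    return V, policy

# Initial state: BM at $10 (index 1), SFC at $12 (index 1).
V, policy = backward_induction()
print(policy[(0, (1, 1))], V[(1, 1)])
```

Because the horizon is finite, the best action can depend on the year as well as the state, which is why the policy is indexed by (year, state) rather than by state alone.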

Logic-Based vs. Decision-Theoretic

Decision theory:
– Utilities (rewards)
– Uncertainties (transition probabilities)
– Views the world as states
– A policy defines, for each state, which action to take

Logic-based (propositional, PDDL):
– A goal state we want to reach
– Actions with preconditions and deterministic effects
– Factored state representation
– In HTNs, a hierarchical representation of tasks

Which approach would you use?

What approach would you use to model each of the following planning problems? If both options seem reasonable, explain the advantages and limitations of each:
– Planning how your team should work on a class project
– Programming a robot that participates in RoboCup
– Deciding where to eat on campus every day

Other Questions

Assume that we wanted to model what to eat in the dining room every day using an MDP. We defined the states as the available options, and we defined rewards based on our food preferences, taking into account other considerations such as not wanting to eat the same food two days in a row.
– How would you go about defining the transition function?
– If we use an optimal algorithm like value iteration to solve our MDP, are we guaranteed to get the optimal policy?