Logic for Artificial Intelligence


Logic for Artificial Intelligence: Dynamics and Actions. Yi Zhou

Content: Dealing with dynamics; Production rules and subsumption architecture; Situation calculus and AI planning; Markov decision process; Conclusion

Content: Dealing with dynamics; Production rules and subsumption architecture; Situation calculus and AI planning; Markov decision process; Conclusion

Need for Reasoning about Actions: The world is dynamic; agents need to perform actions; programs are actions.

Content: Dealing with dynamics; Production rules and subsumption architecture; Situation calculus and AI planning; Markov decision process; Conclusion

Production rules: If condition then action.
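
To make the condition-action loop concrete, here is a minimal sketch of a forward-chaining production-rule interpreter in Python. The Rule class, the run_production_system helper, and the thermostat-style facts are illustrative assumptions, not something taken from the slides.

```python
# Minimal production-rule interpreter: repeatedly fire the first rule whose
# condition matches the working memory, until no rule applies (quiescence).
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Rule:
    name: str
    condition: Callable[[Set[str]], bool]   # "if" part, tested against working memory
    action: Callable[[Set[str]], None]      # "then" part, may add or remove facts

def run_production_system(rules, memory, max_cycles=100):
    for _ in range(max_cycles):
        fired = False
        for rule in rules:
            if rule.condition(memory):
                rule.action(memory)
                fired = True
                break                        # simple conflict resolution: first match wins
        if not fired:
            break                            # no rule applicable: stop
    return memory

# Example: a thermostat-style rule base (hypothetical facts).
rules = [
    Rule("cool", lambda m: "temp>20" in m and "fan_on" not in m,
                 lambda m: m.add("fan_on")),
    Rule("stop", lambda m: "temp<=20" in m and "fan_on" in m,
                 lambda m: m.discard("fan_on")),
]
print(run_production_system(rules, {"temp>20"}))   # -> {'temp>20', 'fan_on'}
```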

Subsumption Architecture

Content: Dealing with dynamics; Production rules and subsumption architecture; Situation calculus and AI planning; Markov decision process; Conclusion

Formalizing Actions: pre-condition, action, post-condition. Pre-condition: conditions that hold before the action. Post-condition: conditions that hold after the action. Pre-requisites: conditions that must hold in order to execute the action. Pre-condition vs pre-requisite: temp > 20 (a condition that happens to hold) vs the temperature sensor is working (a condition required for the action to be executable at all).

Situations, Actions and Fluents. On(A,B): A is on B (eternally). On(A,B,S0): A is on B in situation S0. Holds(On(A,B),S0): On(A,B) "holds" in situation S0; here On(A,B) is called a fluent, and Holds is a "meta-predicate". A fluent is a situation-dependent predicate. A situation (or state) is either a start state, e.g. S = S0, or the result of applying an action A in a state S, e.g. S2 = do(S1,A).

Situation Calculus Notation. Clear(u,s) ≡ Holds(Clear(u),s); On(x,y,s) ≡ Holds(On(x,y),s). Holds is a meta-predicate, On(x,y) is a fluent, s is a state (situation). Negative effect axioms / frame axioms are handled by default (negation as failure).

SitCalc examples. Actions: move(A,B,C) moves block A from B to C. Fluents: On(A,B) (A is on B), Clear(A) (A is clear). Predications: Holds(Clear(A),S0) (A is clear in the start state S0); Holds(On(A,B),S0) (A is on B in S0); Holds(On(A,C),do(S0,move(A,B,C))) (A is on C after move(A,B,C) is done in S0); Holds(Clear(A),do(S0,move(A,B,C))) (A is still clear after doing move(A,B,C) in S0).

Composite actions. Holds(On(B,C), do(do(S0, move(A,B,Table)), move(B,C))): B is on C after starting in S0 and doing move(A,B,Table) followed by move(B,C). Alternative representation: Holds(On(B,C), PlanResult([move(A,B,Table), move(B,C)], S0)).

Using Resolution to Find a Plan. We can verify Holds(On(B,C), do(do(S0, move(A,B,Table)), move(B,C))). But we can also find a plan by querying with a variable: ?- Holds(On(B,C), X). yields X = do(do(S0, move(A,B,Table)), move(B,C)).
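
As a rough illustration of how answering the query ?- Holds(On(B,C), X) amounts to searching over situation terms, the Python sketch below hand-codes a small blocks world and enumerates nested do(...) terms breadth-first. The state encoding and helper names are assumptions made for illustration; it is a search-based stand-in, not the resolution procedure itself.

```python
# Enumerate situation terms do(do(S0, a1), a2) ... breadth-first until the goal
# fluent holds, mimicking the answer to ?- Holds(On(B,C), X).
from collections import deque

BLOCKS = {"A", "B", "C"}

def clear(x, state):
    return all(on[1] != x for on in state)            # nothing is on top of x

def moves(state):
    """All applicable move(x, frm, to) actions in a state (a frozenset of On pairs)."""
    for (x, frm) in state:
        if not clear(x, state):
            continue
        for to in BLOCKS | {"Table"}:
            if to not in (x, frm) and (to == "Table" or clear(to, state)):
                yield ("move", x, frm, to), frozenset(state - {(x, frm)} | {(x, to)})

def find_plan(s0, goal_fact):
    frontier = deque([("S0", s0)])
    while frontier:
        situation, state = frontier.popleft()
        if goal_fact in state:
            return situation                           # a nested do(...) term
        for action, nxt in moves(state):
            frontier.append((("do", situation, action), nxt))
    return None

S0 = frozenset({("A", "B"), ("B", "Table"), ("C", "Table")})   # A on B; B, C on the table
print(find_plan(S0, ("B", "C")))   # finds do(do(S0, move(A,B,Table)), move(B,Table,C))
```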

Frame, Ramification, Qualification. Frame problem: what remains unchanged after the action? Ramification problem: what is implicitly changed by the action? Qualification problem: how many pre-requisites must be listed for an action?

AI Planning Languages. Languages must represent states, goals, and actions. Languages must be expressive, for ease of representation, and flexible, for manipulation by algorithms.

State Representation. A state is represented by a conjunction of positive literals. Using logical propositions: Poor ∧ Unknown. Using FOL literals: At(Plane1,OMA) ∧ At(Plane2,JFK). FOL literals must be ground and function-free; not allowed: At(x,y) or At(Father(Fred),Sydney). Closed World Assumption: whatever is not stated is assumed false.

Goal Representation. A goal is a partially specified state. A state satisfies a goal if it contains all the atoms of the goal (and possibly others). Example: Rich ∧ Famous ∧ Miserable satisfies the goal Rich ∧ Famous.

Action Representation At(WHI,LNK),Plane(WHI), Airport(LNK), Airport(OHA) Action Schema Action name Preconditions Effects Example Action(Fly(p,from,to), Precond: At(p,from)  Plane(p)  Airport(from)  Airport(to) Effect: At(p,from)  At(p,to)) Sometimes, Effects are split into Add list and Delete list Fly(WHI,LNK,OHA) At(WHI,OHA),  At(WHI,LNK)

Applying an Action. Find a substitution θ that matches the variables of all the precondition literals with (a subset of) the literals in the current state description. Apply θ to the literals in the effect list, then update the current state description with the result (adding positive effects, removing negated ones) to obtain the new state. Example: current state At(P1,JFK) ∧ At(P2,SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO). It satisfies the precondition with θ = {p/P1, from/JFK, to/SFO}, so the action Fly(P1,JFK,SFO) is applicable. The new state is At(P1,SFO) ∧ At(P2,SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO).
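
A small Python sketch of this add/delete update for a ground STRIPS action may help; the Action tuple, the literal encoding, and the Fly(P1,JFK,SFO) instance below are illustrative assumptions rather than anything prescribed by the slides.

```python
# Apply a ground STRIPS action: check preconditions, then delete and add literals.
from typing import NamedTuple, FrozenSet, Tuple

Literal = Tuple[str, ...]                 # e.g. ("At", "P1", "JFK")

class Action(NamedTuple):
    name: str
    precond: FrozenSet[Literal]
    add: FrozenSet[Literal]
    delete: FrozenSet[Literal]

def apply_action(state: FrozenSet[Literal], act: Action) -> FrozenSet[Literal]:
    if not act.precond <= state:
        raise ValueError(f"{act.name} is not applicable in this state")
    return (state - act.delete) | act.add

# Ground instance Fly(P1, JFK, SFO) of the Fly(p, from, to) schema.
fly = Action(
    name="Fly(P1,JFK,SFO)",
    precond=frozenset({("At", "P1", "JFK"), ("Plane", "P1"),
                       ("Airport", "JFK"), ("Airport", "SFO")}),
    add=frozenset({("At", "P1", "SFO")}),
    delete=frozenset({("At", "P1", "JFK")}),
)

state = frozenset({("At", "P1", "JFK"), ("At", "P2", "SFO"),
                   ("Plane", "P1"), ("Plane", "P2"),
                   ("Airport", "JFK"), ("Airport", "SFO")})
print(sorted(apply_action(state, fly)))
```

Lifting this to action schemas would add the substitution step: match the schema's precondition literals against the state to bind p, from, and to before grounding the add and delete lists.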

Languages for Planning Problems. STRIPS (Stanford Research Institute Problem Solver): historically important. ADL (Action Description Language): see Table 11.1 for STRIPS versus ADL. PDDL (Planning Domain Definition Language): revised and enhanced for the needs of the International Planning Competition, currently version 3.1.

State-Space Search (1). Search the space of states (as in the first chapters): initial state, goal test, step cost, etc. Actions are the transitions between states. Actions are invertible (why?). Moving forward from the initial state: forward state-space search, or progression planning. Moving backward from the goal state: backward state-space search, or regression planning.
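
For concreteness, here is a sketch of progression (forward) state-space search as plain breadth-first search over ground STRIPS states; the toy two-flight domain and the function name forward_search are invented for illustration.

```python
# Progression planning: breadth-first search forward from the initial state,
# applying ground STRIPS actions until all goal literals are satisfied.
from collections import deque

def forward_search(init, goal, actions):
    """init: frozenset of literals; goal: set of literals; actions: list of
    (name, precond, add, delete) tuples with frozenset components."""
    frontier = deque([(init, [])])
    visited = {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for name, precond, add, delete in actions:
            if precond <= state:
                nxt = (state - delete) | add
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None

# Tiny two-flight example (hypothetical ground actions).
A = ("At", "P1", "A"); B = ("At", "P1", "B"); C = ("At", "P1", "C")
actions = [
    ("Fly(P1,A,B)", frozenset({A}), frozenset({B}), frozenset({A})),
    ("Fly(P1,B,C)", frozenset({B}), frozenset({C}), frozenset({B})),
]
print(forward_search(frozenset({A}), {C}, actions))   # -> ['Fly(P1,A,B)', 'Fly(P1,B,C)']
```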

State-Space Search (2)

State-Space Search (3). Remember that the language has no function symbols, so the number of states is finite and we can use any complete search algorithm (e.g., A*); we then need an admissible heuristic. The solution is a path, i.e. a sequence of actions: total-order planning. Problem: space and time complexity. STRIPS-style planning is PSPACE-complete unless actions have only positive preconditions and only one literal effect.

STRIPS in State-Space Search. The STRIPS representation makes it easy to focus on 'relevant' propositions and to work backward from the goal (using effects) or forward from the initial state (using preconditions), facilitating bidirectional search.

Heuristic to Speed up Search. We can use A*, but we need an admissible heuristic. Divide and conquer: the sub-goal independence assumption. Problem relaxation by removing... all preconditions; all preconditions and negative effects; or negative effects only (Empty-Delete-List).
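
One way to read the Empty-Delete-List idea: plan in a relaxed problem whose actions keep their add lists but lose their delete lists, and use the length of an optimal relaxed plan as the heuristic value. A rough, brute-force sketch (using the same invented state encoding as the earlier snippets):

```python
# Empty-Delete-List heuristic: length of an optimal plan in the relaxed problem
# where actions never delete anything (states only grow), found by breadth-first search.
from collections import deque

def empty_delete_list_h(state, goal, actions):
    frontier = deque([(frozenset(state), 0)])
    visited = {frozenset(state)}
    while frontier:
        s, depth = frontier.popleft()
        if goal <= s:
            return depth                       # optimal relaxed plan length: admissible estimate
        for _name, precond, add, _delete in actions:
            if precond <= s:
                nxt = s | add                  # delete list ignored
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, depth + 1))
    return float("inf")
```

Computing the optimal relaxed plan exactly is itself expensive, so practical planners approximate it; the brute-force version above is only meant to make the relaxation explicit.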

Typical Planning Algorithms: state-space search; SATPlan and ASP-based planning; partial-order planning; GraphPlan.

AI Planning - Extensions: disjunctive planning, conformant planning, temporal planning, conditional planning, probabilistic planning, ...

Content: Dealing with dynamics; Production rules and subsumption architecture; Situation calculus and AI planning; Markov decision process; Conclusion

Decision Theory. Probability theory describes what an agent should believe based on evidence; utility theory describes what an agent wants; decision theory describes what an agent should do. Probability Theory + Utility Theory = Decision Theory. MDPs fall under the blanket of decision theory.

Markov Assumption (Andrei Markov, 1913). kth-order Markov process: the next state's conditional probability depends only on a finite history of the previous k states. 1st-order Markov process: the next state's conditional probability depends only on the immediately previous state. The definitions are equivalent: a kth-order process can be turned into a 1st-order one by augmenting the state to include the last k states, so any algorithm that makes the 1st-order Markov assumption can be applied to any Markov process.

Markov Decision Process The specification of a sequential decision problem for a fully observable environment that satisfies the Markov Assumption and yields additive costs.

Markov Decision Process. An MDP has: a set of states S = {s1, s2, ..., sN}; a set of actions A = {a1, a2, ..., aM}; a real-valued cost function g(s, a); and a transition probability function p(s' | s, a). Note: we will assume the stationary Markov transition property, which states that the effect of an action is independent of time.

Notation. k indexes discrete time. x_k is the state of the system at time k. μ_k(x_k) is the control variable to be selected given that the system is in state x_k at time k, with μ_k : S_k → A_k. π is a policy, π = {μ_0, ..., μ_{N-1}}; π* is the optimal policy. N is the horizon, i.e. the number of times the control is applied. The system dynamics are x_{k+1} = f(x_k, μ_k(x_k)), for k = 0, ..., N-1.

Policy. A policy is a mapping from states to actions. Following a policy: (1) determine the current state x_k; (2) execute action μ_k(x_k); (3) repeat steps 1 and 2.

Solution to an MDP. The expected cost of a policy π = {μ_0, ..., μ_{N-1}} starting at state x_0 is J_π(x_0) = E[ Σ_{k=0..N-1} g(x_k, μ_k(x_k)) ]. Goal: find the policy π* which specifies which action to take in each state so as to minimise this cost function. This is encapsulated by Bellman's equation, J*(x) = min_a [ g(x, a) + Σ_{x'} p(x' | x, a) J*(x') ]. A Markov Decision Process (MDP) is just like a Markov chain, except that the transition matrix depends on the action taken by the decision maker (agent) at each time step. The agent receives a reward, which depends on the action and the state. The goal is to find a function, called a policy, which specifies which action to take in each state, so as to maximize some function (e.g., the mean or expected discounted sum) of the sequence of rewards. One can formalize this in terms of Bellman's equation, which can be solved iteratively using policy iteration. The unique fixed point of this equation is the optimal value function, from which the optimal policy is read off.

Assigning Costs to Sequences. The objective cost function maps (possibly infinite) sequences of stage costs to single real numbers. Options: set a finite horizon and simply add the costs; if the horizon is infinite, i.e. N → ∞, some possibilities are to discount so as to prefer earlier costs, or to average the cost per stage.

MDP Algorithms: Value Iteration. For each state s, select any initial value J_0(s). Then, while k is below the maximum number of iterations: for each state s, find the action a that minimises g(s, a) + Σ_{s'} p(s' | s, a) J_{k-1}(s'), set J_k(s) to that minimum and assign μ(s) = a; then set k = k + 1.
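
A compact Python sketch of this loop on a throwaway two-state, two-action MDP; the cost numbers, the transition probabilities, and the 0.9 discount factor are invented for illustration.

```python
# Value iteration for a cost-minimising MDP:
# J(s) <- min_a [ g(s,a) + gamma * sum_s' p(s'|s,a) * J(s') ]
STATES = ["s1", "s2"]
ACTIONS = ["a1", "a2"]
GAMMA = 0.9                                   # discount factor (assumed)

g = {("s1", "a1"): 2.0, ("s1", "a2"): 0.5,    # stage costs g(s, a)
     ("s2", "a1"): 1.0, ("s2", "a2"): 3.0}
p = {("s1", "a1"): {"s1": 0.8, "s2": 0.2},    # transition probabilities p(s'|s,a)
     ("s1", "a2"): {"s1": 0.1, "s2": 0.9},
     ("s2", "a1"): {"s1": 0.5, "s2": 0.5},
     ("s2", "a2"): {"s1": 1.0, "s2": 0.0}}

def value_iteration(max_iters=1000, tol=1e-8):
    J = {s: 0.0 for s in STATES}
    policy = {}
    for _ in range(max_iters):
        new_J = {}
        for s in STATES:
            costs = {a: g[s, a] + GAMMA * sum(p[s, a][t] * J[t] for t in STATES)
                     for a in ACTIONS}
            best = min(costs, key=costs.get)
            policy[s], new_J[s] = best, costs[best]
        if max(abs(new_J[s] - J[s]) for s in STATES) < tol:
            return new_J, policy              # values converged
        J = new_J
    return J, policy

print(value_iteration())
```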

MDP Algorithms: Policy Iteration. Start with a randomly selected initial policy, then refine it repeatedly. Value determination: solve the |S| simultaneous Bellman equations for the current policy. Policy improvement: for any state, if an action exists which reduces the current estimated cost, change the policy at that state. Each step of policy iteration is computationally more expensive than a step of value iteration, but policy iteration typically needs fewer steps to converge.
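
And a matching sketch of policy iteration on the same style of toy MDP, using numpy to solve the |S| simultaneous Bellman equations for the current policy; again the numbers and the 0.9 discount are invented for illustration.

```python
# Policy iteration: alternate exact policy evaluation (solve a linear system)
# and greedy policy improvement, until the policy no longer changes.
import numpy as np

GAMMA = 0.9
g = np.array([[2.0, 0.5],                     # g[s, a]: stage cost
              [1.0, 3.0]])
P = np.array([[[0.8, 0.2], [0.1, 0.9]],       # P[s, a, s']: transition probabilities
              [[0.5, 0.5], [1.0, 0.0]]])
n_states, n_actions = g.shape

def policy_iteration():
    policy = np.zeros(n_states, dtype=int)    # start from an arbitrary policy
    while True:
        # Value determination: solve (I - gamma * P_pi) J = g_pi for the current policy.
        P_pi = P[np.arange(n_states), policy]          # shape (S, S)
        g_pi = g[np.arange(n_states), policy]          # shape (S,)
        J = np.linalg.solve(np.eye(n_states) - GAMMA * P_pi, g_pi)
        # Policy improvement: greedy (cost-minimising) action with respect to J.
        Q = g + GAMMA * (P @ J)                        # shape (S, A)
        new_policy = Q.argmin(axis=1)
        if np.array_equal(new_policy, policy):
            return J, policy
        policy = new_policy

print(policy_iteration())
```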

Content: Dealing with dynamics; Production rules and subsumption architecture; Situation calculus and AI planning; Markov decision process; Conclusion

More approaches: decision theory and game theory; event calculus and fluent calculus; POMDPs; decision trees; ...

Concluding Remarks. Modeling dynamics and action selection is important. Rule-based approaches: production rules, subsumption architecture. Classical-logic-based approaches: situation calculus, AI planning. Probabilistic approaches: MDPs, decision theory. Game-theoretical approaches.

Thank you!