
1 Module 5 PLANNING AND LEARNING

2 5.1 PLANNING

3 Planning
The task of coming up with a sequence of actions that will achieve a goal, when the environment is accessible and the goal is known. The agent must be able to deal with the case where the goal is infeasible or the action plan is empty. Planning involves: dividing the complex problem into simpler ones; identifying objects and their descriptions; identifying operators and predicates; and planning the solution. Planning is the same as problem solving, at an abstract level.

4 Planning Involves
Given knowledge about the task domain (actions) and a problem specified by an initial state configuration and goals to achieve, the agent tries to find a solution, i.e. a sequence of actions that solves the problem. (Figure: an agent moving from Room 1 to Room 2.)

5 Notions
Plan: a sequence of actions transforming the initial state into a final state. Operators represent actions. A planner algorithm generates a plan from a (partial) description of the initial and final states and from a specification of the operators. (Figure: the agent in Room 1 plans to go to the can and then to the basket in Room 2.)

6 A Simple Planning Agent: Problem Solving Agent + Knowledge Based Agent
(Generates sequences of actions to perform tasks and achieve objectives.) Problem solving agent: considers the consequences of sequences of actions before acting (uses search algorithms). Knowledge based agent: can select actions based on explicit logical representations of the current state and the effects of actions (uses logic). A simple planning agent: 1. Generate a goal to achieve. 2. Construct a plan to achieve the goal from the current state. 3. Execute the plan until finished. 4. Begin again with a new goal.
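A minimal sketch of this control loop in Python, assuming hypothetical helper callables state_description, make_goal_query and ideal_planner that play the roles of the STATE-DESCRIPTION, MAKE-GOAL-QUERY and IDEAL-PLANNER functions described on slide 8:

    # Sketch of the simple planning agent loop (slide 6). The three helper
    # callables are assumptions standing in for STATE-DESCRIPTION,
    # MAKE-GOAL-QUERY and IDEAL-PLANNER; they are not defined in the slides.
    def planning_agent_step(percept, kb, plan,
                            state_description, make_goal_query, ideal_planner):
        state = state_description(percept)          # 1. describe the current state
        if not plan:                                # 2. no plan yet: pick a goal and plan for it
            goal = make_goal_query(kb)
            plan = ideal_planner(state, goal)       #    may be [] if the goal already holds
        if plan:                                    # 3. execute the plan one action at a time
            action, plan = plan[0], plan[1:]
        else:
            action = None                           # 4. nothing left: a new goal is chosen next cycle
        return action, plan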


8 STATE-DESCRIPTION: uses a percept as input and returns a description of the initial state in the format required by the planner. IDEAL-PLANNER: the planning algorithm; it can be any planner. MAKE-GOAL-QUERY: asks the knowledge base what the next goal will be. The agent in the above algorithm has to check whether the goal is feasible and whether the complete plan is empty. If the goal is not feasible, the agent ignores it and tries another goal. If the complete plan is empty, then the initial state was also the goal state.

9 Assumptions Atomic time: each action is indivisible
No concurrent actions allowed. Deterministic actions: the result of each action is completely determined by the definition of the action, and there is no uncertainty in performing it in the world. The agent is the sole cause of change in the world. The agent is omniscient: it has complete knowledge of the state of the world. Closed world assumption: everything known to be true in the world is included in a state description; anything not listed is false.

10 Problem Solving and Planning
Similarity: both construct plans that achieve goals and then execute them. Difference in formulation: in the problem-solving view, the initial state corresponds to the initial situation, the goal test predicate corresponds to the goal state description, the successor function is computed from the set of operators, and once a goal is found the solution plan is the sequence of operators on the path from the start node to the goal node.

11 Basic elements of problem-solving
Representation of actions: programs that generate successor state descriptions. Representation of states: e.g. the position of the boat and the 6 people in the missionaries and cannibals problem, or the permutation of pieces in the 8-puzzle problem. Representation of goals. Representation of plans.

12 Example: Shopping problem
“Get a quart of milk, a bunch of bananas and a variable-speed cordless drill.” We need to define the initial state: the agent is at home without any of the objects that it wants; the operator set: everything the agent can do; and a heuristic function: the number of things that have not yet been acquired.

13 It is evident from the above figure that the actual branching factor would be in the thousands or millions. The heuristic evaluation function can only choose among states to determine which one is closer to the goal; it cannot eliminate actions from consideration. The agent makes guesses by considering actions, and the evaluation function ranks those guesses; the agent picks the best guess, but then has no idea what to try next and therefore starts guessing again. It considers sequences of actions beginning from the initial state, so the agent is forced to decide what to do in the initial state first, where the possible choices are to go to any of the next places. Until the agent decides how to acquire the objects, it can't decide where to go.

14 Problem solver in the figure above
Too many branches; too many actions and states for the heuristic evaluation function. Problem-solving agent: considers sequences of actions from the initial state; must decide what to do in the initial state first; given the available choices, it cannot decide where to go until it figures out how to obtain the items. Planning agent: "opens up" the representation of states, goals and operators; adds actions to the plan wherever they are needed; since most goals of the world are independent of most other parts, a divide-and-conquer strategy works.

15 Situation Calculus
Situation calculus is a version of first-order logic (FOL) that is augmented so that it can reason about actions in time. A situation is a snapshot of the world over an interval of time during which nothing changes. Add a special predicate holds(f,s) that means "f is true in situation s". Add a function result(a,s) that maps the current situation s into the new situation that results from performing action a.

16 Planning in Situation Calculus
A planning problem is represented in situation calculus by logical sentences. Initial state (for the shopping problem): At(Home,s0) ∧ ¬Have(Milk,s0) ∧ ¬Have(Bananas,s0) ∧ ¬Have(Drill,s0). Goal state, as a logical query: ∃s At(Home,s) ∧ Have(Milk,s) ∧ Have(Bananas,s) ∧ Have(Drill,s). Operators are descriptions of actions, e.g. ∀a,s Have(Milk,Result(a,s)) ⇔ [(a = Buy(Milk) ∧ At(Supermarket,s)) ∨ (Have(Milk,s) ∧ a ≠ Drop(Milk))]. ∀s Result'([],s) = s, where Result'(p,s) denotes the situation resulting from executing the action sequence p starting in s, and ∀a,p,s Result'([a|p],s) = Result'(p,Result(a,s)). p is a plan when applying it to the start state yields a situation that satisfies the goal query.

17 A solution to the shopping problem is a plan P
At(Home,Result'(P,s0)) ∧ Have(Milk,Result'(P,s0)) ∧ Have(Bananas,Result'(P,s0)) ∧ Have(Drill,Result'(P,s0)). A plan that yields a situation satisfying the goal query is P = [Go(Supermarket), Buy(Milk), Buy(Bananas), Go(HardwareStore), Buy(Drill), Go(Home)]. To make planning practical: (1) restrict the language, (2) use a special-purpose algorithm.

18 Basic Representations for planning
STRIPS (Stanford Research Institute Problem Solver): representation for states and goals. Initial state for the shopping problem: At(Home) ∧ ¬Have(Milk) ∧ ¬Have(Bananas) ∧ ¬Have(Drill) ∧ ... (an incomplete state description). Goal: At(Home) ∧ Have(Milk) ∧ Have(Bananas) ∧ Have(Drill). A goal can contain variables: At(x) ∧ Sells(x,Milk). The initial and goal states are the input to the planning system.

19 Representation for actions: STRIPS operators consist of
an action description, a precondition, and an effect/postcondition, e.g. Op(ACTION: Go(there), PRECOND: At(here) ∧ Path(here,there), EFFECT: At(there) ∧ ¬At(here)).
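One possible way to encode such an operator as a data structure, as a sketch only; the class and field names are illustrative assumptions, not part of STRIPS itself:

    from dataclasses import dataclass

    # Illustrative encoding of a STRIPS operator (slide 19).
    @dataclass(frozen=True)
    class Operator:
        action: str
        precond: frozenset   # literals that must hold before the action
        effect: frozenset    # literals made true (or, with "~", false) by the action

    go = Operator(
        action="Go(there)",
        precond=frozenset({"At(here)", "Path(here,there)"}),
        effect=frozenset({"At(there)", "~At(here)"}),
    )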

20 Situation space and plan space
Situation-space planners search through the space of situations: a progression planner searches forward, a regression planner searches backward. Plan-space planners search through the space of plans: start with a partial plan and expand it into a complete plan that solves the problem; refinement operators take a partial plan and add constraints to it. Representation for plans, e.g. "putting on a pair of shoes": goal: RightShoeOn ∧ LeftShoeOn; initial state: no literals; operators: Op(ACTION: RightShoe, PRECOND: RightSockOn, EFFECT: RightShoeOn), Op(ACTION: RightSock, EFFECT: RightSockOn), Op(ACTION: LeftShoe, PRECOND: LeftSockOn, EFFECT: LeftShoeOn), Op(ACTION: LeftSock, EFFECT: LeftSockOn).

21 A plan is defined as a data structure with
a set of plan steps; a set of step ordering constraints; a set of variable binding constraints; and a set of causal links Si --c--> Sj, read "precondition c of Sj is achieved by Si". The initial plan, before any refinements, contains only the steps Start and Finish with the ordering Start < Finish; it is refined and manipulated until a plan that is a solution is obtained.
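A sketch of this plan data structure, with illustrative field names (an assumption, not the slides' notation), showing the initial plan for the shoe problem:

    from dataclasses import dataclass, field

    # Sketch of the partial plan data structure on slide 21.
    @dataclass
    class PartialPlan:
        steps: set = field(default_factory=set)
        orderings: set = field(default_factory=set)     # pairs (before, after)
        bindings: set = field(default_factory=set)      # variable binding constraints
        causal_links: set = field(default_factory=set)  # triples (producer, condition, consumer)

    # Initial plan: only Start and Finish, ordered Start < Finish, nothing else.
    initial_plan = PartialPlan(
        steps={"Start", "Finish"},
        orderings={("Start", "Finish")},
    )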


23 Strict order plan

24 Solutions
A solution is a plan that guarantees that the agent achieves the goal; a solution is a complete and consistent plan. A complete plan: every precondition of every step is achieved by some other step. A consistent plan: no contradictions in the ordering or binding constraints (a consistent plan with no open preconditions is a solution). When we meet an inconsistent plan we backtrack and try another branch.

25 Components of a Partial Order Plan
A set of actions: making up the steps of the plan. A set of ordering constraints: one action must be executed some time before another action. A set of causal links: recording that one action achieves a precondition of another action. A set of open preconditions: preconditions not yet achieved by any action.

26 A partial-order planning example
Shopping problem: "get milk, bananas and a drill, and bring them back home". Assumptions: (1) the Go action can travel between any two locations; (2) no money is needed. Initial state, as the Start operator: Op(ACTION: Start, EFFECT: At(Home) ∧ Sells(HWS,Drill) ∧ Sells(SM,Milk) ∧ Sells(SM,Bananas)). Goal state, as the Finish operator: Op(ACTION: Finish, PRECOND: Have(Drill) ∧ Have(Milk) ∧ Have(Bananas) ∧ At(Home)). Actions: Op(ACTION: Go(there), PRECOND: At(here), EFFECT: At(there) ∧ ¬At(here)); Op(ACTION: Buy(x), PRECOND: At(store) ∧ Sells(store,x), EFFECT: Have(x)).

27 START is an action with no preconditions whose effect is the initial state
FINISH is an action whose precondition is the goal state and whose effect is nil. There are many possible ways in which the initial plan can be elaborated: one choice adds three Buy actions for the three preconditions of the Finish action; a second choice achieves the Sells precondition of each Buy. Bold arrows: causal links (protection of a precondition). Light arrows: ordering constraints. (Figure: the plan involving the supermarket and the hardware store.)

28 Causal links are used to protect the steps that were used to achieve a precondition
A causal link between Buy(Drill) and Have(Drill) was added because, if some step were to delete Have(Drill), the planner must ensure that it does not go between Buy(Drill) and Finish. There is no need to worry yet about the ordering of the remaining actions.

29 Protecting causal links: demotion (placed before), promotion (placed after)
Causal links are protected links. A causal link is protected by ensuring that threats are ordered to come before or after the protected link: demotion places the threat before the link, promotion places it after. To achieve the At condition of the Buy actions, note that the agent can't be at two places at once, so the threat must be resolved before continuing.


32 Partial-order planning algorithm

33 Planning in the Blocks World

34 Knowledge Engineering for Planning
Methodology for solving problems with the planning approach: (1) decide what to talk about; (2) decide on a vocabulary of conditions, operators and objects; (3) encode operators for the domain; (4) encode a description of the specific problem instance; (5) pose problems to the planner and get back plans.

35 Example: the blocks world
(1) What to talk about: cubic blocks sitting on a table, one block on top of another; a robot arm picks up a block and moves it to another position. (2) Vocabulary / predicates (objects: blocks and the table): ontable(x): block x is directly on the table; on(x,y): block x is directly on block y; holding(x): the robot arm is holding block x; handempty: the robot arm is empty; clear(x): block x has a clear top. (3) Operators (condition-action rules): Op(ACTION: Move(b,x,y), PRECOND: On(b,x) ∧ Clear(b) ∧ Clear(y), EFFECT: On(b,y) ∧ Clear(x) ∧ ¬On(b,x) ∧ ¬Clear(y)); Op(ACTION: MoveToTable(b,x), PRECOND: On(b,x) ∧ Clear(b), EFFECT: On(b,Table) ∧ Clear(x) ∧ ¬On(b,x)). The precondition "there is nothing on block b" can be written ¬∃x On(x,b) or ∀x ¬On(x,b).

36 START STATE: (4) encode the specific problem instance
Initial state: ontable(B) ∧ clear(B) ∧ ontable(A) ∧ on(C,A) ∧ clear(C) ∧ handempty

37 Goal state: ontable(C) ∧ on(B,C) ∧ on(A,B) ∧ clear(A) ∧ handempty

38 The classical action procedures (preconditions and effects)

39 Final triangular representation of a plan
Each operator in a new row has as its prerequisite/precondition the results of the previous row; every action has as its precondition the previous column.

40 What is a Planning Problem?
A planning problem is given by an initial state and a goal state, e.g. in the blocks world: initial state Ontable(B) ∧ Ontable(C) ∧ On(D,B) ∧ On(A,D) ∧ Clear(A) ∧ Clear(C) ∧ Handempty, and a goal configuration shown in the figure as a rearranged stack of the blocks A, B, C and D. For a transition there are certain operators available: PICKUP(x): picking up x from the table; PUTDOWN(x): putting x down on the table; STACK(x,y): putting x on y; UNSTACK(x,y): picking up x from y. In the blocks world the tasks are: formalise the operators and find a plan.

41 STRIPS Domain Descriptions
Planning problem: initial state, goal conditions, and a set of operators. Solution: a sequence of ground operator instances that produces the goal from the initial state. STRIPS assumption: literals not mentioned in an operator's effects remain unchanged (this is how STRIPS handles the frame problem).

42 The "Frame Problem"
We need to describe both what changes and what doesn't change. One of the earliest solutions to the frame problem was the STRIPS planning algorithm: a specialized planning algorithm rather than a general-purpose theorem prover, which leaves facts unchanged from one state to the next unless a planning operator explicitly changes them.

43 STRIPS Language (without negation)
A subset of first-order logic: predicate symbols (chosen for the particular domain), constant symbols (chosen for the particular domain), variable symbols, and no function symbols! Atom: an expression p(t1, ..., tn), where p is a predicate symbol and each ti is a term.

44 STRIPS Language (with negation)
Literal: either an atom p(t1, ..., tn), called a positive literal, or a negated atom ~p(t1, ..., tn), called a negative literal. A conjunction is represented either by commas or by ∧: p1(t1, ..., tn), ~p2(t1, ..., tn), p3(t1, ..., tn). For now, we won't have any disjunctions, implications, or quantifiers.

45 Representing States of the World
State: a consistent assignment of TRUE or FALSE to every literal in the universe. State description: a set of ground literals that are all taken to be TRUE, e.g. on(c,a), ontable(a), clear(c), ontable(b), clear(b), handempty (block c on block a, block b beside them on the table). The negations of these literals are taken to be false; the truth values of other ground literals are unknown. Note: in standard STRIPS, a state is restricted to contain only positive literals.

46 STRIPS Operators (with negation)
A STRIPS operator: name(v1, v2, ..., vn); Preconditions: atom1, atom2, ..., atomn; Effects: literal1, literal2, ..., literalm. Example: unstack(?x,?y); Preconditions: on(?x,?y), clear(?x), handempty; Effects: ~on(?x,?y), ~clear(?x), ~handempty, holding(?x), clear(?y). Operator instance: replacement of variables by constants.

47 STRIPS Operators: a ground instance replaces all variables by constants
unstack(c,a): Preconditions: on(c,a), clear(c), handempty; Effects: ~on(c,a), ~clear(c), ~handempty, holding(c), clear(a). If all preconditions of a ground instance O are true (i.e., they occur) in a state description S, then O is applicable to S. Applying O to S produces the successor state description Result(S,O) = (S - Del(O)) ∪ Effects(O). Example: applying unstack(c,a) to the state on(c,a), ontable(a), clear(c), ontable(b), clear(b), handempty yields ontable(a), ontable(b), clear(b), ~on(c,a), ~clear(c), ~handempty, holding(c), clear(a).

48 Example: The Blocks World
unstack(?x,?y): Pre: on(?x,?y), clear(?x), handempty; Eff: ~on(?x,?y), ~clear(?x), ~handempty, holding(?x), clear(?y). stack(?x,?y): Pre: holding(?x), clear(?y); Eff: ~holding(?x), ~clear(?y), on(?x,?y), clear(?x), handempty. pickup(?x): Pre: ontable(?x), clear(?x), handempty; Eff: ~ontable(?x), ~clear(?x), ~handempty, holding(?x). putdown(?x): Pre: holding(?x); Eff: ~holding(?x), ontable(?x), clear(?x), handempty.

49 STRIPS Operators: alternative Formulation without Negation
States contain only atoms (positive literals), e.g. on(c,a), ontable(a), clear(c), ontable(b), clear(b), handempty. STRIPS operators use a delete list instead of negated effects: name(v1, ..., vn); Pre: atom, ..., atom; Add: atom, ..., atom; Del: atom, ..., atom. Example: unstack(?x,?y); Pre: on(?x,?y), clear(?x), handempty; Del: on(?x,?y), clear(?x), handempty; Add: holding(?x), clear(?y).

50 STRIPS Operators (alternative Formulation)
If O is applicable to S, then result(S,O) = (S - Del(O)) ∪ Add(O). Example: applying unstack(c,a) (Pre: on(c,a), clear(c), handempty; Del: on(c,a), clear(c), handempty; Add: holding(c), clear(a)) to the state on(c,a), ontable(a), clear(c), ontable(b), clear(b), handempty yields ontable(a), ontable(b), clear(b), holding(c), clear(a). What is the difference to the formulation with negation?
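A small sketch of this result(S,O) computation in Python, reproducing the unstack(c,a) example above; the function and variable names are illustrative:

    # result(S, O) = (S - Del(O)) | Add(O), as on slide 50.
    def apply_operator(state, precond, del_list, add_list):
        if not precond <= state:                 # applicable only if all preconditions occur in S
            raise ValueError("operator not applicable in this state")
        return (state - del_list) | add_list

    state = {"on(c,a)", "ontable(a)", "clear(c)", "ontable(b)", "clear(b)", "handempty"}
    pre   = {"on(c,a)", "clear(c)", "handempty"}
    dele  = {"on(c,a)", "clear(c)", "handempty"}
    add   = {"holding(c)", "clear(a)"}

    print(sorted(apply_operator(state, pre, dele, add)))
    # ['clear(a)', 'clear(b)', 'holding(c)', 'ontable(a)', 'ontable(b)']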

51 Search space for breadth-first search
(Figure: the blocks-world state space for three blocks A, B and C, with state descriptions such as CLEAR(A), ONTABLE(A), HOLDING(B), ON(B,C), ... connected by pickup, putdown, stack and unstack transitions.)

52 State-Space Search: State-space planning is a search in the space of states
(Figure: the sequence of blocks-world states from the initial configuration to the goal configuration, connected by operator applications.)
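A minimal breadth-first progression planner over such add/delete-list operators, as a sketch only (it is not the search procedure drawn in the figure); the ground operator encoding follows the earlier apply_operator example:

    from collections import deque

    # Breadth-first search in the space of states (slide 52).
    # Each operator is a tuple (name, preconditions, delete list, add list) of ground literals.
    def bfs_plan(initial, goal, operators):
        start = frozenset(initial)
        frontier = deque([(start, [])])
        visited = {start}
        while frontier:
            state, plan = frontier.popleft()
            if goal <= state:                       # all goal literals hold
                return plan
            for name, pre, dele, add in operators:
                if pre <= state:                    # operator applicable
                    nxt = frozenset((state - dele) | add)
                    if nxt not in visited:
                        visited.add(nxt)
                        frontier.append((nxt, plan + [name]))
        return None

    ops = [
        ("pickup(a)", {"ontable(a)", "clear(a)", "handempty"},
         {"ontable(a)", "clear(a)", "handempty"}, {"holding(a)"}),
        ("stack(a,b)", {"holding(a)", "clear(b)"},
         {"holding(a)", "clear(b)"}, {"on(a,b)", "clear(a)", "handempty"}),
    ]
    print(bfs_plan({"ontable(a)", "ontable(b)", "clear(a)", "clear(b)", "handempty"},
                   {"on(a,b)"}, ops))              # ['pickup(a)', 'stack(a,b)']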

53 HIERARCHICAL PLANNING
5.1.3 HIERARCHICAL PLANNING. Example application: spacecraft assembly, integration and verification.

54 Hierarchical Decomposition
A solution at a high level of abstraction, e.g. [Go(Supermarket), Buy(Milk), Buy(Bananas), Go(Home)], is a long way from the instructions fed to the agent's effectors; a low-level plan looks like [Forward(1 cm), Turn(1 deg), Forward(1 cm), ...]. Hierarchical decomposition: an abstract operator can be decomposed into a group of steps, e.g. the abstract operator Build(House) decomposes into Obtain Permit, Hire Builder, Construction, Pay Builder. A primitive operator is one that can be executed directly by the agent.

55 Hierarchical planning work

56 Extending STRIPS
(1) Partition operators into primitive and nonprimitive operators. Nonprimitive: Install(FloorBoards), which is decomposed into other tasks; primitive: Hammer(Nail), which is directly executable. (2) Decomposition method Decompose(o,p): an operator o is decomposed into a plan p. Ordering constraint example: coming to college before taking the lecture. Binding constraint example: entering the class immediately before taking the lecture.

57 Decomposition of o into p
The decomposed plan p correctly implements the operator o if it is complete and consistent: 1. p must be consistent (no contradictions); 2. every effect of o must be asserted by at least one step of p; 3. every precondition of the steps in p must be achieved by a step in p or be one of the preconditions of o.

58 Analysis of Hierarchical Decomposition
Abstract solution: a plan containing abstract operators, but consistent and complete. Downward solution property: if p is an abstract solution, then there is a primitive solution. Upward solution property: if an abstract plan is inconsistent, then there is no primitive solution.

59 Decomposition and Sharing
Merge each step of the decomposition into the existing plan. Divide-and-conquer approach: solve each subproblem and then combine it with the rest. Steps can be shared while merging, e.g. for "clear semester exams and get a degree": (1) decomposition: "get admission and clear semester exams" and "get admission and get a degree"; (2) merge: share the step "get admission".

60 Decomposition and approximation
Hierarchical decomposition: a nonprimitive operator is reduced to primitives. Hierarchical planning with an approximation hierarchy (abstraction hierarchy): take an operator and partition its preconditions according to their criticality level, e.g. Op(ACTION: Buy(x), EFFECT: Have(x) ∧ ¬Have(Money), PRECOND: 1:Sells(store,x) ∧ At(store) ∧ Have(Money)), where each precondition carries a criticality level.

61 5.1.4 CONDITIONAL PLANNING

62 Conditional Planning
A response to incomplete information (an inaccessible environment) and incorrect information (the real world does not match the agent's model). Contingency/conditional planning: plan with incomplete information by constructing a conditional plan that accounts for each possible situation, e.g. a shopping agent includes a sensing action in its shopping plan to check the price of some object in case it is expensive. Sensing actions / execution monitoring: the agent includes sensing actions to find out which part of the plan should be executed, and monitors execution to detect when things go wrong, e.g. if the agent discovers that it does not have enough money to pay for all the items it has picked, it can return some and replace them with cheaper ones.

63 e.g. "Fixing a flat tire"
(1) Possible operators: Op(ACTION: Remove(x), PRECOND: On(x), EFFECT: Off(x) ∧ ClearHub(x) ∧ ¬On(x)); Op(ACTION: PutOn(x), PRECOND: Off(x) ∧ ClearHub(x), EFFECT: On(x) ∧ ¬ClearHub(x) ∧ ¬Off(x)); Op(ACTION: Inflate(x), PRECOND: Intact(x) ∧ Flat(x), EFFECT: Inflated(x) ∧ ¬Flat(x)). (2) Goal: On(x) ∧ Inflated(x). (3) Initial conditions: Inflated(Spare) ∧ Intact(Spare) ∧ Off(Spare) ∧ On(Tire1) ∧ Flat(Tire1). (4) Initial plan: [Remove(Tire1), PutOn(Spare)].

64 Conditional steps: If(<Condition>, <ThenPart>, <ElsePart>)
The initial plan is good if Tire1 is not intact; but if Tire1 is intact, only inflation is needed. Conditional step: If(Intact(Tire1), [Inflate(Tire1)], [Remove(Tire1), PutOn(Spare)]). Sensing action, in our action schema format: Op(ACTION: CheckTire(x), PRECOND: Tire(x), EFFECT: KnowsWhether("Intact(x)")).
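A tiny sketch of how an executor might interpret such a conditional step at run time; the sense and do callbacks are assumptions standing in for the sensing action (here, CheckTire) and the agent's effectors:

    # Illustrative interpreter for conditional plans (slide 64).
    def execute(plan, sense, do):
        for step in plan:
            if isinstance(step, tuple) and step[0] == "If":
                _, condition, then_part, else_part = step
                execute(then_part if sense(condition) else else_part, sense, do)
            else:
                do(step)                      # ordinary (primitive) step

    tire_plan = [("If", "Intact(Tire1)",
                  ["Inflate(Tire1)"],
                  ["Remove(Tire1)", "PutOn(Spare)"])]

    # With a sensor reporting that Tire1 is intact, only the inflation branch runs:
    execute(tire_plan, sense=lambda condition: True, do=print)   # prints Inflate(Tire1)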


66 Two open conditions to be resolved
On(x) and Inflated(x). Introduce the operator Inflate(Tire1), whose preconditions are Flat(Tire1) and Intact(Tire1).

67 Precondition: Intact(Tire1) ?
There is no action that can make it satisfied, but the action CheckTire(x) allows us to know the truth value of the precondition, giving a conditional step based on a sensing action. We add the CheckTire step to the plan with a conditional link (dotted arrow).

68 We add steps for the case where Tire1 is not intact: another Finish action

69 If we add Inflate(Tire1) to the new Finish step, its precondition Intact(Tire1) is inconsistent with the context ¬Intact(Tire1). Therefore, we link the Start step to the Inflated condition instead.

70 We add Remove(Tire1), PutOn(Spare) to satisfy the condition On(Spare)
In the example, CheckTire can yield ¬Intact(Tire1); if we add a conditional link from CheckTire to Remove(Tire1) with that context, then Remove is no longer a threat.


72 University Questions
Compare and contrast the problem-solving agent and the planning agent. Explain partial order planning with the help of the "spare tyre problem" example: changing the flat tyre with the spare one, where the goal is to have a good spare tyre properly mounted on the car axle, whereas initially a flat tyre is on the axle and a good spare tyre is in the trunk. Plan the problem of putting on a pair of shoes.

73 5.2 LEARNING

74 Learning
Learning is essential for unknown environments, i.e., when the designer lacks omniscience (does not know everything in advance). Learning is useful as a system construction method, i.e., expose the agent to reality rather than trying to write everything down. Learning modifies the agent's decision mechanisms to improve performance.

75 Learning agents

76 Learning element
Responsible for improving the performance element, using knowledge about the performance element and feedback from the critic on how the agent is doing. It then determines how the performance element should be modified to do better in the future.

77 Performance element
Consists of a collection of knowledge and procedures. Responsible for selecting external actions, taking percepts from the environment through sensors and using its present knowledge base and procedures. The agent carries out the selected actions on the environment using its effectors.

78 Critic
Tells the learning element how well the agent is doing in the environment. For this the critic employs a fixed standard of performance to determine how successful the agent is. The performance standard must be fixed outside the agent (otherwise a human agent, when unsuccessful at something, could simply declare that it didn't really want it).

79 Problem Generator
Suggests exploratory, possibly suboptimal, actions to the agent, enabling it to explore and gather new and informative experiences. By taking suboptimal actions in the short run, the agent might discover much better actions for the long run.

80 Example: Automated Taxi
Performance element: whatever collection of knowledge and procedures the taxi has for selecting its driving actions, i.e. turning, accelerating, braking, honking, etc. Learning element: formulates goals and learns better rules, e.g. rules describing the effects of braking and accelerating, the geography of the area, and how drivers behave in different road conditions; it also improves the efficiency of the performance element, e.g. when asked to make a trip to a new location, the taxi driver may take some time to consult its map and plan the best route, but the next time a similar trip will be faster. Critic: observes the world and passes information to the learning element, e.g. after the taxi makes a quick left turn across three lanes of traffic, the critic observes the language used by other drivers; the learning element formulates a rule saying it was a bad action, and the performance element is modified by installing the new rule. Problem generator: gives suggestions to try out, e.g. which route is faster and better.

81 Issues affecting design of learning element
Which components of the performance element are to be improved? What representation is used for those components? What feedback is available? What prior information is available?

82 Components of the performance element

83 Representation of components

84 Available feedback

85 Prior knowledge

86 Inductive learning Simplest form: learn a function from examples
f is the target function; an example is a pair (x, f(x)). Problem: find a hypothesis h such that h ≈ f, given a training set of examples. (This is a highly simplified model of real learning: it ignores prior knowledge and assumes the examples are given.)

87 Inductive learning method
Construct/adjust h to agree with f on the training set (h is consistent if it agrees with f on all examples), e.g. curve fitting.

92 Inductive learning method
Construct/adjust h to agree with f on the training set (h is consistent if it agrees with f on all examples), e.g. curve fitting. Ockham's razor: prefer the simplest hypothesis consistent with the data; in explaining a thing, no more assumptions should be made than are necessary.
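A small curve-fitting illustration of this preference, as a sketch only; the data points are made up for the example and numpy's polyfit is used as the fitting routine:

    import numpy as np

    # Several polynomial hypotheses can agree with the same training points
    # (slides 87-92); Ockham's razor prefers the simplest consistent one.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = 2.0 * x + 1.0                          # target function f(x) = 2x + 1

    for degree in (1, 3, 4):
        h = np.polyfit(x, y, degree)           # hypothesis of the given degree
        err = float(np.max(np.abs(np.polyval(h, x) - y)))
        print(degree, np.round(h, 3), "max training error:", round(err, 6))
    # Degrees 1, 3 and 4 all fit the five points (error ~ 0); degree 1 is preferred.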

93 Learning decision trees
Problem: decide whether to wait for a table at a restaurant, based on the following attributes: Alternate: is there an alternative restaurant nearby? Bar: is there a comfortable bar area to wait in? Fri/Sat: is today Friday or Saturday? Hungry: are we hungry? Patrons: number of people in the restaurant (None, Some, Full) Price: price range ($, $$, $$$) Raining: is it raining outside? Reservation: have we made a reservation? Type: kind of restaurant (French, Italian, Thai, Burger) Wait Estimate: estimated waiting time (0-10, 10-30, 30-60, >60)

94 Attribute-based representations
Examples are described by attribute values (Boolean, discrete, continuous), e.g. situations where I will/won't wait for a table. The classification of each example is positive (T) or negative (F).

95 Decision trees One possible representation for hypotheses
E.g., here is the "true" tree for deciding whether to wait. Hypothesis: a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation.

96 Expressiveness
Decision trees can express any function of the input attributes; e.g., for Boolean functions, each truth table row maps to a path to a leaf. Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example, but we prefer to find more compact decision trees.

97 Example: Play Tennis?

98 Probabilities of Individual Attributes
Predict whether to play tennis when the new instance is <sunny, cool, high, strong>. What probability should be used to make the prediction, and how do we compute it? Given the training set, we can compute the prior probabilities P(+) = 9/14 and P(−) = 5/14.

99 Decision Tree for Playing Tennis (3 attributes)
Language is propositional

100 Decision Tree
(O=Sunny AND H=Normal) OR (O=Overcast) OR (O=Rain AND W=Weak) then YES: "a disjunction of conjunctions of constraints on attribute values". This is a larger hypothesis space than Candidate-Elimination's.

101 Decision tree representation
A decision tree represents a Boolean function (each row in the truth table corresponds to a path). Input: an object/situation; output: a decision. Each internal node corresponds to a test, each branch corresponds to a result of the test, and each leaf node assigns a classification. Once the tree is trained, a new instance is classified by starting at the root and following the path dictated by the test results for this instance.
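A small sketch of that classification procedure, using the learned tennis tree from slides 99-100; the nested-tuple encoding of the tree is just one possible representation, not the slides' notation:

    # Classify by walking from the root: test the attribute at each internal
    # node, follow the branch for the instance's value, stop at a leaf.
    tennis_tree = ("Outlook", {
        "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
        "Overcast": "Yes",
        "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
    })

    def classify(node, instance):
        while isinstance(node, tuple):          # internal node: (attribute, branches)
            attribute, branches = node
            node = branches[instance[attribute]]
        return node                             # leaf: the classification

    print(classify(tennis_tree, {"Outlook": "Sunny", "Humidity": "Normal", "Wind": "Weak"}))  # Yes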

102 Assessing the performance of a learning algorithm
A learning algorithm is good if the hypotheses it produces from the training set of examples predict the classification of unseen examples well. Predictions are checked for correct classification on a test set. Use training sets of different sizes, with examples selected randomly.

103 Tree uses nodes and leaves

104 Multivariate Trees

105 Hypothesis spaces
How many distinct decision trees are there with n Boolean attributes? = the number of Boolean functions = the number of distinct truth tables with 2^n rows = 2^(2^n). E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees.

106 Hypothesis spaces
How many distinct decision trees are there with n Boolean attributes? = the number of Boolean functions = the number of distinct truth tables with 2^n rows = 2^(2^n); e.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees. How many purely conjunctive hypotheses are there (e.g., Hungry ∧ ¬Rain)? Each attribute can be in (positive), in (negative), or out, giving 3^n distinct conjunctive hypotheses. A more expressive hypothesis space increases the chance that the target function can be expressed, but also increases the number of hypotheses consistent with the training set, so it may give worse predictions.
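A two-line arithmetic check of these counts for n = 6, just to make the numbers concrete:

    # Counting hypotheses over n Boolean attributes (slides 105-106).
    n = 6
    print(2 ** (2 ** n))   # distinct Boolean functions / decision trees: 18446744073709551616
    print(3 ** n)          # purely conjunctive hypotheses (in, negated, or out): 729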

107 Decision tree learning
Aim: find a small tree consistent with the training examples. Idea: (recursively) choose the "most significant" attribute as the root of each (sub)tree.
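A sketch of that recursive idea; the examples are assumed to be dicts of attribute values with a 'class' key, and choose_attribute is a caller-supplied criterion (e.g. the information gain defined on the next slides). These names are illustrative, not the slides' notation:

    from collections import Counter

    # Recursive decision-tree learning (slide 107), returning either a leaf
    # label or a node of the form (attribute, {value: subtree, ...}).
    def learn_tree(examples, attributes, choose_attribute, default="No"):
        if not examples:
            return default
        classes = [e["class"] for e in examples]
        if len(set(classes)) == 1:                       # all examples agree: leaf
            return classes[0]
        majority = Counter(classes).most_common(1)[0][0]
        if not attributes:                               # no attribute left to test
            return majority
        best = choose_attribute(attributes, examples)    # the "most significant" attribute
        branches = {}
        for value in {e[best] for e in examples}:
            subset = [e for e in examples if e[best] == value]
            remaining = [a for a in attributes if a != best]
            branches[value] = learn_tree(subset, remaining, choose_attribute, majority)
        return (best, branches)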

108 Choosing an attribute
Idea: a good attribute splits the examples into subsets that are (ideally) "all positive" or "all negative". Patrons? is a better choice.

109 Using information theory
To implement Choose-Attribute in the DTL algorithm. Information content (entropy): I(P(v1), ..., P(vn)) = Σi -P(vi) log2 P(vi). For a training set containing p positive examples and n negative examples: I(p/(p+n), n/(p+n)) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n)).
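This is easy to compute directly; a minimal sketch (the function names are mine, not the slides'):

    from math import log2

    # Information content I(P(v1), ..., P(vn)) = -sum P(vi) log2 P(vi), slide 109.
    def entropy(probabilities):
        return -sum(p * log2(p) for p in probabilities if p > 0)

    # For a set with p positive and n negative examples:
    def entropy_pn(p, n):
        return entropy([p / (p + n), n / (p + n)])

    print(entropy_pn(6, 6))   # 1.0 bit for the restaurant training set, where p = n = 6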

110 Information gain
A chosen attribute A divides the training set E into subsets E1, ..., Ev according to their values for A, where A has v distinct values. Information gain (IG), or reduction in entropy, from the attribute test: Gain(A) = I(p/(p+n), n/(p+n)) - Σi=1..v (pi+ni)/(p+n) · I(pi/(pi+ni), ni/(pi+ni)). Choose the attribute with the largest IG.

111 Example: Restaurant
For the training set, p = n = 6, so I(6/12, 6/12) = 1 bit. Consider the attributes Patrons and Type (and the others too): Patrons has the highest IG of all the attributes and so is chosen by the DTL algorithm as the root.
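A sketch of the gain computation behind this comparison; the per-value positive/negative counts used below (Patrons: None 0/2, Some 4/0, Full 2/4; Type: 1/1, 1/1, 2/2, 2/2) are the standard ones for the 12-example restaurant data, an assumption here since the example table itself is not reproduced in these slides:

    from math import log2

    def entropy_pn(p, n):
        # I(p/(p+n), n/(p+n)) as defined on slide 109
        probs = [p / (p + n), n / (p + n)]
        return -sum(q * log2(q) for q in probs if q > 0)

    def gain(p, n, splits):
        # IG = entropy of the whole set minus the weighted entropy after the split (slide 110)
        remainder = sum((pi + ni) / (p + n) * entropy_pn(pi, ni) for pi, ni in splits)
        return entropy_pn(p, n) - remainder

    print(round(gain(6, 6, [(0, 2), (4, 0), (2, 4)]), 3))          # Patrons: 0.541 bits
    print(round(gain(6, 6, [(1, 1), (1, 1), (2, 2), (2, 2)]), 3))  # Type: 0.0 bits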

112 Example: Tennis

113 Partially learned tree

114 Example contd. Decision tree learned from the 12 examples:
Substantially simpler than the "true" tree: a more complex hypothesis isn't justified by such a small amount of data.

115 Learning: Supervised and Unsupervised
Supervised learning: labeled examples (input, desired output), e.g. recognizing hand-written digits, pattern recognition, regression (the output of the function can be a continuous value). Neural network models: perceptron, feed-forward networks, radial basis function networks, support vector machines. Unsupervised learning: unlabeled examples (different realizations of the input alone), e.g. finding similar groups of documents on the web, content-addressable memory, clustering. Neural network models: self-organizing maps, Hopfield networks.

116 Reinforcement Learning

117 University Questions
What are the basic building blocks of a learning agent? Explain each of them with a neat block diagram. What is inductive learning? Explain decision trees with an example. Explain supervised, unsupervised and reinforcement learning with examples.

118 University Questions
Construct a decision tree for the following set of samples, write any two decision rules obtained from the tree, and classify a new sample with (gender="female", height="1.92m"). Person ID, Gender, Height, Class: 1 Female 1.6m Short; 2 Male 2m Tall; 3 1.9m Medium; 4 2.1m; 5 1.7m; 6 1.85m; 7; 8; 9 2.2m

