Announcements  Project 1: Search  Due Wednesday 9/24  Solo or in group of two. For group of two: both of you need to submit your code into edX!  Note: don’t expect IDS with vanilla graph DFS to be optimal  Homework 2: Heuristics and Local Search  Part I AND Part II due tomorrow  Part I through edX: online, instant grading, submit as often as you like.  Part II through www.pandagrader.com: submit a PDF  Homework 3: Search and games  Has been released! Due Monday 9/22  No office hours today, sorry

CS 188: Artificial Intelligence Local search and agents Instructor: Stuart Russell University of California, Berkeley

Local search algorithms  In many optimization problems, path is irrelevant; the goal state is the solution  Then state space = set of “complete” configurations; find configuration satisfying constraints, e.g., n-queens problem; or, find optimal configuration, e.g., travelling salesperson problem  In such cases, can use iterative improvement algorithms; keep a single “current” state, try to improve it  Constant space, suitable for online as well as offline search

Heuristic for n-queens problem  Goal: n queens on board with no conflicts, i.e., no queen attacking another  States: n queens on board, one per column  Heuristic value function: number of conflicts
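The heuristic above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the tuple encoding (`state[c]` is the row of the queen in column `c`) and the function name `conflicts` are assumptions made here, and each attacking pair is counted once.

```python
from itertools import combinations


def conflicts(state):
    """Count attacking pairs of queens.

    Assumed encoding: state[c] is the row of the queen in column c, so the
    one-queen-per-column constraint holds by construction.
    """
    return sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(state), 2)
        # Same row, or same diagonal (row distance equals column distance).
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)
    )
```

With this encoding a goal state is simply one where `conflicts(state) == 0`; for example `(1, 3, 0, 2)` is a 4-queens solution.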

Hill-climbing algorithm
function HILL-CLIMBING(problem) returns a state
    current ← make-node(problem.initial-state)
    loop do
        neighbor ← a highest-valued successor of current
        if neighbor.value ≤ current.value then return current.state
        current ← neighbor
“Like climbing Everest in thick fog with amnesia”
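A rough Python rendering of the pseudocode follows. It is a sketch under assumptions: the problem is passed as two callables, `successors` and `value` (names chosen here, not taken from the slides), and the toy landscape is invented purely to show the algorithm stopping at a local maximum.

```python
def hill_climbing(initial, successors, value):
    """Steepest-ascent hill climbing; stops when no successor is strictly better."""
    current = initial
    while True:
        candidates = list(successors(current))
        if not candidates:
            return current
        neighbor = max(candidates, key=value)
        if value(neighbor) <= value(current):
            return current
        current = neighbor


# Hypothetical 1-D landscape: index 1 is a local maximum, index 4 the global one.
LANDSCAPE = [1, 3, 2, 5, 9]


def toy_successors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]


def toy_value(i):
    return LANDSCAPE[i]


stuck = hill_climbing(0, toy_successors, toy_value)    # stops at index 1
summit = hill_climbing(3, toy_successors, toy_value)   # reaches index 4
```

Starting from index 0 the search halts at the local maximum; starting from index 3 it finds the global one. That dependence on the start state is what motivates random restarts.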

Global and local maxima  Random restarts:  Find the global optimum eventually  Duh  Random sideways moves:  Escape from shoulders  But loop forever on flat local maxima

Hill-climbing on the 8-queens problem  No sideways moves:  Succeeds w/ prob. 0.14  Average number of moves per trial:  4 when succeeding, 3 when getting stuck  Expected total number of moves needed:  3(1-p)/p + 4 ≈ 22 moves  Allowing 100 sideways moves:  Succeeds w/ prob. 0.94  Average number of moves per trial:  21 when succeeding, 65 when getting stuck  Expected total number of moves needed:  65(1-p)/p + 21 ≈ 25 moves  Moral: algorithms with knobs to twiddle are irritating
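The expected-move figures follow from treating the restarts as independent trials: the number of failed trials before the first success is geometric with mean (1-p)/p, each failure costs the "stuck" move count, and the final trial costs the "success" move count. A small check, with a function name chosen here for illustration:

```python
def expected_total_moves(p, moves_on_success, moves_on_failure):
    """Expected moves for random-restart hill climbing.

    Assumes independent restarts that each succeed with probability p, so the
    expected number of failed trials before the first success is (1 - p) / p.
    """
    return moves_on_failure * (1 - p) / p + moves_on_success


no_sideways = expected_total_moves(0.14, 4, 3)      # about 22.4
with_sideways = expected_total_moves(0.94, 21, 65)  # about 25.1
```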

Simulated annealing  Resembles the annealing process used to cool metals slowly to reach an ordered (low-energy) state  Basic idea:  Allow “bad” moves occasionally, depending on “temperature”  High temperature => more bad moves allowed, shake the system out of its local minimum  Gradually reduce temperature according to some schedule  Sounds pretty flaky, doesn’t it?  Theorem: simulated annealing finds the global optimum with probability 1 for a slow enough cooling schedule

Simulated annealing algorithm
function SIMULATED-ANNEALING(problem, schedule) returns a state
    current ← make-node(problem.initial-state)
    for t = 1 to ∞ do
        T ← schedule(t)
        if T = 0 then return current
        next ← a randomly selected successor of current
        ∆E ← next.value – current.value
        if ∆E > 0 then current ← next
        else current ← next only with probability e^(∆E/T)
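One way to write this in Python is shown below. It is a sketch, not a tuned implementation: the callables `random_successor`, `value` and `schedule`, and the injectable `rng`, are interface choices made here. The test `T <= 0` is used instead of `T = 0` only to be safe with floating-point schedules.

```python
import math
import random


def simulated_annealing(initial, random_successor, value, schedule, rng=random):
    """Accept uphill moves always, downhill moves with probability exp(dE / T)."""
    current = initial
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random_successor(current, rng)
        delta_e = value(nxt) - value(current)
        # delta_e <= 0 in the second clause, so exp(...) lies in (0, 1].
        if delta_e > 0 or rng.random() < math.exp(delta_e / T):
            current = nxt
        t += 1


def linear_schedule(t, t_max=1000, t0=1.0):
    """Hypothetical cooling schedule: temperature falls linearly to 0 at t_max."""
    return max(0.0, t0 * (1 - t / t_max))
```

At high T the acceptance probability for a bad move is close to 1; as T approaches 0 it collapses towards 0, so the search gradually turns into plain hill climbing.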

Local beam search  Basic idea:  K copies of a local search algorithm, initialized randomly  For each iteration  Generate ALL successors from K current states  Choose best K of these to be the new current states  Or, K chosen randomly with a bias towards good ones  Why is this different from K local searches in parallel?  The searches communicate! “Come over here, the grass is greener!”  What other well-known algorithm does this remind you of?  Evolution!
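The deterministic "best K" variant might look like the following. The stopping rule (quit when the best successor does not beat the best current state) and the `max_iters` cap are assumptions added here, since the slide does not specify when to stop; states are assumed hashable so duplicate successors can be merged.

```python
import heapq


def local_beam_search(initial_states, successors, value, k, max_iters=100):
    """Keep the k best successors drawn from the pooled successors of all current states."""
    current = list(initial_states)
    for _ in range(max_iters):
        # Pool successors of ALL current states; this is where the searches "communicate".
        pool = {s for state in current for s in successors(state)}
        if not pool:
            break
        best = heapq.nlargest(k, pool, key=value)
        if max(map(value, best)) <= max(map(value, current)):
            break  # assumed stopping rule: no improvement over the current beam
        current = best
    return max(current, key=value)


# Hypothetical toy problem: maximise -(x - 7)^2 over the integers.
beam_result = local_beam_search(
    [0, 20], lambda x: [x - 1, x + 1], lambda x: -(x - 7) ** 2, k=2
)
```

In the toy run the beam abandons the start at 20 after one iteration because both of its successors score worse than those of 0. K independent searches would never do that.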

Searching in the real world  Nondeterminism: actions have unpredictable effects  Modified problem formulation to allow multiple outcomes  Solutions are now contingency plans  New algorithm to find them: AND-OR search  May need plans with loops!  Partial observability: percept is not the whole state  New concept: belief state = set of states agent could be in  Modified formulation for search in belief state space; add observation model  Simple and general agent design  Nondeterminism and partial observability

The erratic vacuum world  If square is dirty, Suck sometimes cleans up dirt in adjacent square as well  E.g., state 1 could go to 5 or 7  If square is clean, Suck may dump dirt on it by accident  E.g., state 4 could go to 4 or 2

Problem formulation  Results(s,a) returns a set of states  Results(1,Suck) = {5,7}  Results(4,Suck) = {2,4}  Results(1,Right) = {2}  Everything else is the same as before
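To make Results(s,a) concrete, here is a sketch of the erratic vacuum world's transition model. The state numbering is an assumption chosen to agree with the three examples on the slide (odd states have the agent in the left square A, even states in the right square B); the tuple encoding `(location, A dirty?, B dirty?)` is likewise just a convenience.

```python
# Assumed numbering, consistent with Results(1,Suck) = {5,7} etc. on the slide.
STATES = {
    1: ("A", True, True),   2: ("B", True, True),
    3: ("A", True, False),  4: ("B", True, False),
    5: ("A", False, True),  6: ("B", False, True),
    7: ("A", False, False), 8: ("B", False, False),
}
NUMBER = {state: n for n, state in STATES.items()}


def results(s, action):
    """Set of states that may follow from doing `action` in state number `s`."""
    loc, dirt_a, dirt_b = STATES[s]
    if action == "Left":
        outcomes = {("A", dirt_a, dirt_b)}
    elif action == "Right":
        outcomes = {("B", dirt_a, dirt_b)}
    elif action == "Suck":
        here_dirty = dirt_a if loc == "A" else dirt_b
        if here_dirty:
            # Cleans this square, and sometimes the adjacent one as well.
            own_only = ("A", False, dirt_b) if loc == "A" else ("B", dirt_a, False)
            outcomes = {own_only, (loc, False, False)}
        else:
            # On a clean square, Suck may dump dirt by accident.
            dumped = ("A", True, dirt_b) if loc == "A" else ("B", dirt_a, True)
            outcomes = {(loc, dirt_a, dirt_b), dumped}
    else:
        raise ValueError(f"unknown action: {action}")
    return {NUMBER[o] for o in outcomes}
```

The only change from the deterministic formulation is the return type: a set of successor states instead of a single state.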

Contingent solutions  From state 1, does [Suck] solve the problem?  Not necessarily!  What about [Suck,Right,Suck]?  Not necessarily!  [Suck; if state=5 then [Right,Suck] else []]  This is a contingent solution (a.k.a. a branching or conditional plan)  Great! So, how do we find such solutions?

AND-OR search trees  OR-node:  Agent chooses action;  At least one branch must be solved  AND-node:  Nature chooses outcome;  All branches must be solved

AND-OR search made easy AND-OR search: call OR-Search on the root node OR-search(node): succeeds if AND-search succeeds on the outcome set for any action AND-search(set of nodes): succeeds if OR-search succeeds on ALL nodes in the set

AND-OR search
function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan, or failure
    return OR-SEARCH(problem.initial-state, problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan, or failure
    if problem.goal-test(state) then return the empty plan
    if state is on path then return failure
    for each action in problem.actions(state) do
        plan ← AND-SEARCH(results(state, action), problem, [state | path])
        if plan ≠ failure then return [action | plan]
    return failure

AND-OR search contd.
function AND-SEARCH(states, problem, path) returns a conditional plan, or failure
    for each s_i in states do
        plan_i ← OR-SEARCH(s_i, problem, path)
        if plan_i = failure then return failure
    return [if s_1 then plan_1 else if s_2 then plan_2 else ... if s_(n−1) then plan_(n−1) else plan_n]
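The two mutually recursive functions translate almost line for line into Python. In this sketch the plan representation is an assumption: the empty plan is `[]`, failure is `None`, and a non-empty plan is `[action, {outcome_state: subplan, ...}]`, with the dictionary standing in for the if-then-else chain. The small `TABLE` is a fragment of the erratic vacuum world under the standard state numbering, with goal states 7 and 8.

```python
def and_or_graph_search(initial, goal_test, actions, results):
    return or_search(initial, goal_test, actions, results, [])


def or_search(state, goal_test, actions, results, path):
    """Agent's choice: succeed if SOME action leads to a solvable outcome set."""
    if goal_test(state):
        return []
    if state in path:
        return None  # repeated state on this path: give up on this branch
    for action in actions(state):
        plan = and_search(results(state, action), goal_test, actions, results,
                          [state] + path)
        if plan is not None:
            return [action, plan]
    return None


def and_search(states, goal_test, actions, results, path):
    """Nature's choice: succeed only if EVERY possible outcome is solvable."""
    branches = {}
    for s in states:
        plan = or_search(s, goal_test, actions, results, path)
        if plan is None:
            return None
        branches[s] = plan
    return branches


# Fragment of the erratic vacuum world (assumed numbering; only the entries needed here).
TABLE = {1: {"Suck": {5, 7}}, 5: {"Right": {6}}, 6: {"Suck": {8}}}

plan = and_or_graph_search(
    1,
    goal_test=lambda s: s in {7, 8},
    actions=lambda s: list(TABLE.get(s, {})),
    results=lambda s, a: TABLE[s][a],
)
```

The resulting plan reads as "Suck; if the state is 5 then Right, Suck; if it is 7 then do nothing", which is the contingent solution from the earlier slide.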

Slippery vacuum world  Sometimes movement fails  There is no guaranteed contingent solution!  There is a cyclic solution:  [Suck, L1 : Right, if State = 5 then L1 else Suck]  Here L1 is a label  Modify AND-OR-GRAPH-SEARCH to add a label when it finds a repeated state, try adding a branch to the label instead of failure  A cyclic plan is a cyclic solution if  Every leaf is a goal state  From every point in the plan there is a path to a leaf
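Executing the cyclic plan amounts to a loop around the unreliable action. The sketch below hard-codes the plan from the slide, starting in state 1 and using the standard state numbering as an assumption; the outcomes of each Right attempt are supplied by the caller so that the run is reproducible.

```python
def run_cyclic_plan(right_succeeds):
    """Execute [Suck, L1: Right, if State = 5 then L1 else Suck] from state 1.

    `right_succeeds` is an iterator of booleans, one per attempted Right
    (a stand-in for the environment's nondeterminism).
    """
    trace = [1]
    state = 5                 # Suck in state 1 cleans the left square
    trace.append(state)
    while True:               # label L1
        state = 6 if next(right_succeeds) else 5   # Right may slip and leave us in 5
        trace.append(state)
        if state != 5:
            break             # otherwise jump back to L1
    state = 8                 # Suck in state 6 cleans the right square
    trace.append(state)
    return trace


trace = run_cyclic_plan(iter([False, False, True]))
```

The plan reaches the goal as long as Right eventually succeeds, which is exactly the "from every point there is a path to a leaf" condition.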

What does nondeterminism really mean?  Example: your hotel key card doesn’t open the door of your room  Explanation 1: you didn’t put it in quite right  This is nondeterminism; keep trying  Explanation 2: something wrong with the key  This is partial observability; get a new key  Explanation 3: it isn’t your room  This is embarrassing; get a new brain  A nondeterministic model is appropriate when outcomes don’t depend deterministically on some hidden state

Partial observability

Extreme partial observability: Sensorless worlds  Vacuum world with known geometry, dirt, but no sensors at all!  Belief state: set of all environment states the agent could be in  More generally, what the agent knows given all percepts to date  [Figure labels: Right, Suck, Left, Suck]

Sensorless problem formulation  Underlying “physical” problem has Actions_P, Result_P, Goal-Test_P, and Step-Cost_P.  Initial state: a belief state b (set of physical states s)  N physical states => 2^N belief states  Goal test: every element s in b satisfies Goal-Test_P(s)  Actions: union of Actions_P(s) for each s in b  This is OK if doing an “illegal” action has no effect  Transition model:  Deterministic: Result(b,a) = union of Result_P(s,a) for each s in b  Nondeterministic: Result(b,a) = union of Results_P(s,a) for each s in b  Step-Cost(b,a,b’) = Step-Cost_P(s,a,s’) for any s in b
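A short sketch of the belief-state transition for the deterministic, sensorless vacuum world is given below. The tuple encoding `(location, A dirty?, B dirty?)` and the function names are assumptions made here; `results_p` returns a one-element set so that the same `predict` also covers the nondeterministic case.

```python
from itertools import product


def results_p(state, action):
    """Physical transition model of the deterministic two-square vacuum world."""
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return {("A", dirt_a, dirt_b)}
    if action == "Right":
        return {("B", dirt_a, dirt_b)}
    if action == "Suck":
        return {("A", False, dirt_b)} if loc == "A" else {("B", dirt_a, False)}
    raise ValueError(f"unknown action: {action}")


def predict(belief, action):
    """Result(b, a): union of the physical outcomes over every s in b."""
    return frozenset(s2 for s in belief for s2 in results_p(s, action))


def goal_test_p(state):
    return not state[1] and not state[2]


# Start knowing nothing: all 8 physical states are possible.
belief = frozenset(product("AB", [True, False], [True, False]))
sizes = [len(belief)]
for action in ["Right", "Suck", "Left", "Suck"]:
    belief = predict(belief, action)
    sizes.append(len(belief))
```

The belief state shrinks from 8 states to 4, 2, 2 and finally 1, and that last state is a goal: the sequence Right, Suck, Left, Suck solves the problem without a single percept.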

Search in sensorless belief state space  Everything works exactly as before!  Solutions are still action sequences!  Some opportunities for improvement:  If any s in b is unsolvable, b is unsolvable  If b’ is a superset of b and b is in tree, discard b’  If b’ is a superset of b and b’ has a solution, b has same solution

What use are sensorless problems?  They correspond to many real-world “robotic” manipulation problems  A “part orientation conveyor” consists of a sequence of slanted guides that orient the part correctly no matter what its initial orientation  It’s a lot cheaper and more reliable than using a camera and robot arm!

Partial observability: formulation  Partially observable problem formulation has to say what the agent can observe:  Deterministic: Percept(s) is the percept received in physical state s  Nondeterministic: Percepts(s) is the set of possible percepts received in s  Fully observable: Percept(s) = s  Sensorless: Percept(s) = null  Local sensing vacuum world  Percept(s1) = [A,Dirty]  Percept(s3) = [A,Dirty]

Partial observability: belief state transition model  b’ = Predict(b,a) updates the belief state just for the action  Identical to transition model for sensorless problems  Possible-Percepts(b’) is the set of percepts that could come next  Union of Percept(s) for every s in b’  Update(b’,p) is the new belief state if percept p is received  Just the states s in b’ for which p = Percept(s)  Results(b,a) contains  Update(Predict(b,a),p) for each p in Possible-Percepts(Predict(b,a))

Example: Results(b0,Right)  [Figure: b0, then b’ = Predict(b0,Right), then Possible-Percepts(b’), then Update(b’,[B,Dirty]) and Update(b’,[B,Clean])]
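The three stages of this example can be traced in code. The sketch assumes the deterministic local-sensing vacuum world, the tuple encoding `(location, A dirty?, B dirty?)`, and an initial belief b0 in which the agent is in A with A dirty and B unknown; the function names mirror the slides but the implementation is illustrative only.

```python
def results_p(state, action):
    """Deterministic physical transition model (two-square vacuum world)."""
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return {("A", dirt_a, dirt_b)}
    if action == "Right":
        return {("B", dirt_a, dirt_b)}
    if action == "Suck":
        return {("A", False, dirt_b)} if loc == "A" else {("B", dirt_a, False)}
    raise ValueError(f"unknown action: {action}")


def percept(state):
    """Local sensing: the agent sees its location and the dirt in that square only."""
    loc, dirt_a, dirt_b = state
    dirty = dirt_a if loc == "A" else dirt_b
    return (loc, "Dirty" if dirty else "Clean")


def predict(b, a):
    return frozenset(s2 for s in b for s2 in results_p(s, a))


def possible_percepts(b):
    return {percept(s) for s in b}


def update(b, p):
    return frozenset(s for s in b if percept(s) == p)


def results(b, a):
    """Results(b, a): one successor belief state per percept that could arrive."""
    b_pred = predict(b, a)
    return {p: update(b_pred, p) for p in possible_percepts(b_pred)}


b0 = frozenset({("A", True, True), ("A", True, False)})
outcome = results(b0, "Right")
```

Although the physical model is deterministic, `results` returns two belief states: the agent cannot know in advance which percept it will receive. The same two calls, `update(predict(b, a), p)`, are what an agent runs at every step to maintain its belief state.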

Maintaining belief state in an agent  Percept is p given by the environment  Repeat after me:  b ← Update(Predict(b,a),p)  This is the predict-update cycle  Also known as monitoring, filtering, state estimation  Localization and mapping are two special cases

Summary  Nondeterminism requires contingent plans  AND-OR search finds them  Sensorless problems require ordinary plans  Search in belief state space to find them  General partial observability induces nondeterminism for percepts  AND-OR search in belief state space  Predict-Update cycle for belief state transitions
