Announcements
- Project 1: Search. Due Wednesday 9/24. Solo or in a group of two; for a group of two, both of you need to submit your code to edX! Note: don't expect IDS with vanilla graph DFS to be optimal.
- Homework 2: Heuristics and Local Search. Part I and Part II due tomorrow. Part I through edX: online, instant grading, submit as often as you like. Part II through www.pandagrader.com: submit a PDF.
- Homework 3: Search and Games. Has been released! Due Monday 9/22.
- No office hours today, sorry.
CS 188: Artificial Intelligence
Local search and agents
Instructor: Stuart Russell, University of California, Berkeley
Local search algorithms
- In many optimization problems, the path is irrelevant; the goal state itself is the solution
- Then state space = set of "complete" configurations; find a configuration satisfying constraints (e.g., the n-queens problem), or find an optimal configuration (e.g., the travelling salesperson problem)
- In such cases, we can use iterative improvement algorithms: keep a single "current" state and try to improve it
- Constant space, suitable for online as well as offline search
Heuristic for the n-queens problem
- Goal: n queens on the board with no conflicts, i.e., no queen attacking another
- States: n queens on the board, one per column
- Heuristic value function: number of conflicts
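The conflict-counting heuristic above can be sketched in a few lines of Python. This is an illustrative sketch, not the course's code: `conflicts` and the column-indexed board representation are assumptions. With one queen per column, only same-row and same-diagonal attacks are possible.

```python
from itertools import combinations

def conflicts(queens):
    """Count attacking pairs, where queens[c] = row of the queen in column c.

    One queen per column, so a pair conflicts iff the queens share a row
    or a diagonal (row distance equals column distance).
    """
    count = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(queens), 2):
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2):
            count += 1
    return count
```

For example, `conflicts([1, 3, 0, 2])` is 0, since that placement is a solution to 4-queens.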
Hill-climbing algorithm

function HILL-CLIMBING(problem) returns a state
  current ← make-node(problem.initial-state)
  loop do
    neighbor ← a highest-valued successor of current
    if neighbor.value ≤ current.value then return current.state
    current ← neighbor

"Like climbing Everest in thick fog with amnesia"
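The pseudocode above translates directly to Python. This is a minimal sketch assuming the caller supplies `neighbors(state)` and `value(state)` functions; the names are illustrative, not from the course code.

```python
def hill_climbing(initial, neighbors, value):
    """Steepest-ascent hill climbing, following the slide's pseudocode.

    Repeatedly moves to the highest-valued successor; stops when no
    successor improves on the current state (a local maximum or plateau).
    """
    current = initial
    while True:
        succs = neighbors(current)
        if not succs:
            return current
        best = max(succs, key=value)
        if value(best) <= value(current):  # no uphill move left
            return current
        current = best
```

A toy usage: maximizing -(x-3)^2 over the integers with neighbors x±1 climbs to x = 3 from any start.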
Global and local maxima
- Random restarts: find the global optimum (duh)
- Random sideways moves:
  - Escape from shoulders
  - Loop forever on flat local maxima
Hill-climbing on the 8-queens problem
- No sideways moves:
  - Succeeds w/ prob. p ≈ 0.14
  - Average number of moves per trial: 4 when succeeding, 3 when getting stuck
  - Expected total number of moves needed: 3(1−p)/p + 4 ≈ 22 moves
- Allowing 100 sideways moves:
  - Succeeds w/ prob. p ≈ 0.94
  - Average number of moves per trial: 21 when succeeding, 65 when getting stuck
  - Expected total number of moves needed: 65(1−p)/p + 21 ≈ 25 moves
- Moral: algorithms with knobs to twiddle are irritating
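The expected totals come from restarting until success: with success probability p, the expected number of failed trials is (1−p)/p, each costing the "stuck" moves, plus one successful trial. A quick check of the slide's arithmetic (function name is illustrative):

```python
def expected_moves(p, stuck_cost, success_cost):
    """Expected total moves when restarting until success:
    (1-p)/p expected failures at stuck_cost each, plus one success."""
    return stuck_cost * (1 - p) / p + success_cost

# No sideways moves: p = 0.14, 3 moves when stuck, 4 when succeeding
print(round(expected_moves(0.14, 3, 4)))   # ≈ 22
# 100 sideways moves: p = 0.94, 65 when stuck, 21 when succeeding
print(round(expected_moves(0.94, 65, 21))) # ≈ 25
```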
Simulated annealing
- Resembles the annealing process used to cool metals slowly to reach an ordered (low-energy) state
- Basic idea:
  - Allow "bad" moves occasionally, depending on "temperature"
  - High temperature => more bad moves allowed; shakes the system out of its local minimum
  - Gradually reduce temperature according to some schedule
- Sounds pretty flaky, doesn't it?
- Theorem: simulated annealing finds the global optimum with probability 1 for a slow enough cooling schedule
Simulated annealing algorithm

function SIMULATED-ANNEALING(problem, schedule) returns a state
  current ← make-node(problem.initial-state)
  for t = 1 to ∞ do
    T ← schedule(t)
    if T = 0 then return current
    next ← a randomly selected successor of current
    ∆E ← next.value − current.value
    if ∆E > 0 then current ← next
    else current ← next only with probability e^(∆E/T)
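The pseudocode above in Python (maximization form). A sketch under the same assumptions as before: `successors`, `value`, and `schedule` are caller-supplied functions, and the names are illustrative, not from the course code.

```python
import math
import random

def simulated_annealing(initial, successors, value, schedule):
    """Simulated annealing per the slide's pseudocode.

    schedule(t) gives the temperature at step t; the search stops
    when the temperature reaches 0.
    """
    current = initial
    t = 1
    while True:
        T = schedule(t)
        if T == 0:
            return current
        nxt = random.choice(successors(current))
        delta_e = value(nxt) - value(current)
        # Always accept uphill moves; accept downhill moves with prob e^(dE/T)
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = nxt
        t += 1
```

The schedule must eventually return exactly 0, or the loop never terminates; a common pattern is a geometric decay truncated to 0 after a fixed number of steps.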
Local beam search
- Basic idea: K copies of a local search algorithm, initialized randomly
- For each iteration:
  - Generate ALL successors from the K current states
  - Choose the best K of these to be the new current states
  - Or, K chosen randomly with a bias towards good ones
- Why is this different from K local searches in parallel? The searches communicate! "Come over here, the grass is greener!"
- What other well-known algorithm does this remind you of? Evolution!
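The deterministic "best K" variant can be sketched as follows; the function signature and the best-so-far bookkeeping are assumptions added for illustration, not part of the slide's description.

```python
import heapq

def local_beam_search(starts, neighbors, value, k, steps):
    """Local beam search: keep the k best of ALL pooled successors.

    The k searches communicate by pooling their successors before
    selection, unlike k independent restarts.
    """
    current = list(starts)
    best = max(current, key=value)
    for _ in range(steps):
        # Pool successors of all current states, then keep the k best
        pool = {s for state in current for s in neighbors(state)}
        if not pool:
            break
        current = heapq.nlargest(k, pool, key=value)
        best = max(current + [best], key=value)
    return best
```

Starting two beams at 0 and 10 on the toy objective -(x-3)^2, the pooled selection quickly concentrates both beams near x = 3.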
Searching in the real world
- Nondeterminism: actions have unpredictable effects
  - Modified problem formulation to allow multiple outcomes
  - Solutions are now contingency plans
  - New algorithm to find them: AND-OR search
  - May need plans with loops!
- Partial observability: the percept is not the whole state
  - New concept: belief state = set of states the agent could be in
  - Modified formulation for search in belief state space; add an observation model
  - Simple and general agent design
The erratic vacuum world
- If the square is dirty, Suck sometimes cleans up dirt in the adjacent square as well
  - E.g., state 1 could go to 5 or 7
- If the square is clean, Suck may dump dirt on it by accident
  - E.g., state 4 could go to 4 or 2
Problem formulation
- Results(s,a) returns a set of states
  - Results(1,Suck) = {5,7}
  - Results(4,Suck) = {2,4}
  - Results(1,Right) = {2}
- Everything else is the same as before
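The nondeterministic Results function is just a map from (state, action) pairs to sets of states. A sketch encoding only the transitions the slide lists, using the slide's state numbering (the table and function names are assumptions):

```python
# Partial Results table for the erratic vacuum world, covering just the
# transitions listed on the slide.
RESULTS = {
    (1, "Suck"):  {5, 7},  # dirty square: may also clean the adjacent one
    (4, "Suck"):  {2, 4},  # clean square: may dump dirt by accident
    (1, "Right"): {2},     # movement itself is deterministic
}

def results(state, action):
    """Results(s, a): the set of possible successor states."""
    return RESULTS[(state, action)]
```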
Contingent solutions
- From state 1, does [Suck] solve the problem? Not necessarily!
- What about [Suck,Right,Suck]? Not necessarily!
- [Suck; if state=5 then [Right,Suck] else []]
  - This is a contingent solution (a.k.a. a branching or conditional plan)
- Great! So, how do we find such solutions?
AND-OR search trees
- OR-node: the agent chooses an action; at least one branch must be solved
- AND-node: nature chooses the outcome; all branches must be solved
AND-OR search made easy
- AND-OR search: call OR-SEARCH on the root node
- OR-SEARCH(node): succeeds if AND-SEARCH succeeds on the outcome set for any action
- AND-SEARCH(set of nodes): succeeds if OR-SEARCH succeeds on ALL nodes in the set
AND-OR search

function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan, or failure
  return OR-SEARCH(problem.initial-state, problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan, or failure
  if problem.goal-test(state) then return the empty plan
  if state is on path then return failure
  for each action in problem.actions(state) do
    plan ← AND-SEARCH(results(state, action), problem, [state | path])
    if plan ≠ failure then return [action | plan]
  return failure
AND-OR search contd.

function AND-SEARCH(states, problem, path) returns a conditional plan, or failure
  for each s_i in states do
    plan_i ← OR-SEARCH(s_i, problem, path)
    if plan_i = failure then return failure
  return [if s_1 then plan_1 else if s_2 then plan_2 else ... if s_{n−1} then plan_{n−1} else plan_n]
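The two mutually recursive functions carry over to Python directly. In this sketch a conditional plan is represented as `[action, {outcome_state: subplan, ...}]` with `[]` as the empty plan; the `problem` interface (`initial`, `goal_test`, `actions`, `results`) and the state/transition fragment used in the example are assumptions modeled on the erratic-vacuum slides, not the course's code.

```python
from types import SimpleNamespace

FAILURE = None

def and_or_search(problem):
    """AND-OR graph search returning a conditional plan, per the pseudocode."""
    return or_search(problem.initial, problem, [])

def or_search(state, problem, path):
    if problem.goal_test(state):
        return []                      # empty plan: already at a goal
    if state in path:
        return FAILURE                 # repeated state on this path
    for action in problem.actions(state):
        plan = and_search(problem.results(state, action), problem, [state] + path)
        if plan is not FAILURE:
            return [action, plan]
    return FAILURE

def and_search(states, problem, path):
    branches = {}
    for s in states:
        plan = or_search(s, problem, path)
        if plan is FAILURE:
            return FAILURE             # every outcome must be solvable
        branches[s] = plan             # "if s then branches[s]"
    return branches

# Erratic-vacuum fragment from the slides: from state 1, Suck -> {5, 7};
# goal states are 7 and 8. Only the transitions needed here are encoded.
ACTIONS = {1: ["Suck"], 5: ["Right"], 6: ["Suck"]}
TRANSITIONS = {(1, "Suck"): {5, 7}, (5, "Right"): {6}, (6, "Suck"): {8}}
vacuum = SimpleNamespace(
    initial=1,
    goal_test=lambda s: s in {7, 8},
    actions=lambda s: ACTIONS.get(s, []),
    results=lambda s, a: TRANSITIONS[(s, a)],
)
plan = and_or_search(vacuum)
```

The returned `plan` encodes exactly the slide's contingent solution: [Suck; if state=5 then [Right,Suck] else []].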
Slippery vacuum world
- Sometimes movement fails
- There is no guaranteed contingent solution!
- There is a cyclic solution: [Suck, L1: Right, if State = 5 then L1 else Suck]
  - Here L1 is a label
- Modify AND-OR-GRAPH-SEARCH: when it finds a repeated state, add a label and try adding a branch to the label instead of returning failure
- A cyclic plan is a cyclic solution if:
  - Every leaf is a goal state
  - From every point in the plan there is a path to a leaf
What does nondeterminism really mean?
- Example: your hotel key card doesn't open the door of your room
  - Explanation 1: you didn't put it in quite right. This is nondeterminism; keep trying
  - Explanation 2: something is wrong with the key. This is partial observability; get a new key
  - Explanation 3: it isn't your room. This is embarrassing; get a new brain
- A nondeterministic model is appropriate when outcomes don't depend deterministically on some hidden state
Partial observability
Extreme partial observability: sensorless worlds
- Vacuum world with known geometry and dirt, but no sensors at all!
- Belief state: set of all environment states the agent could be in
  - More generally, what the agent knows given all percepts to date
- Example plan: [Right, Suck, Left, Suck]
Sensorless problem formulation
- Underlying "physical" problem has Actions_P, Result_P, Goal-Test_P, and Step-Cost_P
- Initial state: a belief state b (a set of physical states s)
  - N physical states => 2^N belief states
- Goal test: every element s in b satisfies Goal-Test_P(s)
- Actions: union of Actions_P(s) for each s in b
  - This is OK if doing an "illegal" action has no effect
- Transition model:
  - Deterministic: Result(b,a) = union of Result_P(s,a) for each s in b
  - Nondeterministic: Result(b,a) = union of Results_P(s,a) for each s in b
- Step-Cost(b,a,b') = Step-Cost_P(s,a,s') for any s in b
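The belief-state formulation lifts each physical-problem component with a union or an "all" quantifier. A sketch, assuming the physical problem's components are passed in as functions; the helper names are illustrative, and the Right-action table follows the slides' vacuum-world state numbering (odd = agent in A, even = agent in B).

```python
def bstate_goal_test(b, goal_test_p):
    """A belief state is a goal iff every physical state in it is a goal."""
    return all(goal_test_p(s) for s in b)

def bstate_actions(b, actions_p):
    """Union of the physical actions (OK when illegal actions have no effect)."""
    return set().union(*(set(actions_p(s)) for s in b))

def bstate_result(b, a, results_p):
    """Result(b, a): union of the physical outcome sets over every s in b."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

# Sensorless vacuum world: Right maps each agent-in-A state to the matching
# agent-in-B state; agent-in-B states stay put.
RIGHT = {1: 2, 2: 2, 3: 4, 4: 4, 5: 6, 6: 6, 7: 8, 8: 8}
b0 = frozenset(range(1, 9))        # no sensors: could be any of the 8 states
b1 = bstate_result(b0, "Right", lambda s, a: {RIGHT[s]})
# b1 == frozenset({2, 4, 6, 8}): the agent is now known to be in square B
```

Note that the transition shrank the belief state from 8 states to 4 without any sensing, which is exactly why action sequences alone can still solve sensorless problems.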
Search in sensorless belief state space
- Everything works exactly as before! Solutions are still action sequences!
- Some opportunities for improvement:
  - If any s in b is unsolvable, b is unsolvable
  - If b' is a superset of b and b is in the tree, discard b'
  - If b' is a superset of b and b' has a solution, b has the same solution
What use are sensorless problems?
- They correspond to many real-world "robotic" manipulation problems
- A "part orientation conveyor" consists of a sequence of slanted guides that orient the part correctly no matter what its initial orientation
- It's a lot cheaper and more reliable than using a camera and robot arm!
Partial observability: formulation
- A partially observable problem formulation has to say what the agent can observe:
  - Deterministic: Percept(s) is the percept received in physical state s
  - Nondeterministic: Percepts(s) is the set of possible percepts received in s
  - Fully observable: Percept(s) = s
  - Sensorless: Percept(s) = null
- Local sensing vacuum world: Percept(s1) = [A,Dirty], Percept(s3) = [A,Dirty]
Partial observability: belief state transition model
- b' = Predict(b,a) updates the belief state just for the action
  - Identical to the transition model for sensorless problems
- Possible-Percepts(b') is the set of percepts that could come next
  - Union of Percept(s) for every s in b'
- Update(b',p) is the new belief state if percept p is received
  - Just the states s in b' for which p = Percept(s)
- Results(b,a) contains Update(Predict(b,a),p) for each p in Possible-Percepts(Predict(b,a))
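The four definitions above compose mechanically. A sketch with the physical model passed in as functions; the helper names are illustrative, and the worked example encodes only the local-sensing fragment relevant here (b0 = {1, 3}, the slides' state numbering).

```python
def predict(b, a, results_p):
    """b' = Predict(b, a): union of physical outcomes, as in sensorless search."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def possible_percepts(bp, percept):
    """The percepts that could arrive in some state of belief state b'."""
    return {percept(s) for s in bp}

def update(bp, p, percept):
    """Update(b', p): keep only the states of b' consistent with percept p."""
    return frozenset(s for s in bp if percept(s) == p)

def results(b, a, results_p, percept):
    """Results(b, a) = { Update(Predict(b,a), p) : p in Possible-Percepts }."""
    bp = predict(b, a, results_p)
    return {update(bp, p, percept) for p in possible_percepts(bp, percept)}

# Local-sensing fragment: from b0 = {1, 3} (agent in A, square A dirty,
# B unknown), Right leads to {2, 4}; the percept then splits the belief state.
TRANS = {1: {2}, 3: {4}}
PERCEPT = {2: ("B", "Dirty"), 4: ("B", "Clean")}
out = results(frozenset({1, 3}), "Right",
              lambda s, a: TRANS[s], lambda s: PERCEPT[s])
# out == {frozenset({2}), frozenset({4})}
```

The key point the code makes concrete: sensing never enlarges a belief state, because each `update` keeps a subset of `Predict`'s result, and the possible percepts partition it.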
Example: Results(b_0, Right)
(Figure: starting belief state b_0; b' = Predict(b_0, Right); Possible-Percepts(b'); and the two resulting belief states Update(b', [B,Dirty]) and Update(b', [B,Clean]).)
Maintaining belief state in an agent
- The environment supplies percept p
- Repeat after me: b ← Update(Predict(b,a), p)
- This is the predict-update cycle
  - Also known as monitoring, filtering, state estimation
  - Localization and mapping are two special cases
Summary
- Nondeterminism requires contingent plans
  - AND-OR search finds them
- Sensorless problems require ordinary plans
  - Search in belief state space to find them
- General partial observability induces nondeterminism for percepts
  - AND-OR search in belief state space
  - Predict-update cycle for belief state transitions