
1 Announcements
- Upcoming due dates
  - M 11:59 pm: HW 2
- Homework 2
  - Submit via gradescope.com -- Course code: 9NXNZ9
  - No slip days for homeworks!
- Project 1
  - Out at the end of this week
  - May work in pairs (warning!)
- Pro tips
  - Lots of learning resources (including the textbook!)
  - Getting by without really understanding
- Probability Sessions

2 Announcements
- Probability Sessions (example distributions below; a marginal can be obtained from a joint distribution by summing out)

  P(T): Temperature        P(W): Weather
  T      P                 W        P
  hot    0.5               sun      0.6
  cold   0.5               rain     0.1
                           fog      0.3
                           meteor   0.0

3 Announcements
- Probability Sessions
  - Additional discussion sections focused on CS 188 probability
  - Optional attendance
  - Worksheets and solutions will be published -- strongly suggested
  - Weekly, starting next week
    - W 9-10 am, 299 Cory
    - W 6-7 pm, 531 Cory

4 AI in the News
- Headlines this week:
  - http://www.ibtimes.com/dell-inc-announces-125b-investment-china-including-artificial-intelligence-lab-2090481
  - http://www.forbes.com/sites/dougnewcomb/2015/09/09/toyota-invests-50-million-in-artificial-intelligence-research-for-vehicle-robotics/
  - http://www.bizjournals.com/sanjose/news/2015/09/08/apple-on-hiring-spree-for-ai-experts.html

5 CS 188: Artificial Intelligence
Local and uncertain search
Instructors: Stuart Russell and Pat Virtue
University of California, Berkeley

6 Local search algorithms
- In many optimization problems the path is irrelevant; the goal state itself is the solution
- Then the state space is a set of "complete" configurations
  - Find a configuration satisfying constraints, e.g., the n-queens problem
  - Or find an optimal configuration, e.g., the travelling salesperson problem
- In such cases, we can use iterative improvement algorithms: keep a single "current" state and try to improve it
- Constant space; suitable for online as well as offline search

7 Heuristic for the n-queens problem
- Goal: n queens on the board with no conflicts, i.e., no queen attacking another
- States: n queens on the board, one per column
- Heuristic value function: number of conflicts
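The conflict-counting heuristic can be sketched in a few lines of Python (an illustrative sketch, not course code), representing a board as a sequence where board[c] is the row of the queen in column c:

```python
def conflicts(board):
    """Count attacking queen pairs; board[c] is the row of the queen in column c."""
    n = len(board)
    count = 0
    for c1 in range(n):
        for c2 in range(c1 + 1, n):
            same_row = board[c1] == board[c2]
            same_diagonal = abs(board[c1] - board[c2]) == c2 - c1
            if same_row or same_diagonal:
                count += 1
    return count

conflicts([1, 3, 0, 2])   # a 4-queens solution: 0 conflicts
conflicts([0, 1, 2, 3])   # all queens on one diagonal: 6 conflicting pairs
```

One queen per column means only rows need checking: column conflicts are impossible by construction.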

8 Demo n-queens [Demo: n-queens – iterative improvement (L5D1)]

9 Hill-climbing algorithm

function HILL-CLIMBING(problem) returns a state
  current ← make-node(problem.initial-state)
  loop do
    neighbor ← a highest-valued successor of current
    if neighbor.value ≤ current.value then return current.state
    current ← neighbor

"Like climbing Everest in thick fog with amnesia"
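The pseudocode above translates almost directly into Python; here is a minimal sketch for n-queens (the board encoding and helper names are my own, not from the slides). Since lower conflict counts are better, "highest-valued successor" becomes the neighbor with fewest conflicts:

```python
def conflicts(board):
    # number of attacking queen pairs; board[c] = row of the queen in column c
    n = len(board)
    return sum(1 for c1 in range(n) for c2 in range(c1 + 1, n)
               if board[c1] == board[c2] or abs(board[c1] - board[c2]) == c2 - c1)

def neighbors(board):
    # all boards reachable by moving one queen within its own column
    n = len(board)
    for col in range(n):
        for row in range(n):
            if row != board[col]:
                yield board[:col] + (row,) + board[col + 1:]

def hill_climb(board):
    # value = -conflicts, so "highest-valued" means fewest conflicts
    current = board
    while True:
        best = min(neighbors(current), key=conflicts)
        if conflicts(best) >= conflicts(current):
            return current          # local optimum (possibly not a solution!)
        current = best
```

Because it returns at the first local optimum, the result may still have conflicts; that failure mode is exactly what random restarts (next slide) address.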

10 Global and local maxima
- Random restarts
  - Find the global optimum
  - Duh
- Random sideways moves
  - Escape from shoulders
  - Loop forever on flat local maxima

11 Hill-climbing on the 8-queens problem
- No sideways moves:
  - Succeeds with probability p = 0.14
  - Average number of moves per trial: 4 when succeeding, 3 when getting stuck
  - Expected total number of moves needed: 3(1-p)/p + 4 ≈ 22 moves
- Allowing 100 sideways moves:
  - Succeeds with probability p = 0.94
  - Average number of moves per trial: 21 when succeeding, 65 when getting stuck
  - Expected total number of moves needed: 65(1-p)/p + 21 ≈ 25 moves
- Moral: algorithms with knobs to twiddle are irritating
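The expected-move figures follow from treating each restart as a geometric trial: with success probability p, the expected number of failed trials before the first success is (1-p)/p, each costing the average failure moves, plus one final successful trial. A quick check:

```python
def expected_moves(p, moves_per_failure, moves_per_success):
    """Expected total moves: (1-p)/p failed trials on average, then one success."""
    return moves_per_failure * (1 - p) / p + moves_per_success

no_sideways = expected_moves(0.14, 3, 4)      # ≈ 22.4 moves
with_sideways = expected_moves(0.94, 65, 21)  # ≈ 25.1 moves
```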

12 Simulated annealing
- Resembles the annealing process used to cool metals slowly to reach an ordered (low-energy) state
- Basic idea:
  - Allow "bad" moves occasionally, depending on "temperature"
  - High temperature => more bad moves allowed; shakes the system out of its local minimum
  - Gradually reduce the temperature according to some schedule
- Sounds pretty flaky, doesn't it?
- Theorem: simulated annealing finds the global optimum with probability 1 for a slow enough cooling schedule

13 Simulated annealing algorithm

function SIMULATED-ANNEALING(problem, schedule) returns a state
  current ← make-node(problem.initial-state)
  for t = 1 to ∞ do
    T ← schedule(t)
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← next.value − current.value
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
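A direct Python transcription of the pseudocode follows; the toy objective and cooling schedule at the bottom are illustrative choices of mine, not from the slides:

```python
import math
import random

def simulated_annealing(start, successors, value, schedule):
    current = start
    t = 1
    while True:
        T = schedule(t)
        if T == 0:
            return current
        nxt = random.choice(successors(current))
        dE = value(nxt) - value(current)
        # always take uphill moves; take downhill moves with probability e^(dE/T)
        if dE > 0 or random.random() < math.exp(dE / T):
            current = nxt
        t += 1

# toy example: maximize -(x-3)^2 over the integers, moving by ±1
schedule = lambda t: 0 if t > 2000 else 10 * 0.99 ** t
result = simulated_annealing(50, lambda x: [x - 1, x + 1],
                             lambda x: -(x - 3) ** 2, schedule)
```

Note that when dE ≤ 0 the exponent is non-positive, so e^(dE/T) is a valid acceptance probability in [0, 1].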

14 Local beam search
- Basic idea:
  - K copies of a local search algorithm, initialized randomly
  - For each iteration:
    - Generate ALL successors from the K current states
    - Choose the best K of these to be the new current states (or, K chosen randomly with a bias towards good ones)
- Why is this different from K local searches in parallel?
  - The searches communicate! "Come over here, the grass is greener!"
- What other well-known algorithm does this remind you of?
  - Evolution!
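The pooled-successor loop can be sketched as follows (the toy objective is an invented example); the key difference from K parallel searches is that the K survivors are chosen from one shared pool:

```python
from heapq import nlargest

def beam_search(k, starts, successors, value, steps):
    """Keep the best k of ALL successors pooled across the current beam."""
    beam = list(starts)
    for _ in range(steps):
        pool = [s2 for s in beam for s2 in successors(s)]
        beam = nlargest(k, pool, key=value)
    return beam

# toy example: three searchers climbing toward the peak of -(x-10)^2;
# searchers starting far away die out as the pool concentrates near the peak
beam = beam_search(3, [0, 20, 5], lambda x: [x - 1, x + 1],
                   lambda x: -(x - 10) ** 2, steps=30)
```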

15 Genetic algorithms
- Genetic algorithms use a natural selection metaphor
- Keep the best N hypotheses at each step (selection), based on a fitness function
- Also have pairwise crossover operators, with optional mutation to give variety

16 Genetic algorithms example: n-queens

17 Uncertain search

18 Searching in the real world
- Nondeterminism: actions have unpredictable effects
  - Modified problem formulation to allow multiple outcomes
  - Solutions are now contingency plans
  - New algorithm to find them: AND-OR search
  - May need plans with loops!
- Partial observability: the percept is not the whole state
  - New concept: belief state = set of states the agent could be in
  - Modified formulation for search in belief state space; add an observation model
- Simple and general agent design handling both nondeterminism and partial observability

19 The erratic vacuum world
- If the square is dirty, Suck sometimes cleans up dirt in the adjacent square as well
  - E.g., state 1 could go to 5 or 7
- If the square is clean, Suck may dump dirt on it by accident
  - E.g., state 4 could go to 4 or 2

20 Problem formulation
- Results(s,a) returns a set of states
  - Results(1,Suck) = {5,7}
  - Results(4,Suck) = {2,4}
  - Results(1,Right) = {2}
- Everything else is the same as before
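The only change from deterministic search is the return type: Results maps a state-action pair to a set of possible successor states. The transitions listed on the slide can be encoded directly as a lookup table:

```python
# set-valued transition model, with the transitions from the slide
RESULTS = {
    (1, 'Suck'):  {5, 7},
    (4, 'Suck'):  {2, 4},
    (1, 'Right'): {2},
}

def results(state, action):
    """Results(s, a): the SET of states the action might lead to."""
    return RESULTS[(state, action)]
```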

21 Contingent solutions
- From state 1, how do we solve the problem?
- How about [Suck]? Not necessarily!
- What about [Suck,Right,Suck]? Not necessarily!
- [Suck; if state=5 then [Right,Suck] else []]
  - This is a contingent solution (a.k.a. a branching or conditional plan)
- Great! So, how do we find such solutions?

22 AND-OR search trees
- OR-node:
  - A standard search tree is all OR-nodes
  - The agent chooses the action
  - At least one branch must be solved
- AND-node:
  - Nature chooses the outcome
  - All branches must be solved

23 AND-OR search trees
- An AND-OR search tree:
  - Choice of actions (OR)
  - Possible outcomes (AND)
- But what does the contingent solution look like?
  - Still a tree, but with actions selected (if this, do that, else if...)
  - May still have loops

24 AND-OR search made easy
- AND-OR search: call OR-Search on the root node
- OR-Search(node): succeeds if AND-Search succeeds on the outcome set for any action
- AND-Search(set of nodes): succeeds if OR-Search succeeds on ALL nodes in the set

25 AND-OR search

function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan, or failure
  return OR-SEARCH(problem.initial-state, problem, [])

function OR-SEARCH(state, problem, path) returns a conditional plan, or failure
  if problem.goal-test(state) then return the empty plan
  if state is on path then return failure
  for each action in problem.actions(state) do
    plan ← AND-SEARCH(results(state, action), problem, [state | path])
    if plan ≠ failure then return [action | plan]
  return failure

26 AND-OR search contd.

function AND-SEARCH(states, problem, path) returns a conditional plan, or failure
  for each s_i in states do
    plan_i ← OR-SEARCH(s_i, problem, path)
    if plan_i = failure then return failure
  return [if s_1 then plan_1 else if s_2 then plan_2 else ...
          if s_(n−1) then plan_(n−1) else plan_n]
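The two pseudocode functions translate almost line-for-line into Python. In this sketch a conditional step is represented as a dict mapping each possible outcome state to its sub-plan; the toy problem at the bottom (states S and M, goal G) is an invented example to exercise the code, not from the slides:

```python
FAILURE = None

def and_or_search(problem):
    return or_search(problem['initial'], problem, [])

def or_search(state, problem, path):
    if state in problem['goals']:
        return []                      # empty plan: already at a goal
    if state in path:
        return FAILURE                 # repeated state on this path: give up
    for action in problem['actions'](state):
        plan = and_search(problem['results'](state, action),
                          problem, [state] + path)
        if plan is not FAILURE:
            return [action] + plan     # found an action whose outcomes all work
    return FAILURE

def and_search(states, problem, path):
    branch = {}
    for s in states:                   # EVERY possible outcome must be solvable
        plan = or_search(s, problem, path)
        if plan is FAILURE:
            return FAILURE
        branch[s] = plan
    return [branch]                    # conditional step: "if in s, follow branch[s]"

# invented toy problem: action 'a' from S lands in G or M; 'b' repairs M
toy = {
    'initial': 'S',
    'goals':   {'G'},
    'actions': lambda s: {'S': ['a'], 'M': ['b']}.get(s, []),
    'results': lambda s, a: {'G', 'M'} if (s, a) == ('S', 'a') else {'G'},
}
plan = and_or_search(toy)   # ['a', {'G': [], 'M': ['b', {'G': []}]}]
```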

27 Slippery vacuum world
- Sometimes movement fails
- There is no guaranteed contingent solution!
- There is a cyclic solution:
  - [Suck, L1 : Right, if State = 5 then L1 else Suck]
  - Here L1 is a label marking a point in the plan
- Modify AND-OR-GRAPH-SEARCH: when it finds a repeated state, add a label and try adding a branch back to the label instead of returning failure
- A cyclic plan is a cyclic solution if:
  - Every leaf is a goal state
  - From every point in the plan there is a path to a leaf

28 What does nondeterminism really mean?
- Example: your hotel key card doesn't open the door of your room
  - Explanation 1: you didn't put it in quite right
    - This is nondeterminism; keep trying
  - Explanation 2: something is wrong with the key
    - This is partial observability; get a new key
  - Explanation 3: it isn't your room
    - This is embarrassing; be ashamed
- A nondeterministic model is appropriate when outcomes don't depend deterministically on some hidden state

29 Partial observability

30 Extreme partial observability: sensorless worlds
- Vacuum world with known geometry, but no sensors at all!
- Belief state: set of all environment states the agent could be in
  - More generally, what the agent knows given all percepts to date
[Figure: belief-state transitions under the action sequence Right, Suck, Left, Suck]

31 States, belief states, belief state space
- Example: 5 grid positions {A,B,C,D,E}, each of which may be occupied by a ghost (0-5 ghosts in total)
- State space representation: 5 Booleans (a,b,c,d,e)
- Size of the state space: 2^5 = 32
- Example states:
  - (1,1,1,1,1) ghosts everywhere
  - (0,0,0,0,0) no ghosts
  - (1,1,0,0,0) ghosts in just A and B
- A belief state is a set of states, so the belief state space is the power set of the state space
- Example belief states:
  - "I believe that there are ghosts everywhere or no ghosts at all": {(1,1,1,1,1), (0,0,0,0,0)}
  - "I believe that there is exactly one ghost": {(1,0,0,0,0), (0,1,0,0,0), (0,0,1,0,0), (0,0,0,1,0), (0,0,0,0,1)}
  - The empty belief state
  - "Every configuration is possible": all 32 states, (0,0,0,0,0), (1,0,0,0,0), (0,1,0,0,0), ...
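The counting on this slide is easy to check in Python (a quick illustrative sketch):

```python
from itertools import product

# the 2^5 = 32 physical states: one Boolean per grid position A..E
states = list(product((0, 1), repeat=5))

# a belief state is a set of physical states, so there are 2^32 belief states
exactly_one_ghost = {s for s in states if sum(s) == 1}
every_config_possible = set(states)      # the largest belief state
```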

32 Sensorless problem formulation
- The underlying "physical" problem has Actions_P, Result_P, Goal-Test_P, and Step-Cost_P
- Initial state: a belief state b (a set of physical states s)
  - N physical states => 2^N belief states
- Goal test: every element s in b satisfies Goal-Test_P(s)
- Actions: union of Actions_P(s) for each s in b
  - This is OK if doing an "illegal" action has no effect
- Transition model:
  - Deterministic: Result(b,a) = union of Result_P(s,a) for each s in b
  - Nondeterministic: Result(b,a) = union of Results_P(s,a) for each s in b
- Step-Cost(b,a,b') = Step-Cost_P(s,a,s') for any s in b
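The union rule is a one-liner, and it already yields interesting behavior: a sensorless agent can shrink its belief state purely by acting. The 4-cell corridor below is an invented example (not from the slides) in which moving right into a wall coerces the agent into a known position:

```python
def belief_result(b, a, results_p):
    """Sensorless transition: union of physical outcomes over every s in b."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

# invented example: a 4-cell corridor 0..3 with a wall at cell 3
move_right = lambda s, a: {min(s + 1, 3)}

b = frozenset({0, 1, 2, 3})            # no sensors: could be anywhere
for _ in range(3):
    b = belief_result(b, 'Right', move_right)
# b is now {3}: moving right three times localizes the agent with no sensing
```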

33 Search in sensorless belief state space
- Everything works exactly as before!
- Solutions are still action sequences!
- Some opportunities for improvement:
  - If any s in b is unsolvable, then b is unsolvable
  - If b' is a superset of b and b' has a solution, then b has the same solution
  - If b' is a superset of b and we have already found a path to b in our tree, discard b'

34 What use are sensorless problems?
- They correspond to many real-world "robotic" manipulation problems
- A "part orientation conveyor" consists of a sequence of slanted guides that orient the part correctly no matter what its initial orientation
- It's a lot cheaper and more reliable than using a camera and robot arm!
- https://www.youtube.com/watch?v=QsJzSFVAnhk

35 Partial observability: formulation
- A partially observable problem formulation has to say what the agent can observe:
  - Deterministic: Percept(s) is the percept received in physical state s
  - Nondeterministic: Percepts(s) is the set of possible percepts received in s
  - Fully observable: Percept(s) = s
  - Sensorless: Percept(s) = null
- Local sensing vacuum world:
  - Percept(s1) = [A,Dirty]
  - Percept(s3) = [A,Dirty]

36 Partial observability: belief state transition model
- b' = Predict(b,a) updates the belief state just for the action
  - Identical to the transition model for sensorless problems
- Possible-Percepts(b') is the set of percepts that could come next
  - Union of Percept(s) for every s in b'
- Update(b',p) is the new belief state if percept p is received
  - Just the states s in b' for which p = Percept(s)
- Results(b,a) contains Update(Predict(b,a),p) for each p in Possible-Percepts(Predict(b,a))
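These three operations compose directly into Results(b,a). A sketch, using an invented 4-cell corridor world (positions 0..3, wall at 3) where the only percept is "am I at the wall?":

```python
def predict(b, a, results_p):
    """Belief after acting, before sensing (same as the sensorless update)."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def update(b, p, percept):
    """Keep only the states consistent with receiving percept p."""
    return frozenset(s for s in b if percept(s) == p)

def belief_results(b, a, results_p, percept):
    """Results(b,a): one successor belief state per possible percept."""
    bp = predict(b, a, results_p)
    return {update(bp, p, percept) for p in {percept(s) for s in bp}}

# invented corridor world: 4 cells, wall at 3, percept = "am I at the wall?"
move_right = lambda s, a: {min(s + 1, 3)}
at_wall = lambda s: s == 3

successors = belief_results(frozenset({1, 2}), 'Right', move_right, at_wall)
# the percept splits the predicted belief {2, 3} into {2} and {3}
```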

37 Example: Results(b0, Right)
[Figure: belief state b0 and the effect of the Right action]

38 Example: Results(b0, Right)
[Figure: b' = Predict(b0, Right); Possible-Percepts(b'); Update(b',[B,Dirty]) and Update(b',[B,Clean])]

39 Maintaining belief state in an agent
- Percept p is given by the environment
- Repeat after me: b ← Update(Predict(b,a), p)
- This is the predict-update cycle
- Also known as monitoring, filtering, state estimation
- Localization and mapping are two special cases

40 Summary
- Nondeterminism requires contingent plans
  - AND-OR search finds them
- Sensorless problems require ordinary plans
  - Search in belief state space to find them
- General partial observability induces nondeterminism for percepts
  - AND-OR search in belief state space
  - Predict-Update cycle for belief state transitions

