
1 Agents & Search Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Dan Klein, Stuart Russell, Andrew Moore, Svetlana Lazebnik, Percy Liang, Luke Zettlemoyer

2 Course Information Instructor: Tamara Berg (tlberg@cs.unc.edu) Office Hours: FB 236, Mon/Wed 11:25-12:25pm Course website: http://tamaraberg.com/teaching/Fall_15/ Course mailing list: comp560@cs.unc.edu TA: Patrick (Ric) Poirson TA office hours: SN 109, Tues/Thurs 4-5pm Announcements, readings, schedule, etc., will all be posted to the course webpage. The schedule may be modified as needed over the semester. Check frequently!

3 Announcements for today If you missed the first class, please download and look over the slides from last week (course information, grading information, requirements, introduction to AI, etc.). Check to make sure you have a directory on classroom.unc.edu in /afs/cs.unc.edu/project/courses/comp560-f15/users/

4 Agents

5 An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators

6 Example: Vacuum-Agent Percepts: Location and status, e.g., [A,Dirty] Actions: Left, Right, Suck, Dump, NoOp function Vacuum-Agent([location,status]) returns an action if status = Dirty then return Suck else if location = A then return Right else if location = B then return Left
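A minimal Python sketch of the reflex policy in the pseudocode above; the location names "A"/"B" and the action strings are just illustrative labels, not a fixed API:

```python
# Sketch of the slide's Vacuum-Agent pseudocode as a Python function.
# Percept is (location, status); the return value is the chosen action.
def vacuum_agent(location, status):
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:  # location == "B"
        return "Left"

# Example percept -> action
print(vacuum_agent("A", "Dirty"))  # Suck
print(vacuum_agent("B", "Clean"))  # Left
```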

7 Rational agents For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and the agent’s built-in knowledge Performance measure (utility function): An objective criterion for success of an agent's behavior

8

9 Example: Vacuum-Agent Percepts: Location and status, e.g., [A,Dirty] Actions: Left, Right, Suck, Dump, NoOp Potential performance measures for our vacuum agent? – Amount of dirt the agent cleans in an 8-hour shift – Reward for having a clean floor, e.g., a reward given for each clean square at each time step

10 Example: Vacuum-Agent Percepts: Location and status, e.g., [A,Dirty] Actions: Left, Right, Suck, Dump, NoOp function Vacuum-Agent([location,status]) returns an action if status = Dirty then return Suck else if location = A then return Right else if location = B then return Left Is this agent rational?

11 Specifying the task environment Problem specification: Performance measure, Environment, Actuators, Sensors (PEAS) Example: autonomous taxi – Performance measure Safe, fast, legal, comfortable trip, maximize profits – Environment Roads, other traffic, pedestrians, customers – Actuators Steering wheel, accelerator, brake, signal, horn – Sensors Cameras, LIDAR, speedometer, GPS, odometer, engine sensors, keyboard

12 Another PEAS example: Spam filter Performance measure – Minimizing false positives, false negatives Environment – A user’s email account, email server Actuators – Mark as spam, delete, etc. Sensors – Incoming messages, other information about user’s account

13 Environment types Fully observable vs. partially observable Deterministic vs. stochastic Episodic vs. sequential Static vs. dynamic Discrete vs. continuous Single agent vs. multi-agent Known vs. unknown

14 Fully observable vs. partially observable Do the agent's sensors give it access to the complete state of the environment? vs.

15 Deterministic vs. stochastic Is the next state of the environment completely determined by the current state and the agent’s action? vs.

16 Episodic vs. sequential Is the agent’s experience divided into unconnected episodes, or is it a coherent sequence of observations and actions? vs.

17 Static vs. dynamic Is the world changing while the agent is thinking? Semi-dynamic: the environment does not change with the passage of time, but the agent's performance score does vs.

18 Discrete vs. continuous Does the environment provide a fixed number of distinct percepts, actions, and environment states? – Time can also evolve in a discrete or continuous fashion vs.

19 Single-agent vs. multiagent Is an agent operating by itself in the environment? vs.

20 Known vs. unknown Are the rules of the environment (transitions and rewards) known to the agent? – Strictly speaking, not a property of the environment, but of the agent’s state of knowledge vs.

21 Examples of different environments – Word jumble solver: fully observable, deterministic, episodic, static, discrete, single agent – Chess with a clock: fully observable, deterministic, sequential, semidynamic, discrete, multi-agent – Scrabble: partially observable, stochastic, sequential, static, discrete, multi-agent – Autonomous driving: partially observable, stochastic, sequential, dynamic, continuous, multi-agent

22 Preview Deterministic environments: search, constraint satisfaction, logic Multi-agent, strategic environments: minimax search, games – Can also be stochastic, partially observable Stochastic environments – Episodic: Bayesian networks, pattern classifiers – Sequential, known: Markov decision processes – Sequential, unknown: reinforcement learning

23 Solving problems by searching Image source: Wikipedia

24 Types of agents Reflex agents – Consider how the world IS – Choose actions based on the current percept (and maybe memory or a model of the world's current state) – Do not consider the future consequences of their actions Planning agents – Consider how the world WOULD BE – Base decisions on (hypothesized) consequences of actions – Must have a model of how the world evolves in response to actions – Must formulate a goal (test)

25 Search We will consider the problem of designing goal-based agents in fully observable, deterministic, discrete, known environments [Figure: example start state and goal state]

26 Search We will consider the problem of designing goal-based agents in fully observable, deterministic, discrete, known environments – The agent must find a sequence of actions that reaches a goal – The performance measure is defined by (a) reaching a goal and (b) how “expensive” the path to the goal is – We are focused on the process of finding the solution; while executing the solution, we assume that the agent can safely ignore its percepts.

27 Search problem components Initial state Actions Transition model – What state results from performing a given action in a given state? Goal state Path cost – Assume that it is a sum of nonnegative action costs The optimal solution is the sequence of actions that gives the lowest path cost for reaching the goal [Figure: initial state and goal state]
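One way to package these components in code is an abstract problem interface; the class and method names below are illustrative, not a standard API:

```python
# Hedged sketch: the five search problem components as an interface.
class SearchProblem:
    def initial_state(self):
        raise NotImplementedError

    def actions(self, state):
        """Actions applicable in the given state."""
        raise NotImplementedError

    def result(self, state, action):
        """Transition model: the state reached by performing action in state."""
        raise NotImplementedError

    def is_goal(self, state):
        """Goal test."""
        raise NotImplementedError

    def step_cost(self, state, action):
        """Nonnegative cost of one action; path cost is the sum of these."""
        raise NotImplementedError
```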

28

29 Example: Romania On vacation in Romania; currently in Arad Flight leaves tomorrow from Bucharest Initial state – Arad Actions – Go from one city to another Transition model – If you go from city A to city B, you end up in city B Goal state – Bucharest Path cost – Sum of city to city travel costs (total distance traveled)
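As a concrete sketch, a small excerpt of the Romania road map can be written as a weighted graph; the distances shown are the standard AIMA map values for a few cities only:

```python
# Partial Romania road map: city -> {neighbor: distance}.
romania = {
    "Arad":           {"Zerind": 75, "Sibiu": 140, "Timisoara": 118},
    "Sibiu":          {"Arad": 140, "Fagaras": 99, "Rimnicu Vilcea": 80},
    "Fagaras":        {"Sibiu": 99, "Bucharest": 211},
    "Rimnicu Vilcea": {"Sibiu": 80, "Pitesti": 97},
    "Pitesti":        {"Rimnicu Vilcea": 97, "Bucharest": 101},
    "Bucharest":      {"Fagaras": 211, "Pitesti": 101},
}
# Transition model: going from city A to city B lands you in B;
# path cost is the sum of edge distances along the route.
```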

30 State space The initial state, actions, and transition model define the state space of the problem – The set of all states reachable from initial state by any sequence of actions – Can be represented as a directed graph where the nodes are states and links between nodes are actions

31

32 State space The initial state, actions, and transition model define the state space of the problem – The set of all states reachable from initial state by any sequence of actions – Can be represented as a directed graph where the nodes are states and links between nodes are actions What is the state space for the Romania problem?

33

34 Example: Vacuum world States – Agent location and dirt location – How many possible states?

35 Vacuum world state space graph

36 Example: Vacuum world States – Agent location and dirt location – How many possible states? – What if there are n possible locations? The size of the state space grows exponentially with the “size” of the world!
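For the generalized vacuum world the count works out as follows: the agent can be in any of the n locations, and each location is independently clean or dirty, giving n * 2^n states.

```python
# State count for the generalized vacuum world:
# n agent positions x 2^n dirt configurations.
def vacuum_states(n):
    return n * 2 ** n

print(vacuum_states(2))   # 8 states: the two-square world from the slides
print(vacuum_states(10))  # 10240 states: exponential growth in n
```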

37 Simplified Pac-Man State Size?

38 Example: The 8-puzzle States – Locations of tiles 8-puzzle: 181,440 states 15-puzzle: ~10 trillion states 24-puzzle: ~10^25 states Actions – Move blank left, right, up, down Path cost – 1 per move Finding the optimal solution of the n-Puzzle is NP-hard
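A hedged sketch of the 8-puzzle transition model, representing the board as a tuple of nine entries in row-major order with 0 for the blank (the representation is one common choice, not the only one):

```python
# Successor function for the 8-puzzle: each action moves the blank.
def successors(state):
    """Yield (action, next_state) pairs; each move has path cost 1."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    moves = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}
    for action, (dr, dc) in moves.items():
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            board = list(state)
            board[blank], board[swap] = board[swap], board[blank]
            yield action, tuple(board)

# Example: blank in the top-left corner can move Down or Right.
start = (0, 1, 2, 3, 4, 5, 6, 7, 8)
print([a for a, _ in successors(start)])  # ['Down', 'Right']
```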

39 Example: Robot motion planning States – Real-valued joint parameters (angles, displacements) Actions – Continuous motions of robot joints Goal state – Configuration in which object is grasped Path cost – Time to execute, smoothness of path, etc.

40 Search Given: – Initial state – Actions – Transition model – Goal state – Path cost How do we find the optimal solution? – How about building the state space and then using Dijkstra’s shortest path algorithm? Complexity of Dijkstra’s is O(E + V log V), where V is the number of states and E is the number of transitions (edges) The state space may be huge!

41 Search: Basic idea Let’s begin at the start state and expand it by making a list of all possible successor states Maintain a frontier or a list of unexpanded states At each step, pick a state from the frontier to expand Keep going until you reach a goal state Try to expand as few states as possible

42 Search tree “What if” tree of sequences of actions and outcomes The root node corresponds to the starting state The children of a node correspond to the successor states of that node’s state A path through the tree corresponds to a sequence of actions – A solution is a path ending in a goal state Edges are labeled with actions and costs [Figure: search tree with the starting state at the root, successor states reached by actions, and a goal state]

43

44

45 Tree Search Algorithm Outline Initialize the frontier using the start state While the frontier is not empty – Choose a frontier node to expand according to search strategy and take it off the frontier – If the node contains the goal state, return solution – Else expand the node and add its children to the frontier
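A direct sketch of this outline in Python, assuming the `SearchProblem` interface sketched earlier; the frontier policy used here (FIFO, i.e., breadth-first) is just one possible search strategy:

```python
from collections import deque

# Tree search: no bookkeeping of repeated states.
def tree_search(problem):
    # Each frontier entry is (state, path of actions taken to reach it).
    frontier = deque([(problem.initial_state(), [])])
    while frontier:
        state, path = frontier.popleft()       # choose a node per the strategy
        if problem.is_goal(state):
            return path                         # solution: sequence of actions
        for action in problem.actions(state):   # expand the node
            child = problem.result(state, action)
            frontier.append((child, path + [action]))
    return None                                 # frontier empty: no solution
```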

46 Tree search example Start: Arad Goal: Bucharest

47 Tree search example Start: Arad Goal: Bucharest

48 Tree search example Start: Arad Goal: Bucharest

49 Tree search example Start: Arad Goal: Bucharest

50 Tree search example Start: Arad Goal: Bucharest

51 Tree search example Start: Arad Goal: Bucharest

52 Tree search example Start: Arad Goal: Bucharest

53 Handling repeated states Initialize the frontier using the starting state While the frontier is not empty – Choose a frontier node to expand according to search strategy and take it off the frontier – If the node contains the goal state, return solution – Else expand the node and add its children to the frontier To handle repeated states: – Keep an explored set, which remembers every expanded node – Every time you expand a node, add that state to the explored set; do not put explored states on the frontier again – Every time you add a node to the frontier, check whether it already exists in the frontier with a higher path cost, and if yes, replace that node with the new one
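The same sketch extended with an explored set, following the outline above (for simplicity this version skips the "replace a costlier frontier copy" refinement and simply never re-adds a state that has been expanded or queued):

```python
from collections import deque

# Graph search: remember which states have been expanded or queued.
def graph_search(problem):
    start = problem.initial_state()
    frontier = deque([(start, [])])
    seen = {start}                        # explored or already on the frontier
    while frontier:
        state, path = frontier.popleft()
        if problem.is_goal(state):
            return path
        for action in problem.actions(state):
            child = problem.result(state, action)
            if child not in seen:
                seen.add(child)
                frontier.append((child, path + [action]))
    return None
```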

54 Search without repeated states Start: Arad Goal: Bucharest

55 Search without repeated states Start: Arad Goal: Bucharest

56 Search without repeated states Start: Arad Goal: Bucharest

57 Search without repeated states Start: Arad Goal: Bucharest

58 Search without repeated states Start: Arad Goal: Bucharest

59 Search without repeated states Start: Arad Goal: Bucharest

60 Search without repeated states Start: Arad Goal: Bucharest

61 Tree Search Algorithm Outline Initialize the frontier using the starting state While the frontier is not empty – Choose a frontier node to expand according to search strategy and take it off the frontier – If the node contains the goal state, return solution – Else expand the node and add its children to the frontier Main question: Which frontier nodes should we explore? Idea: Try to expand as few nodes as possible while still finding the goal

