Slide 1: Intelligent Agents

Slide 2: Definition of "Agent"
Anything that:
- Perceives its environment
- Acts upon its environment
A.k.a. controller, robot.

Slide 3: Definition of "Environment"
- The real world, or a virtual world
- Rules of math/formal logic
- Rules of a game
- ...
Specific to the problem domain.
Slide 4:
[Diagram: an agent ("?") interacting with its environment; sensors deliver percepts to the agent, and actuators apply its actions to the environment]

Slide 5:
[Diagram: the same agent/environment loop, annotated "Sense - Plan - Act"]

Slide 6: "Good" Behavior
Performance measure (a.k.a. reward, merit, cost, loss, error): part of the problem domain.
Slide 7: Exercise
Formulate the problem domains for:
- Tic-tac-toe
- A web server
- An insect
- A student in B551
- A doctor diagnosing a patient
- IU's basketball team
- The U.S.A.
For each: what are the environment, percepts, actions, and performance measure? How might a well-behaved agent process information?
Slide 8: Types of Agents
- Simple reflex (a.k.a. reactive, rule-based)
- Model-based
- Goal-based
- Utility-based (a.k.a. decision-theoretic, game-theoretic)
- Learning (a.k.a. adaptive)
Slide 9: Simple Reflex
[Diagram: the percept passes through an interpreter to produce a state, which the rules map to an action]

Slide 10: Simple Reflex
[Diagram: the percept is fed directly to the rules, which produce the action]

Slide 11: Simple Reflex
[Diagram: as on the previous slide]
In an observable environment, percept = state.
Slide 12: Rule-Based Reflex Agent
[Diagram: two-square vacuum world with squares A and B]
Rules:
  if DIRTY = TRUE then SUCK
  else if LOCATION = A then RIGHT
  else if LOCATION = B then LEFT
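The rules above can be written out directly as code. This is a minimal sketch, assuming the percept is a (location, dirty) pair; the function name is illustrative, not from the original slides.

```python
# A minimal Python sketch of the slide's rule-based vacuum agent:
# the rules map the percept (location, dirty) straight to an action,
# with no internal state. Names mirror the slide's pseudocode.

def reflex_vacuum_agent(location, dirty):
    """Simple reflex rules for the two-square vacuum world."""
    if dirty:
        return "SUCK"
    elif location == "A":
        return "RIGHT"
    else:  # location == "B"
        return "LEFT"
```

Note that the agent consults only the current percept: it keeps no memory of past percepts or actions.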
Slide 13: Building a Simple Reflex Agent
Rules are a map from states to actions: a = π(s). They can be:
- Designed by hand
- Learned from a "teacher" (e.g., a human expert) using machine-learning techniques
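One way to realize the map a = π(s) is as an explicit lookup table, either written by hand or fitted from teacher demonstrations. This is a sketch under that assumption; all names are illustrative.

```python
# Sketch: a reflex policy as an explicit state-to-action table, fitted
# from (state, action) examples supplied by a teacher.

def learn_policy(examples):
    """Build a table-based policy from teacher-provided (state, action) pairs."""
    policy = {}
    for state, action in examples:
        policy[state] = action  # a later demonstration overrides an earlier one
    return policy

# Teacher demonstrations for the two-square vacuum world:
teacher_demos = [
    (("A", True), "SUCK"),
    (("B", True), "SUCK"),
    (("A", False), "RIGHT"),
    (("B", False), "LEFT"),
]
pi = learn_policy(teacher_demos)
```

A table works only when the state space is small; for larger spaces, a learned function approximator would replace the dictionary.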
Slide 14: Model-Based Reflex
[Diagram: the percept and the previous action feed an interpreter that maintains an internal state; the rules map that state to the next action]

Slide 15: Model-Based Reflex
[Diagram: as above, with the interpreter replaced by a model]

Slide 16: Model-Based Reflex
[Diagram: as above; the model's update of the internal state is labeled "state estimation"]
Slide 17: Model-Based Agent
[Diagram: two-square vacuum world with squares A and B]
Rules:
  if LOCATION = A then
    if HAS-SEEN(B) = FALSE then RIGHT
    else if HOW-DIRTY(A) > HOW-DIRTY(B) then SUCK
    else RIGHT
  ...
State: LOCATION, HOW-DIRTY(A), HOW-DIRTY(B), HAS-SEEN(A), HAS-SEEN(B)
Model: HOW-DIRTY(LOCATION) = X, HAS-SEEN(LOCATION) = TRUE
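The slide's state, model, and rules can be sketched in code. The internal state tracks the current location, how dirty each square was when last seen, and which squares have been seen; names follow the slide's pseudocode, but the details are illustrative.

```python
# A sketch of the slide's model-based vacuum agent in Python.

def update_state(state, location, dirt_level):
    """Model: fold the latest percept into the internal state."""
    return {
        "LOCATION": location,
        "HOW-DIRTY": {**state["HOW-DIRTY"], location: dirt_level},
        "HAS-SEEN": state["HAS-SEEN"] | {location},
    }

def choose_action(state):
    """Rules: explore the unseen square first, then suck the dirtier one."""
    loc = state["LOCATION"]
    other = "B" if loc == "A" else "A"
    move = "RIGHT" if loc == "A" else "LEFT"
    if other not in state["HAS-SEEN"]:
        return move
    if state["HOW-DIRTY"][loc] >= state["HOW-DIRTY"][other]:
        return "SUCK"
    return move

s0 = {"LOCATION": None, "HOW-DIRTY": {}, "HAS-SEEN": set()}
s1 = update_state(s0, "A", 3)  # see square A with dirt level 3
```

Unlike the simple reflex agent, the action depends on remembered history (which squares were seen and how dirty they were), not only on the current percept.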
Slide 18: Model-Based Reflex Agents
- Controllers in cars, airplanes, factories
- Robot obstacle avoidance, visual servoing
Slide 19: Building a Model-Based Reflex Agent
A model is a map from a prior state s and an action a to a new state s': s' = T(s, a). It can be:
- Constructed from domain knowledge (e.g., the rules of a game, the state machine of a computer program, a physics simulator for a robot)
- Learned from watching the system behave (system identification, calibration)
Rules can be designed or learned as before.
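For the vacuum world, the transition model s' = T(s, a) can be constructed entirely from domain knowledge. This is a sketch assuming the state is encoded as a (location, dirt_A, dirt_B) tuple; the encoding is illustrative.

```python
# A sketch of the transition model s' = T(s, a) for the two-square
# vacuum world, constructed from domain knowledge.

def T(state, action):
    """Deterministic transition model: map (state, action) to the next state."""
    location, dirt_a, dirt_b = state
    if action == "RIGHT":
        return ("B", dirt_a, dirt_b)
    if action == "LEFT":
        return ("A", dirt_a, dirt_b)
    if action == "SUCK":
        return ("A", 0, dirt_b) if location == "A" else ("B", dirt_a, 0)
    raise ValueError(f"unknown action: {action}")
```

The same interface works whether T is hand-written, generated from a game's rules, or fitted to observed behavior via system identification.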
Slide 20: Goal-Based, Utility-Based
[Diagram: the percept and the previous action feed a model that maintains an internal state; rules map that state to the next action]

Slide 21: Goal-Based, Utility-Based
[Diagram: as above, with the rules replaced by a decision mechanism]

Slide 22: Goal-Based, Utility-Based
[Diagram: inside the decision mechanism, an action generator proposes candidate actions, the model simulates the resulting states, and a performance tester selects the best action]
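The generate/simulate/test loop of the decision mechanism can be sketched as a one-step lookahead; the function and variable names below, and the vacuum-world instantiation, are illustrative, not from the original slides.

```python
# A sketch of the diagram's decision mechanism: generate candidate
# actions, simulate each with the model, score the simulated state
# with a performance tester, and return the best-scoring action.

def decide(state, actions, model, performance):
    """One-step lookahead: pick the action whose simulated outcome scores best."""
    return max(actions, key=lambda a: performance(model(state, a)))

def vacuum_model(state, action):
    """Simulated state update for the two-square vacuum world."""
    location, dirt_a, dirt_b = state
    if action == "SUCK":
        return (location, 0, dirt_b) if location == "A" else (location, dirt_a, 0)
    return ("B" if action == "RIGHT" else "A", dirt_a, dirt_b)

def clean_score(state):
    """Performance tester: less total dirt is better."""
    _, dirt_a, dirt_b = state
    return -(dirt_a + dirt_b)

best = decide(("A", 1, 0), ["SUCK", "RIGHT", "LEFT"], vacuum_model, clean_score)
```

A full planner would search several steps ahead rather than one, but the architecture (generator, simulated state, tester) is the same.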
Slide 23: Goal-Based, Utility-Based
[Diagram: as above, now with a sensor model mapping percepts to the internal state]
"Every good regulator of a system must be a model of that system."
Slide 24: Building a Goal- or Utility-Based Agent
Requires:
- A model of percepts (sensor model)
- An action generation algorithm (planner)
- A state update model embedded in the planner
- A performance metric
The model of percepts can be learned (sensor calibration) or approximated by hand (e.g., simulation).
Slide 25: Building a Goal-Based Agent
Requires everything on the previous slide, plus:
- Planning using search
- Performance metric: does it reach the goal?
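"Planning using search" can be sketched as breadth-first search over the transition model until a state satisfies the goal test. The vacuum-world model and the function names below are illustrative assumptions.

```python
from collections import deque

# Sketch of goal-based planning: BFS over the transition model until
# a state passes the goal test.

def plan(start, actions, model, goal_test):
    """BFS: return a list of actions reaching a goal state, or None."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal_test(state):
            return path
        for a in actions:
            nxt = model(state, a)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [a]))
    return None

def step(state, action):
    """Illustrative transition model for the two-square vacuum world."""
    location, dirt_a, dirt_b = state
    if action == "SUCK":
        return (location, 0, dirt_b) if location == "A" else (location, dirt_a, 0)
    return ("B" if action == "RIGHT" else "A", dirt_a, dirt_b)

route = plan(("A", 1, 1), ["SUCK", "RIGHT", "LEFT"], step,
             lambda s: s[1] == 0 and s[2] == 0)
```

Because BFS explores shortest paths first, the returned plan is among the shortest action sequences that clean both squares.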
Slide 26: Building a Utility-Based Agent
Requires everything on slide 24, plus:
- Planning using decision theory
- Performance metric: acquire maximum reward (or minimum cost)
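Decision-theoretic planning can be sketched as choosing the action with maximum expected reward under a stochastic model. The outcome table and probabilities below are made-up illustrations, not from the original slides.

```python
# A sketch of decision-theoretic action selection: given a stochastic
# model assigning probabilities to outcome states, choose the action
# with maximum expected reward.

def expected_utility(action, outcomes, reward):
    """Sum of P(s') * reward(s') over the action's possible outcomes."""
    return sum(p * reward(s) for s, p in outcomes[action].items())

def best_action(outcomes, reward):
    return max(outcomes, key=lambda a: expected_utility(a, outcomes, reward))

def reward(s):
    return 1.0 if s == "clean" else 0.0

# Toy model: sucking cleans the square 90% of the time.
outcomes = {
    "SUCK": {"clean": 0.9, "dirty": 0.1},
    "WAIT": {"dirty": 1.0},
}
```

A goal-based agent is the special case where the reward is 1 at goal states and 0 elsewhere.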
Slide 27: Big Open Questions: Goal-Based Agent = Reflex Agent?
[Diagram: the goal-based architecture redrawn as a reflex agent: the decision mechanism's rules take "mental actions" on a mental state and mental model, forming a "mental environment" alongside the physical environment]
Slide 29: With Learning
[Diagram: the percept feeds a combined model/learning component, which updates the state, the model, and the decision mechanism's specifications; the decision mechanism produces the action]
Slide 30: Building a Learning Agent
- Need a mechanism for updating models/rules/planners on-line as the agent interacts with the environment
- Need incremental techniques for machine learning
More next week...
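The bullets above can be sketched as an incremental model learner: after each observed (state, action, next_state) transition, the agent refreshes its counts so the learned model tracks the environment. The class and method names are illustrative, not from the original slides.

```python
from collections import defaultdict

# Sketch of on-line model updating via transition counts.

class OnlineModel:
    def __init__(self):
        # counts[(state, action)][next_state] = number of observations
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, action, next_state):
        """Incremental update from a single interaction."""
        self.counts[(state, action)][next_state] += 1

    def predict(self, state, action):
        """Most frequently observed successor, or None if never seen."""
        seen = self.counts[(state, action)]
        return max(seen, key=seen.get) if seen else None

m = OnlineModel()
for s, a, t in [("s0", "go", "s1"), ("s0", "go", "s1"), ("s0", "go", "s2")]:
    m.observe(s, a, t)
```

Each update touches only one counter, so learning stays cheap per interaction, which is the point of an incremental technique.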
Slide 31: Big Open Questions: Learning Agents
The modeling, learning, and decision mechanisms of artificial agents are tailored to specific tasks. Are there general mechanisms for learning? If not, what are the limitations of the human brain?
Slide 32: Types of Environments
- Observable / non-observable
- Deterministic / nondeterministic
- Episodic / non-episodic
- Single-agent / multi-agent
Slide 33: Observable Environments
[Diagram: the percept feeds a model that maintains a state; the decision mechanism maps the state to an action]

Slide 34: Observable Environments
[Diagram: the percept is replaced by the state itself, still passing through the model]

Slide 35: Observable Environments
[Diagram: the state feeds the decision mechanism directly; no model is needed]

Slide 36: Nondeterministic Environments
[Diagram: the percept feeds a model that maintains a state; the decision mechanism maps the state to an action]
Slide 37: Nondeterministic Environments
[Diagram: as above, but the model now tracks a set of possible states rather than a single state]
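Tracking a set of possible states can be sketched as a belief-set update: each action maps the current set through every outcome the environment might produce. The "slippery move" world below is an illustrative assumption.

```python
# A sketch of state tracking under nondeterminism: the agent maintains
# the set of states it could be in, and each action maps that set
# through every possible outcome.

def update_belief(belief, action, outcomes):
    """outcomes(s, a) returns the set of states the environment may reach."""
    result = set()
    for s in belief:
        result |= outcomes(s, action)
    return result

def slippery_move(state, action):
    # Moving may slip and leave the agent where it was.
    return {state, state + 1} if action == "MOVE" else {state}

belief = update_belief({0}, "MOVE", slippery_move)  # {0, 1}
```

The belief set can only shrink again when percepts rule states out, which is why nondeterminism and partial observability are usually treated together.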
Slide 38: Agents in the Bigger Picture
- Binds disparate fields (economics, cognitive science, operations research, control theory)
- Provides a framework for the technical components of AI: decision making with search, machine learning
- Casting problems in the framework sometimes brings insights
[Diagram: the agent at the center, connected to search, knowledge representation, planning, reasoning, learning, robotics, perception, natural language, expert systems, and constraint satisfaction]
Slide 39: Upcoming Topics
- Utility and decision theory (R&N 17.1-4)
- Reinforcement learning
- Project midterm report due next week (11/10)