Intelligent Environments, Computer Science and Engineering, University of Texas at Arlington
Decision-Making for Intelligent Environments: Motivation, Techniques, Issues
Motivation: An intelligent environment acquires and applies knowledge about you and your surroundings in order to improve your experience. “Acquires” → prediction; “applies” → decision making.
Motivation: Why do we need decision-making? To “improve our experience” there are usually alternative actions; which one should be taken? Example (Bob scenario: bedroom → ?): Turn on the bathroom light? Turn on the kitchen light? Turn off the bedroom light?
Example: Should I turn on the bathroom light? Issues: the inhabitant’s location (current and future), the inhabitant’s task, the inhabitant’s preferences, energy efficiency, security, and other inhabitants.
Qualities of a Decision Maker. Ideal: complete (always makes a decision), correct (the decision is always right), natural (knowledge is easily expressed), efficient, and rational (decisions are made to maximize performance).
Agent-based Decision Maker (Russell & Norvig, “AI: A Modern Approach”). Rational agent: the agent chooses an action to maximize its performance, based on its percept sequence.
Agent Types: reflex agent, reflex agent with state, goal-based agent, utility-based agent.
Reflex Agent
Reflex Agent with State
Goal-based Agent
Utility-based Agent
Intelligent Environments: Decision-Making Techniques
Decision-Making Techniques: logic, planning, decision theory, Markov decision processes, reinforcement learning.
Logical Decision Making If Equal(?Day,Monday) & GreaterThan(?CurrentTime,0600) & LessThan(?CurrentTime,0700) & Location(Bob,bedroom,?CurrentTime) & Increment(?CurrentTime,?NextTime) Then Location(Bob,bathroom,?NextTime) Query: Location(Bob,?Room,0800)
Logical Decision Making: rules and facts in first-order predicate logic; inference mechanism: deduction: {A, A → B} ⊢ B. Systems: Prolog (PROgramming in LOGic), the OTTER theorem prover.
Prolog: location(bob,bathroom,NextTime) :- dayofweek(Day), Day = monday, currenttime(CurrentTime), CurrentTime > 0600, CurrentTime < 0700, location(bob,bedroom,CurrentTime), increment(CurrentTime,NextTime). Facts: dayofweek(monday),... Query: location(bob,Room,0800).
OTTER: (all d all t1 all t2 ((DayofWeek(d) & Equal(d,Monday) & CurrentTime(t1) & GreaterThan(t1,0600) & LessThan(t1,0700) & NextTime(t1,t2) & Location(Bob,Bedroom,t1)) -> Location(Bob,Bathroom,t2))). Facts: DayofWeek(Monday),... Query: (exists r (Location(Bob,r,0800)))
Actions: If Location(Bob,Bathroom,t1) Then Action(TurnOnBathRoomLight,t1). Preferences among actions: If RecommendedAction(a1,t1) & RecommendedAction(a2,t1) & ActionPriority(a1) > ActionPriority(a2) Then Action(a1,t1).
Persistence Over Time: If Location(Bob,room1,t1) & not Move(Bob,t1) & NextTime(t1,t2) Then Location(Bob,room1,t2). One such rule is needed for each attribute of Bob!
Logical Decision Making Assessment: Complete? Yes. Correct? Yes. Efficient? No. Natural? No. Rational?
Decision Making as Planning: search for a sequence of actions to achieve some goal. Requires: the initial state of the environment, a goal state, and actions (operators) with conditions and effects (implying a connection to effectors).
Example. Initial: location(Bob,Bathroom) & light(Bathroom,off). Goal: happy(Bob). Action 1: condition location(Bob,?r) & light(?r,on); effect: add happy(Bob). Action 2: condition light(?r,off); effect: delete light(?r,off), add light(?r,on). Plan: Action 2, then Action 1.
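The plan above can be found mechanically. Below is a minimal sketch, not from the slides, of a breadth-first forward search over STRIPS-style add/delete actions; the predicate encoding is my own, and the variable ?r is grounded to Bathroom for simplicity.

```python
# Minimal forward-search planner sketch for the Bob example (illustrative only).
# States are frozensets of ground facts; actions use STRIPS-style add/delete lists.
from collections import deque

actions = {
    "Action2_turn_on_light": {
        "pre": {("light", "Bathroom", "off")},
        "add": {("light", "Bathroom", "on")},
        "del": {("light", "Bathroom", "off")},
    },
    "Action1_make_happy": {
        "pre": {("location", "Bob", "Bathroom"), ("light", "Bathroom", "on")},
        "add": {("happy", "Bob")},
        "del": set(),
    },
}

def plan(initial, goal):
    """Breadth-first search over states; returns a list of action names."""
    frontier = deque([(frozenset(initial), [])])
    visited = {frozenset(initial)}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for name, a in actions.items():
            if a["pre"] <= state:
                nxt = frozenset((state - a["del"]) | a["add"])
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [name]))
    return None

initial = {("location", "Bob", "Bathroom"), ("light", "Bathroom", "off")}
goal = {("happy", "Bob")}
print(plan(initial, goal))  # ['Action2_turn_on_light', 'Action1_make_happy']
```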
Requirements: Where do goals come from? System design and users. Where do actions come from? Device “drivers” and learned macros (e.g., a SecureHome action).
Planning Systems: UCPOP (Univ. of Washington), a Partial Order Planner with Universal quantification and Conditional effects; GraphPlan (CMU), which builds and prunes a graph of possible plans.
GraphPlan Example: (:action lighton :parameters (?r) :precondition (light ?r off) :effects (and (light ?r on) (not (light ?r off))))
Planning Assessment: Complete? Yes. Correct? Yes. Efficient? No. Natural? Better. Rational?
Decision Theory: logical and planning approaches typically assume no uncertainty. Decision theory = probability theory + utility theory. Maximum Expected Utility principle: a rational agent chooses actions yielding the highest expected utility, averaged over all possible action outcomes, weighting the utility of each outcome by its probability of occurring.
Probability Theory: random variables X, Y, …; prior probability P(X); conditional probability P(X|Y). The joint probability distribution P(X_1,…,X_n) is an n-dimensional table of probabilities; a complete table allows computation of any probability, but a complete table is typically infeasible.
Probability Theory: Bayes rule: P(X|Y) = P(Y|X) * P(X) / P(Y). Example: we are more likely to know P(wet|rain), and Bayes rule gives us P(rain|wet). In general, P(X|Y) = α * P(Y|X) * P(X), where α is chosen so that Σ_X P(X|Y) = 1.
Bayes Rule (cont.): How do we compute P(rain | wet & thunder)? P(r | w & t) = P(w & t | r) * P(r) / P(w & t). We may know P(w & t | r), but this gets tedious as the evidence increases. Assuming conditional independence of the evidence (thunder does not cause wet, and vice versa): P(r | w & t) = α * P(w|r) * P(t|r) * P(r).
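A small worked sketch of this conditionally independent update; all of the probabilities below are made-up numbers for illustration, not values from the slides.

```python
# Illustrative sketch of the conditionally independent Bayes-rule update.
# All probabilities below are made-up numbers, not values from the slides.
p_rain = 0.3
p_wet_given = {True: 0.9, False: 0.2}       # P(wet | rain), P(wet | not rain)
p_thunder_given = {True: 0.7, False: 0.05}  # P(thunder | rain), P(thunder | not rain)

# Unnormalized posteriors: P(w|r) * P(t|r) * P(r) for r in {rain, not rain}
unnorm = {
    r: p_wet_given[r] * p_thunder_given[r] * (p_rain if r else 1 - p_rain)
    for r in (True, False)
}
alpha = 1.0 / sum(unnorm.values())          # alpha chosen so the posteriors sum to 1
posterior = {r: alpha * v for r, v in unnorm.items()}
print(posterior[True])                      # P(rain | wet & thunder) ≈ 0.96
```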
Where Do Probabilities Come From? Statistical sampling, universal principles, and individual beliefs.
Representation of Uncertain Knowledge: complete joint probability distribution; conditional probabilities and Bayes rule (assuming conditional independence); belief networks.
Belief Networks: nodes represent random variables; a directed link from X to Y implies that X “directly influences” Y; each node has a conditional probability table (CPT) quantifying the effects that its parents (incoming links) have on the node; the network is a DAG (no directed cycles).
Belief Networks: Example
Belief Networks: Semantics. The network represents the joint probability distribution and encodes conditional independence knowledge: a node is conditionally independent of its non-descendants given its parents. E.g., MaryCalls and Earthquake are conditionally independent given Alarm.
Belief Networks: Inference. Given the network, compute P(Query | Evidence), where evidence is obtained from sensory percepts. Possible inferences: diagnostic, e.g. P(Burglary | JohnCalls); causal, e.g. P(JohnCalls | Burglary); intercausal, e.g. P(Burglary | Alarm & Earthquake).
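A minimal sketch of inference by enumeration over the classic burglary/alarm network that these queries refer to. The CPT values are the textbook ones from Russell & Norvig; the enumeration code itself is my own illustration, not from the slides, and real systems use faster algorithms.

```python
from itertools import product

# Classic burglary/alarm belief network (CPT values as in Russell & Norvig).
def joint(b, e, a, j, m):
    """Joint probability P(b, e, a, j, m) as the product of node CPT entries."""
    p_b = 0.001 if b else 0.999
    p_e = 0.002 if e else 0.998
    p_a_true = {(True, True): 0.95, (True, False): 0.94,
                (False, True): 0.29, (False, False): 0.001}[(b, e)]
    p_a = p_a_true if a else 1 - p_a_true
    p_j = (0.90 if a else 0.05) if j else (0.10 if a else 0.95)
    p_m = (0.70 if a else 0.01) if m else (0.30 if a else 0.99)
    return p_b * p_e * p_a * p_j * p_m

def query(var, evidence):
    """P(var=True | evidence) by summing the joint over all hidden variables."""
    names = ["b", "e", "a", "j", "m"]
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=5):
        world = dict(zip(names, values))
        if all(world[k] == v for k, v in evidence.items()):
            totals[world[var]] += joint(**world)
    return totals[True] / (totals[True] + totals[False])

print(query("b", {"j": True}))             # diagnostic: P(Burglary | JohnCalls)
print(query("j", {"b": True}))             # causal: P(JohnCalls | Burglary)
print(query("b", {"a": True, "e": True}))  # intercausal: P(Burglary | Alarm & Earthquake)
```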
Belief Network Construction: choose variables (discretize continuous variables); order variables from causes to effects; CPTs: specify each table entry or define it as a function (e.g., sum, Gaussian); learning can target the variables (evidential and hidden), the links (causation), and the CPTs.
Combining Beliefs with Desires: maximum expected utility: a rational agent chooses the action that maximizes expected utility. The expected utility EU(A|E) of action A given evidence E is EU(A|E) = Σ_i P(Result_i(A) | E, Do(A)) * U(Result_i(A)), where the Result_i(A) are the possible outcome states after executing action A, U(S) is the agent’s utility for state S, and Do(A) is the proposition that action A is executed in the current state.
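A tiny numeric sketch of this expected-utility sum; the outcome probabilities and utilities are made up for illustration and are not from the slides.

```python
# EU(A|E) = sum_i P(Result_i(A) | E, Do(A)) * U(Result_i(A)), with made-up numbers.
outcomes = {
    "bob_in_bathroom_light_on": {"prob": 0.8, "utility": 10.0},
    "bob_still_in_bedroom":     {"prob": 0.2, "utility": -2.0},
}
eu = sum(o["prob"] * o["utility"] for o in outcomes.values())
print(eu)  # 0.8*10 + 0.2*(-2) = 7.6
```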
Maximum Expected Utility: assumptions. Knowing the evidence E completely requires significant sensory information; P(Result | E, Do(A)) requires a complete causal model of the environment; U(Result) requires a complete specification of state utilities. Also: one-shot vs. sequential decisions.
Utility Theory: any set of preferences over possible outcomes can be expressed by a utility function. A lottery L = [p_1,S_1; p_2,S_2; ...; p_n,S_n], where p_i is the probability of possible outcome S_i, and S_i can itself be another lottery. Utility principle: U(A) > U(B) iff A is preferred to B; U(A) = U(B) iff the agent is indifferent between A and B. Maximum expected utility principle: U([p_1,S_1; p_2,S_2; ...; p_n,S_n]) = Σ_i p_i * U(S_i).
Utility Functions. Possible outcomes: [1.0, $1000; 0.0, $0] vs. [0.5, $3000; 0.5, $0]. Expected monetary value: $1000 vs. $1500. But the choice depends on the value of $k, where S_k is the state of possessing wealth $k: EU(accept) = 0.5 * U(S_{k+3000}) + 0.5 * U(S_k), EU(decline) = U(S_{k+1000}). The agent will decline for some utility functions U and accept for others.
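A small sketch of the accept/decline comparison with a concave (risk-averse) utility function; the logarithmic utility and the current wealth of $500 are arbitrary illustrative choices, not values from the slides.

```python
import math

# Concave utility of total wealth models a risk-averse agent; log is an
# arbitrary illustrative choice, as is the current wealth k = $500.
def U(wealth):
    return math.log(wealth)

k = 500.0
eu_accept = 0.5 * U(k + 3000) + 0.5 * U(k)   # lottery [0.5, $3000; 0.5, $0]
eu_decline = U(k + 1000)                     # the sure $1000
print(eu_accept, eu_decline)                 # ≈ 7.19 vs ≈ 7.31: this agent declines
```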
Utility Functions (cont.): studies show U(S_{k+n}) = log_2 n. Risk-averse agents are in the positive part of the curve; risk-seeking agents in the negative part.
Decision Networks (also called influence diagrams): decision networks = belief networks + actions and utilities. They describe the agent’s current state, possible actions, the state resulting from the agent’s action, and the utility of the resulting state.
Example Decision Network
Decision Network node types. Chance node (oval): a random variable and its CPT, the same as a belief network node. Decision node (rectangle): can take on a value for each possible action. Utility node (diamond): its parents are the chance nodes affecting utility; it contains a utility function mapping parent values to a utility value or lottery.
Evaluating Decision Networks: set the evidence variables according to the current state; for each action value of the decision node, set the decision node to that action, use belief-net inference to calculate the posterior probabilities for the parents of the utility node, and calculate the utility for the action; return the action with the highest utility.
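A minimal concrete sketch of that evaluation loop for the Bob scenario. The single chance node, its probability, and the utility table are made-up placeholders, not the slides’ network or Netica’s API; the “belief-net inference” step degenerates to the prior here because there is no extra evidence.

```python
# Minimal concrete sketch of decision-network evaluation (all numbers made up).
# The single chance node is "Bob enters the bathroom soon" with P = 0.7; the
# decision node chooses a lighting action; the utility node depends on both.
p_enters = {"enters": 0.7, "stays_away": 0.3}

def utility(action, outcome):
    table = {
        ("turn_on_bathroom_light", "enters"):     10.0,
        ("turn_on_bathroom_light", "stays_away"): -1.0,  # wasted energy
        ("leave_light_off",        "enters"):     -5.0,  # Bob in the dark
        ("leave_light_off",        "stays_away"):  0.0,
    }
    return table[(action, outcome)]

def best_action(actions):
    # For each action: posterior over the utility node's chance parent
    # (trivially the prior here), then the expected utility.
    scores = {a: sum(p * utility(a, o) for o, p in p_enters.items()) for a in actions}
    return max(scores, key=scores.get), scores

print(best_action(["turn_on_bathroom_light", "leave_light_off"]))
# best action is 'turn_on_bathroom_light' with EU ≈ 6.7 (vs. ≈ -3.5)
```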
Sequential Decision Problems: no intermediate utility on the way to the goal. Transition model M^a_ij: the probability of reaching state j after taking action a in state i. A policy is a complete mapping from states to actions; we want the policy that maximizes expected utility, computed from the transition model and the state utilities.
Example (4x3 grid world): P(move in intended direction) = 0.8; P(move at right angle to intended) = 0.1 each; U(sequence) = terminal state’s value - (1/25)*length(sequence).
Example (cont.): optimal policy and utilities (figures).
Markov Decision Process (MDP): calculating the optimal policy in a fully-observable, stochastic environment with a known transition model. The Markov property is satisfied: the transition probability M^a_ij depends only on the current state i, not on previous states. Partially-observable environments are addressed by POMDPs.
Value Iteration for MDPs: iterate the following update for each state i until there is little change: U(i) ← R(i) + max_a Σ_j M^a_ij U(j), where R(i) is the reward for entering state i: -1/25 for all states except (4,3) and (4,2), +1 for (4,3), and -1 for (4,2). The best policy is policy*(i) = argmax_a Σ_j M^a_ij U(j).
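A runnable sketch of this value-iteration loop on the 4x3 grid world from the example (0.8/0.1/0.1 motion noise, step reward -1/25, terminals at (4,3) and (4,2)). The obstacle at (2,2), the in-place update order, and the 1e-6 stopping threshold are my assumptions based on the textbook version of this example.

```python
# Value iteration on the 4x3 grid world sketched above. The obstacle at (2,2)
# and the stopping threshold are assumptions from the textbook version.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
PERP = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}
STATES = [(x, y) for x in range(1, 5) for y in range(1, 4) if (x, y) != (2, 2)]
TERMINAL = {(4, 3): +1.0, (4, 2): -1.0}

def reward(s):
    return TERMINAL.get(s, -1.0 / 25)

def move(s, a):
    """Deterministic move; bumping into a wall or the obstacle stays put."""
    nxt = (s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1])
    return nxt if nxt in STATES else s

def transitions(s, a):
    """List of (probability, next_state): 0.8 intended, 0.1 each right angle."""
    left, right = PERP[a]
    return [(0.8, move(s, a)), (0.1, move(s, left)), (0.1, move(s, right))]

def value_iteration(eps=1e-6):
    U = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINAL:
                new = reward(s)
            else:
                new = reward(s) + max(
                    sum(p * U[t] for p, t in transitions(s, a)) for a in ACTIONS)
            delta = max(delta, abs(new - U[s]))
            U[s] = new
        if delta < eps:
            return U

def best_policy(U):
    pi = {}
    for s in STATES:
        if s in TERMINAL:
            continue
        pi[s] = max(ACTIONS, key=lambda a: sum(p * U[t] for p, t in transitions(s, a)))
    return pi

U = value_iteration()
print(best_policy(U))
```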
Reinforcement Learning: basically an MDP, but the policy is learned without the need for a transition model. Q-learning with temporal differences assigns values Q(a,i) to action-state pairs, with utility U(i) = max_a Q(a,i). Update Q(a,i) after each observed transition from state i to state j: Q(a,i) ← Q(a,i) + α * (R(i) + max_a' Q(a',j) - Q(a,i)), where α is the learning rate. The action taken in state i is argmax_a Q(a,i).
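A minimal sketch of tabular Q-learning with this temporal-difference update. The learning rate ALPHA, the epsilon-greedy exploration, and the grid-style states are illustrative assumptions; no discount factor is used, matching the undiscounted update shown above.

```python
import random
from collections import defaultdict

# Tabular Q-learning with the temporal-difference update shown above.
# ALPHA (learning rate) and EPSILON (exploration rate) are made-up choices.
ALPHA, EPSILON = 0.1, 0.1
ACTIONS = ["up", "down", "left", "right"]
Q = defaultdict(float)                  # Q[(action, state)] defaults to 0

def choose_action(state):
    """Epsilon-greedy: mostly argmax_a Q(a, state), sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(a, state)])

def q_update(state, action, reward, next_state):
    """Q(a,i) <- Q(a,i) + alpha * (R(i) + max_a' Q(a',j) - Q(a,i))."""
    best_next = max(Q[(a, next_state)] for a in ACTIONS)
    Q[(action, state)] += ALPHA * (reward + best_next - Q[(action, state)])

# One observed transition in a grid-world style state space:
q_update(state=(1, 1), action="right", reward=-1 / 25, next_state=(2, 1))
print(Q[("right", (1, 1))])             # -0.004
```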
Decision-Theoretic Agent. Given: percept (sensor) information. Maintain: a decision network with beliefs, actions, and utilities. Do: update the probabilities for the current state, compute the outcome probabilities for each action, select the action with the highest expected utility, and return that action.
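A toy end-to-end sketch of that loop: a Bayes-style belief update from one percept, then expected-utility action selection. The sensor likelihoods are made up, and the utility values reuse the made-up table from the decision-network sketch above; none of this is the slides’ agent or Netica’s API.

```python
# Toy decision-theoretic agent step (all numbers made up).
# Belief: probability that Bob is headed to the bathroom.
def update_beliefs(p_bathroom, percept):
    # P(percept | headed to bathroom) and P(percept | not headed), made up.
    if percept == "hall_motion":
        like_yes, like_no = 0.9, 0.2
    else:
        like_yes, like_no = 0.1, 0.8
    unnorm_yes = like_yes * p_bathroom
    unnorm_no = like_no * (1 - p_bathroom)
    return unnorm_yes / (unnorm_yes + unnorm_no)

def expected_utility(action, p_bathroom):
    u = {("light_on", True): 10, ("light_on", False): -1,
         ("light_off", True): -5, ("light_off", False): 0}
    return p_bathroom * u[(action, True)] + (1 - p_bathroom) * u[(action, False)]

def agent_step(p_bathroom, percept, actions=("light_on", "light_off")):
    p = update_beliefs(p_bathroom, percept)             # update current-state belief
    return max(actions, key=lambda a: expected_utility(a, p)), p

print(agent_step(0.5, "hall_motion"))  # ('light_on', ≈ 0.82)
```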
Decision-Theoretic Agent: modeling sensors
Sensor Modeling: combining evidence from multiple sensors
Sensor Modeling: a detailed model of the lane-position sensor
Dynamic Belief Network (DBN): reasoning over time. The network gets big for lots of states, but really only two slices are needed at a time.
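A minimal sketch of the two-slice idea: roll the belief state forward one step with the transition model, then condition on the new slice’s evidence with the sensor model. The two-room state space and all probabilities are made up for illustration.

```python
# Two-slice DBN update: predict with the transition model, then weight by the
# sensor model and renormalize. All numbers are made-up illustrations.
STATES = ["bedroom", "bathroom"]
# P(X_{t+1} | X_t): outer key is the previous state, inner key the next state.
TRANSITION = {"bedroom": {"bedroom": 0.7, "bathroom": 0.3},
              "bathroom": {"bedroom": 0.2, "bathroom": 0.8}}
# P(evidence | X_{t+1}) for a bathroom motion sensor.
SENSOR = {"motion": {"bedroom": 0.1, "bathroom": 0.9},
          "no_motion": {"bedroom": 0.9, "bathroom": 0.1}}

def roll_up(belief, evidence):
    """One forward step: only the previous slice's belief is needed."""
    predicted = {
        s: sum(belief[prev] * TRANSITION[prev][s] for prev in STATES)
        for s in STATES
    }
    unnorm = {s: SENSOR[evidence][s] * predicted[s] for s in STATES}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

belief = {"bedroom": 0.9, "bathroom": 0.1}
belief = roll_up(belief, "motion")
print(belief)   # the bathroom becomes much more likely (≈ 0.83)
```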
Dynamic Belief Network (DBN)
DBN for Lane Positioning
Dynamic Decision Network (DDN)
DDN-based Agent. Capabilities: handles uncertainty, handles unexpected events (no fixed plan), handles noisy and failed sensors, acts to obtain relevant information. Needs: properties from first-order logic (DDNs are propositional), goal directedness.
Decision-Theoretic Agent Assessment: Complete? No. Correct? No. Efficient? Better. Natural? Yes. Rational? Yes.
Netica: a decision network simulator supporting chance nodes, decision nodes, and utility nodes; it learns probabilities from cases.
Bob Scenario in Netica
Issues in Decision Making: rational agent design (the dynamic decision-theoretic agent), knowledge engineering effort, efficiency vs. completeness, monolithic vs. distributed intelligence, degrees of autonomy.