4/22: Unexpected Hanging and other sadistic pleasures of teaching
- Today: Probabilistic Plan Recognition
- Tomorrow: Web Service Composition (BY 510; 11 AM)
- Thursday: Continual Planning for Printers (in-class)
- Tuesday 4/29: (Interactive) Review

Approaches to plan recognition
- Consistency-based
  - Hypothesize & revise
  - Closed-world reasoning
  - Version spaces
- Probabilistic
  - Stochastic grammars
  - Pending sets
  - Dynamic Bayes nets
  - Layered hidden Markov models
  - Policy recognition
  - Hierarchical hidden semi-Markov models
  - Dynamic probabilistic relational models
- Example application: Assisted Cognition
The two can be complementary: first pick the consistent plans, then check which of them is most likely (tricky if the agent can make errors).

Agenda (as actually realized in class)
- Plan recognition as probabilistic (max-weight) parsing
- The connection between dynamic Bayes nets and plan recognition, with a detour on the special inference tasks for DBNs
- Examples of plan recognition techniques based on setting up DBNs and doing MPE inference on them
- Discussion of the Decision-Theoretic Assistance paper

Stochastic grammars
- Huber, Durfee, & Wellman, "The Automated Mapping of Plans for Plan Recognition", 1994
- Darnell Moore and Irfan Essa, "Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar", AAAI-02
- CF grammar with probabilistic rules
- Chart parsing + Viterbi (a toy sketch follows)
- Successful for highly structured tasks (e.g. playing cards)
- Problems: errors, context
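
A toy sketch of the "chart parsing + Viterbi" idea: probabilistic CKY over a stochastic CFG in Chomsky normal form. The grammar, probabilities, and action names below are invented for illustration; they are not taken from the cited papers.

```python
import math
from collections import defaultdict

# Rules in CNF: binary rules rewrite a nonterminal to two nonterminals,
# unary rules rewrite to a terminal (an observed primitive action).
binary = {
    "MAKE_TEA": [(("BOIL", "STEEP"), 1.0)],
    "BOIL":     [(("FILL", "HEAT"), 1.0)],
}
unary = {
    "FILL":  [("fill_kettle", 1.0)],
    "HEAT":  [("kettle_on_stove", 1.0)],
    "STEEP": [("teabag_in_cup", 0.7), ("pour_water", 0.3)],
}

def viterbi_parse(actions, start="MAKE_TEA"):
    """Probabilistic CKY: best[i][j][X] = max log-prob that X yields actions[i:j]."""
    n = len(actions)
    best = defaultdict(lambda: defaultdict(dict))
    for i, a in enumerate(actions):                    # width-1 spans
        for lhs, rules in unary.items():
            for term, p in rules:
                if term == a:
                    best[i][i + 1][lhs] = math.log(p)
    for width in range(2, n + 1):                      # wider spans, bottom up
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                  # split point
                for lhs, rules in binary.items():
                    for (b, c), p in rules:
                        if b in best[i][k] and c in best[k][j]:
                            score = math.log(p) + best[i][k][b] + best[k][j][c]
                            if score > best[i][j].get(lhs, -math.inf):
                                best[i][j][lhs] = score
    return best[0][n].get(start)   # log-prob of the best parse, or None

print(viterbi_parse(["fill_kettle", "kettle_on_stove", "teabag_in_cup"]))
```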

Probabilistic State-dependent grammars

Connection with DBNs

Time and Change in Probabilistic Reasoning

Temporal (Sequential) Process
- A temporal process is the evolution of system state over time
- Often the system state is hidden, and we need to reconstruct the state from the observations
- Relation to planning: when you are observing a temporal process, you are observing the execution trace of someone else's plan

Dynamic Bayes networks are "templates" for specifying the relation between the values of a random variable across time slices, e.g. how is Rain at time t related to Rain at time t+1? We call them templates because they must be expanded (unfolded) to the required number of time steps before we can reason about the connection between variables at different time points; the sketch below unfolds a small example.
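
To make "unfolding" concrete, here is a minimal sketch that unrolls the classic two-slice Rain/Umbrella template and filters by brute-force summation over the unrolled network. The conditional probabilities are the standard textbook ones (Russell & Norvig); the code layout is ours, and the enumeration is exponential in T, so this is purely illustrative.

```python
from itertools import product

P_R0 = {True: 0.5, False: 0.5}     # prior P(Rain_0)
P_R = {True: 0.7, False: 0.3}      # P(Rain_t = T | Rain_{t-1})
P_U = {True: 0.9, False: 0.2}      # P(Umbrella_t = T | Rain_t)

def joint(rains, umbrellas):
    """Probability of one full assignment to the unrolled network."""
    p = P_R0[rains[0]]
    for t in range(1, len(rains)):
        prev = P_R[rains[t - 1]]
        p *= prev if rains[t] else 1 - prev
    for r, u in zip(rains[1:], umbrellas):   # evidence starts at t = 1
        pu = P_U[r]
        p *= pu if u else 1 - pu
    return p

def filter_by_unrolling(umbrellas):
    """P(Rain_T | u_1..u_T) by summing the joint over all hidden histories."""
    T = len(umbrellas)
    num = den = 0.0
    for rains in product([True, False], repeat=T + 1):   # Rain_0 .. Rain_T
        p = joint(rains, umbrellas)
        den += p
        if rains[-1]:
            num += p
    return num / den

print(filter_by_unrolling([True, True]))   # ~0.883, the textbook answer
```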

Normal likelihood weighting takes each sample through the network one by one. Idea 1: take all the samples from t to t+1 in lock-step; the set of samples is the distribution. Normal LW does poorly when the evidence is downstream (the sample weights become too small). In a DBN, none of the evidence affects the sampling of the state variables, so this is even more of an issue; the numerical sketch below illustrates the collapse.
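
A quick numerical illustration of this problem on the same Rain/Umbrella model (the experiment itself is ours): because evidence never influences which states get sampled, the average sample weight collapses as the network is unrolled further.

```python
import random

P_R0 = {True: 0.5, False: 0.5}
P_R = {True: 0.7, False: 0.3}      # P(Rain_t = T | Rain_{t-1})
P_U = {True: 0.9, False: 0.2}      # P(Umbrella_t = T | Rain_t)

def lw_sample(umbrellas):
    """One likelihood-weighted sample of the whole trajectory."""
    r = random.random() < P_R0[True]
    w = 1.0
    for u in umbrellas:
        r = random.random() < P_R[r]   # sample state: evidence plays no role
        pu = P_U[r]
        w *= pu if u else 1 - pu       # weight by evidence likelihood
    return w

random.seed(0)
for T in (5, 20, 80):
    ws = [lw_sample([True] * T) for _ in range(1000)]
    print(T, sum(ws) / len(ws))        # average weight shrinks geometrically
```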

Special cases of DBNs are well known in the literature
- Restrict the number of variables per slice:
  - Markov chain: a DBN with one variable that is fully observable
  - Hidden Markov model: a DBN with only one state variable, which is hidden and can be estimated through evidence variable(s)
- Restrict the type of CPD:
  - Kalman filter: a DBN where the system transition function as well as the observation model are linear Gaussian (a one-dimensional sketch follows this list)
  - The advantage of Gaussians is that the posterior distribution remains Gaussian
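
A one-dimensional sketch of the Kalman special case, assuming a random-walk transition and a direct noisy observation (the model and numbers are ours). Note how the belief stays Gaussian: each step just maps a (mean, variance) pair to a new (mean, variance) pair.

```python
def kalman_step(mu, var, z, q=1.0, r=2.0):
    """One time step of x_t = x_{t-1} + N(0, q), observation z_t = x_t + N(0, r)."""
    # Predict: push the Gaussian belief through the linear transition model.
    mu_pred, var_pred = mu, var + q
    # Update: condition on z; the posterior is again Gaussian.
    k = var_pred / (var_pred + r)          # Kalman gain
    mu_new = mu_pred + k * (z - mu_pred)
    var_new = (1 - k) * var_pred
    return mu_new, var_new

mu, var = 0.0, 10.0                        # broad initial belief
for z in [1.2, 0.9, 1.4, 1.1]:             # noisy observations of the state
    mu, var = kalman_step(mu, var, z)
    print(round(mu, 3), round(var, 3))     # variance settles, mean tracks z
```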

Plan Recognition Approaches based on setting up DBNs

Dynamic Bayes nets (I)
- E. Horvitz, J. Breese, D. Heckerman, D. Hovel, and K. Rommelse. The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, July 1998.
- Albrecht, Zukerman, Nicholson, and Bud. Towards a Bayesian Model for Keyhole Plan Recognition in Large Domains.
- Models the relationship between the user's recent actions and goals (help needs)
- Probabilistic goal persistence
- Programming in machine language?

Excel help (partial)

Dynamic Bayesian Nets
[Figure: a two-slice DBN (times k-1 and k) for transportation routines. Variables at each slice: goal g, trip segment t, transportation mode m, cognitive mode c {normal, error}, edge/velocity/position x, data (edge) association, GPS reading z.]
Learning and Inferring Transportation Routines. Lin Liao, Dieter Fox, and Henry Kautz. Nineteenth National Conference on Artificial Intelligence (AAAI-04), San Jose, CA, 2004.

Pending sets
- A New Model of Plan Recognition. Goldman, Geib, and Miller.
- Probabilistic Plan Recognition for Hostile Agents. Geib and Goldman.
- Explicitly models the agent's "plan agenda" using Poole's "probabilistic Horn abduction" rules:
  Happen(X, T+1) ← Pending(P, T), X ∈ P, Pick(X, P, T+1).
  Pending(P', T+1) ← Pending(P, T), Leaves(L), Progress(L, P, P', T+1).
- Handles multiple concurrent interleaved plans & negative evidence
- The number of different possible pending sets can grow exponentially
- Context problematic? Metric time?

Layered hidden Markov models
- N. Oliver, E. Horvitz, and A. Garg. Layered Representations for Recognizing Office Activity. Proceedings of the Fourth IEEE International Conference on Multimodal Interaction (ICMI 2002).
- Cascade of HMMs operating at different temporal granularities
- Inferential output at layer K is "evidence" for layer K+1

Policy recognition
- Bui, H. H., Venkatesh, S., and West, G. Tracking and Surveillance in Wide-Area Spatial Environments Using the Hidden Markov Model.
- Bui, H. H., Venkatesh, S., and West, G. (2000). On the Recognition of Abstract Markov Policies. Seventeenth National Conference on Artificial Intelligence (AAAI-2000), Austin, Texas.
- Model the agent using a hierarchy of abstract policies (e.g. abstract by spatial decomposition)
- Compute the conditional probability of the top-level policy given observations
- Compiled into a DBN

Hierarchical hidden semi-Markov models
- Combine hierarchy (function-call semantics) with metric time
- Compile to a DBN
- Time nodes represent a distribution over the time of the next state "switch"
- "Linear time" smoothing
- Research issues: parametric time nodes, varying granularity
- Hidden Semi-Markov Models (Segment Models). Kevin Murphy, November.
- HSSM: Theory into Practice. Deibel & Kautz, forthcoming.

Dynamic probabilistic relational models
- Friedman, N., L. Getoor, D. Koller, and A. Pfeffer. Learning Probabilistic Relational Models. IJCAI-99, Stockholm, Sweden (July 1999).
- Relational Markov Models and their Application to Adaptive Web Navigation. Anderson, Domingos, Weld.
- Dynamic Probabilistic Relational Models. Anderson, Domingos, Weld, forthcoming.
- PRM: reasons about classes of objects and relations
- A lattice of classes can capture plan abstraction
- DPRM: efficient approximate inference by Rao-Blackwellized particle filtering
- Open: approximate smoothing?

Assisted cognition
Computer systems that improve the independence and safety of people suffering from cognitive limitations by:
- Understanding human behavior from low-level sensory data
- Using commonsense knowledge
- Learning individual user models
- Actively offering prompts and other forms of help as needed
- Alerting human caregivers when necessary

Activity Compass
- Zero-configuration personal guidance system
- Learns a model of the user's travel on foot, by public transit, by bike, by car
- Predicts the user's next destination, offers proactive help if lost or late
- Integrates user data with external constraints (maps, bus schedules, calendars, ...)
- EM approach to clustering & segmenting data
- The Activity Compass. Don Patterson, Oren Etzioni, and Henry Kautz (2003).

Activity of daily living monitor & prompter
- Foundations of Assisted Cognition Systems. Kautz, Etzioni, Fox, Weld, and Shastri, 2003.

Recognizing unexpected events using online model selection
- User errors, abnormal behavior
- Select the model that maximizes the likelihood of the data (written out below):
  - Generic model
  - User-specific model
  - Corrupt (impaired) user model
- Neurologically plausible corruptions: repetition, substitution, stalling
- Fox, Kautz, & Shastri (forthcoming)
[Figure: example traces. Expected: "fill kettle, put kettle on stove". Corrupted: "fill kettle, put kettle on stove, put kettle in closet".]
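
The selection rule on this slide, written out; the notation (model set M, observed action sequence o_{1:t}) is ours.

```latex
\[
  m^{*} \;=\; \arg\max_{m \,\in\, \mathcal{M}} \; P(o_{1:t} \mid m),
  \qquad
  \mathcal{M} = \{\text{generic},\ \text{user-specific},\ \text{corrupt}\}
\]
```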

Decision-Theoretic Assistance
Don't just recognize! Jump in and help. This also allows us to talk about POMDPs.

Intelligent Assistants
- Many examples of AI techniques being applied to assistive technologies
- Intelligent desktop assistants
  - Calendar Apprentice (CAP) (Mitchell et al. 1994)
  - Travel Assistant (Ambite et al. 2002)
  - CALO project
  - TaskTracer
  - Electric Elves (Hans Chalupsky et al. 2001)
- Assistive technologies for the disabled
  - COACH system (Boger et al. 2005)

Not So Intelligent
- Most previous work uses problem-specific, hand-crafted solutions
- These lack the ability to offer assistance in ways not planned for by the designer
- Our goal: provide a general, formal framework for intelligent-assistant design
- Desirable properties:
  - Explicitly reason about models of the world and user to provide flexible assistance
  - Handle uncertainty about the world and user
  - Handle variable costs of user and assistive actions
- We describe a model-based decision-theoretic framework that captures these properties

An Episodic Interaction Model
[Figure: the user (action set U) and assistant (action set A) interleave actions, driving the world through states W1, W2, ... from the initial state until the goal is achieved.]
- Each user and assistant action has a cost
- Objective: minimize the expected cost of episodes (written out below)
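
The objective on this slide, written out (the notation is ours): an episode is an interleaved sequence of user actions u_i ∈ U and assistant actions a_j ∈ A that ends when the goal is achieved, and the assistant should minimize the expected total cost.

```latex
\[
  \min \; \mathbb{E}\!\left[\; \sum_{i} C(u_i) \;+\; \sum_{j} C(a_j) \;\right]
\]
```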

Example: Grid World Domain
- World states: (x, y) location and door status
- Possible goals: get wood, gold, or food
- User actions: up, down, left, right, noop; open a door in the current room (all actions have cost 1)
- Assistant actions: open a door, noop (all actions have cost 0)

World and User Models
[Figure: a two-slice influence diagram over the goal G, user action U_t, assistant action A_t, and world states W_t, W_{t+1}.]
- Model world dynamics as a Markov decision process (MDP)
- Model the user as a stochastic policy
- Goal distribution: P(G)
- Action distribution conditioned on goal and world state: P(U_t | G, W_t)
- Transition model: P(W_{t+1} | W_t, U_t, A_t)
- Given: the model and the action sequence so far. Output: an assistant action.

Optimal Solution: Assistant POMDP
[Figure: the same model as before, with goal distribution P(G), user policy P(U_t | G, W_t), and transition model P(W_{t+1} | W_t, U_t, A_t).]
- Can view the problem as a POMDP, called the assistant POMDP
  - Hidden state: the user's goal
  - Observations: user actions and world states
- The optimal policy gives a mapping from observation sequences to assistant actions
- Represents the optimal assistant
- Typically intractable to solve exactly

Approximate Solution Approach
[Figure: observations O_t of the user and environment feed a goal recognizer; its output P(G) feeds an action-selection module, which emits assistant actions A_t.]
Online action-selection cycle:
1) Estimate the posterior goal distribution given the observations
2) Select an action via myopic heuristics

Goal Estimation
Given:
- P(G | O_t): the goal posterior at time t, initially equal to the prior P(G)
- P(U_t | G, W_t): the stochastic user policy
- O_{t+1}: a new observation of the user action and world state
it is straightforward to update the goal posterior at time t+1 (written out below); the user policy itself must be learned.
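
The update is standard Bayes rule (the normalization notation is ours; the new observation O_{t+1} reveals the user action U_t taken in state W_t):

```latex
\[
  P(G \mid O_{t+1}) \;=\;
  \frac{P(U_t \mid G, W_t)\; P(G \mid O_t)}
       {\sum_{G'} P(U_t \mid G', W_t)\; P(G' \mid O_t)}
\]
```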

Learning User Policy
- Use Bayesian updates to refine the user policy P(U | G, W) after each episode
- Problem: this can converge slowly, leading to poor goal estimation
- Solution: use a strong prior on the user policy, derived via planning
  - Assume the user behaves "nearly rationally"
  - Take the prior distribution on P(U | G, W) to be biased toward optimal user actions
  - Let Q(U, W, G) be the value of the user taking action U in state W given goal G (computable via MDP planning)
  - Use the prior P(U | G, W) ∝ exp(Q(U, W, G)), i.e. a softmax over action values (a small sketch follows)
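
A minimal sketch of the softmax ("nearly rational") prior over user actions, assuming the Q-values from MDP planning are given; the function name, temperature parameter, and numbers are ours.

```python
import math

def softmax_policy_prior(q_values, temperature=1.0):
    """P(U | G, W) proportional to exp(Q(U, W, G) / temperature)."""
    exps = {u: math.exp(q / temperature) for u, q in q_values.items()}
    z = sum(exps.values())
    return {u: e / z for u, e in exps.items()}

# Example: in some state, going Left is best for the (hypothetical) goal,
# so it gets most but not all of the prior probability mass.
print(softmax_policy_prior({"Left": -1.0, "Right": -3.0, "noop": -2.5}))
```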

Oregon State University Q(U,W,G) for Grid World

Action Selection: Assistant POMDP
[Figure: the assistant MDP obtained by fixing the user goal G and the user policy.]
- Assume we know the user goal G and the user policy
- We can then create a corresponding assistant MDP over assistant actions
- We can compute Q(A, W, G), the value of taking assistive action A when the user's goal is G
- Select the action that maximizes the expected (myopic) value (written out below)
- If you just want to recognize, you only need P(G | O_t); if you just want to help (and know the goal), you just need Q(A, W, G)
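
The myopic rule, written out (the expectation over the goal posterior is implied by the slide; the exact notation is ours):

```latex
\[
  A^{*}_t \;=\; \arg\max_{A} \; \sum_{G} P(G \mid O_t)\; Q(A, W_t, G)
\]
```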

Oregon State University Experiments: Grid World Domain

Oregon State University Experiments: Kitchen Domain

Experimental Results
- Experiment: 12 human subjects, two domains
- Subjects were asked to achieve a sequence of goals
- Compared the average cost of performing tasks with the assistant to the optimal cost without the assistant
- The assistant reduced cost by over 50%

Summary of Assumptions
- Model assumptions:
  - The world can be approximately modeled as an MDP
  - User and assistant interleave actions (no parallel activity)
  - The user can be modeled as a stationary, stochastic policy
  - Finite set of known goals
- Assumptions made by the solution approach:
  - Access to a practical algorithm for solving the world MDP
  - The user does not reason about the existence of the assistant
  - The goal set is relatively small and known to the assistant
  - The user is close to "rational"

While DBNs are special cases of Bayes nets, there are certain inference tasks that are particularly frequently needed for them: filtering, prediction, smoothing, and finding the most probable explanation (MPE). (Notice that all of them involve estimating posterior probability distributions, as in any Bayes net inference.)

Can do much better if we exploit the repetitive structure
Both exact and approximate BN inference methods can be made to take the temporal structure into account:
- Specialized variable-elimination method: unfold the (t+1)-th level, and roll up the t-th level by variable elimination
- Specialized likelihood-weighting methods that take evidence into account
- Particle filtering techniques (a minimal sketch follows)
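
A minimal particle filter sketch on the same Rain/Umbrella DBN (the standard bootstrap filter: propagate, weight by evidence, resample; the model numbers are as in the earlier sketches).

```python
import random

P_R0 = {True: 0.5, False: 0.5}
P_R = {True: 0.7, False: 0.3}       # P(Rain_t = T | Rain_{t-1})
P_U = {True: 0.9, False: 0.2}       # P(Umbrella_t = T | Rain_t)

def particle_filter(umbrellas, n=10000):
    particles = [random.random() < P_R0[True] for _ in range(n)]
    for u in umbrellas:
        # Propagate every particle one step through the transition model...
        particles = [random.random() < P_R[r] for r in particles]
        # ...weight each by the likelihood of the observed evidence...
        weights = [P_U[r] if u else 1 - P_U[r] for r in particles]
        # ...and resample in proportion to weight.  This is the step plain
        # likelihood weighting lacks: evidence now shapes the sample set.
        particles = random.choices(particles, weights=weights, k=n)
    return sum(particles) / n        # fraction of "rain" particles

random.seed(0)
print(particle_filter([True, True]))   # ~0.88, matching exact filtering
```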

Class ended here; slides beyond this point were not discussed.

Belief States
- If we have k state variables, there are 2^k states
- A "belief state" is a probability distribution over states
  - Non-deterministic view: we just know the set of states with non-zero probability, giving 2^(2^k) possible belief states
  - Stochastic view: we know the probability distribution over the states, giving an infinite number of possible belief states
  - A complete state is a special case of a belief state where the distribution is a Dirac delta, i.e. non-zero for only one state
- Blocks-world example: suppose we have blocks A and B, which can be "clear", "on-table", or "on" each other
  - A state: A is on the table, B is on the table, both are clear, the hand is empty
  - A belief state: A is either on B or on the table, B is on the table, the hand is empty (2 states in the belief state; see the small sketch below)
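
The two views of a belief state as plain data; the blocks-world encoding (a state as a frozen set of facts) is our own illustrative choice.

```python
S1 = frozenset({"A on table", "B on table", "A clear", "B clear", "hand empty"})
S2 = frozenset({"A on B", "B on table", "A clear", "hand empty"})

nondet_belief = {S1, S2}                 # just the states with non-zero probability
stochastic_belief = {S1: 0.5, S2: 0.5}   # a full distribution over states
complete_state = {S1: 1.0}               # the "Dirac delta" special case
```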

Actions and Belief States
Running example belief state: A is either on B or on the table; B is on the table; the hand is empty.
Two types of actions:
- Standard actions: modify the distribution over states
  - Doing the "C on A" action in this belief state gives a new belief state (C on A on B, or C on A with B clear)
  - Doing the "Shake-the-Table" action converts the previous belief state to: A on the table, B on the table, A clear, B clear
  - Notice that actions can reduce the uncertainty!
- Sensing actions: observe some aspect of the belief state
  - The observations modify the belief state distribution
  - In the belief state above, if we observe that two blocks are clear, the belief state changes to {A on table, B on table, both clear}
  - If the observation is noisy (i.e. we are not completely certain), the probability distribution shifts so that more probability mass is centered on the {A on table, B on table} state (a small Bayes-update sketch follows)
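
A minimal sketch of the noisy sensing update on the stochastic belief state above; the sensor model and its numbers are ours, for illustration.

```python
belief = {"A on B": 0.5, "A on table": 0.5}

# Noisy sensor: probability of reporting "two blocks clear" in each state.
# Only the "A on table" state actually has both blocks clear.
P_obs = {"A on B": 0.1, "A on table": 0.9}

def sense(belief, p_obs):
    """Bayes update: P(s | obs) proportional to P(obs | s) * P(s)."""
    new = {s: p_obs[s] * p for s, p in belief.items()}
    z = sum(new.values())
    return {s: p / z for s, p in new.items()}

print(sense(belief, P_obs))   # mass shifts toward "A on table": {0.1, 0.9}
```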
