Probabilistic Plan Recognition Kathryn Blackmond Laskey Department of Systems Engineering and Operations Research George Mason University Dagstuhl Seminar April 2011

1 "The problem of plan recognition is to take as input a sequence of actions performed by an actor and to infer the goal pursued by the actor and also to organize the action sequence in terms of a plan structure." (Schmidt, Sridharan and Goodson, 1978) "…the problem of plan recognition is largely a problem of inference under conditions of uncertainty." (Charniak and Goldman, 1993)

2 PPR in a Nutshell Represent the set of possible plans and the anticipated evidence for each plan Specify prior probabilities for plans and the likelihood of the evidence given each plan Infer plans using Bayes Rule …or just directly specify P(plan|obs) Bayes, Thomas. An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53:370–418, 1763. Thomas Bayes (1702–1761)
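
As a worked statement of the slide's recipe, the posterior over plans follows directly from Bayes Rule (a minimal formulation; the symbols are generic placeholders rather than notation from the slides):

```latex
P(\text{plan} \mid \text{obs})
  \;=\; \frac{P(\text{obs} \mid \text{plan})\, P(\text{plan})}
             {\sum_{\text{plan}'} P(\text{obs} \mid \text{plan}')\, P(\text{plan}')}
  \;\propto\; P(\text{obs} \mid \text{plan})\, P(\text{plan})
```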

3 Why Probability? Theoretically well-founded representation for relative plausibility of competing explanations Unified approach to inference and learning Combine engineered and learned knowledge Many general-purpose exact and approximate algorithms with strong theoretical justification and practical success Good results on many interesting problems But… –Inference and learning (exact and approximate) are NP-hard –Balancing tractability and expressiveness is a major research and engineering challenge

4 Representing Plans and Observations Plan recognition requires a computational representation of possible plans and observable evidence –Goals –Actions When executed in combination, actions are expected (with high probability) to achieve the goal –Preconditions / postconditions of actions –Constraints Most notably, temporal ordering –Observables Actions may or may not be directly observable Sometimes we observe effects of actions – Hierarchical decomposition of the above For probabilistic plan recognition, we need to assign probabilities to these elements –Balance expressivity against tractability of inference & learning

5 Some Representations for PPR Bayesian networks Hidden Markov Models / Dynamic Bayesian Networks Plan Recognition Bayesian Networks / Probabilistic Relational Models / Multi-Entity Bayesian Networks Bayesian Abductive Logic Programs Stochastic Grammars Conditional Random Fields Markov Logic Networks Each of these formalisms can be thought of as a way of representing a set of “possible worlds” and defining a probability measure on an algebra of subsets

6 Graphical Probability Models Factorize the joint distribution into factors involving only a few variables each –Graph represents conditional independence assumptions –Local distributions specify probability information for small groups of related variables –Factors are combined into the joint distribution Drastically simplifies specification, inference and learning. Example: 20 possible goals, 100 possible actions –Fully general model → 2.5×10³¹ probabilities –"Naïve Bayes model" → 19×20×100 = 38,000 probabilities –If each goal has only 10 associated actions, then the "naïve Bayes model" → 19×10 = 190 probabilities –Naïve Bayes inference scales as #variables × #states/variable [Figure: naïve Bayes model]
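
To make the counting concrete, here is a minimal naïve Bayes plan/goal recognizer in Python. This is a sketch under assumed toy numbers: the goal names, action names, and all probabilities are invented for illustration, not taken from the slides.

```python
# Hypothetical priors over goals and per-goal action likelihoods.
priors = {"shopping": 0.7, "robbery": 0.3}
likelihoods = {
    "shopping": {"go_to_store": 0.9, "carry_bag": 0.8, "carry_gun": 0.01},
    "robbery":  {"go_to_store": 0.6, "carry_bag": 0.3, "carry_gun": 0.5},
}

def goal_posterior(observed_actions):
    """Naive Bayes: actions are conditionally independent given the goal."""
    scores = {}
    for goal, prior in priors.items():
        p = prior
        for a in observed_actions:
            p *= likelihoods[goal].get(a, 1e-6)  # tiny default for unseen actions
        scores[goal] = p
    z = sum(scores.values())                     # normalize via Bayes Rule
    return {g: s / z for g, s in scores.items()}

print(goal_posterior(["go_to_store", "carry_gun"]))
```

The number of parameters grows with the number of goal-action pairs rather than with the full joint state space, which is the point the slide's arithmetic is making.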

7 Bayesian Network (BN) Directed graph represents dependencies Joint distribution factors as Pr(R,E,I,W,T,B,S) = Pr(R)Pr(E)Pr(I|R)Pr(W|R)Pr(T|E,I)Pr(B|W)Pr(S|W), so 127 probabilities → 14 probabilities Factored representation makes specification, inference and learning tractable for interesting classes of problems Directed graph naturally represents causality –Effects of intervention via "do" operator –Explaining away
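
A short sketch of exact inference by enumeration over that factorization. The meanings of the variables and every CPT value below are hypothetical placeholders, since the slide's network figure is not reproduced here; only the factorization structure comes from the slide.

```python
import itertools

def bern(p):  # P(var=True) = p, P(var=False) = 1 - p
    return {True: p, False: 1 - p}

def joint(r, e, i, w, t, b, s):
    """Pr(R,E,I,W,T,B,S) = Pr(R)Pr(E)Pr(I|R)Pr(W|R)Pr(T|E,I)Pr(B|W)Pr(S|W)."""
    pI = bern(0.7 if r else 0.1)[i]          # Pr(I|R), placeholder numbers
    pW = bern(0.6 if r else 0.2)[w]          # Pr(W|R)
    pT = bern(0.9 if (e and i) else 0.3)[t]  # Pr(T|E,I)
    pB = bern(0.8 if w else 0.1)[b]          # Pr(B|W)
    pS = bern(0.5 if w else 0.05)[s]         # Pr(S|W)
    return bern(0.2)[r] * bern(0.1)[e] * pI * pW * pT * pB * pS

def query(target_index, evidence):
    """P(target=True | evidence); evidence = {index: value} over (R,E,I,W,T,B,S)."""
    num = den = 0.0
    for world in itertools.product([True, False], repeat=7):
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(*world)
        den += p
        if world[target_index]:
            num += p
    return num / den

# Example: posterior on variable R (index 0) given T (index 4) observed True.
print(query(0, {4: True}))
```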

8 Possible and Probable Worlds "Traditional or deductive logic admits only three attitudes to any proposition: definite proof, disproof, or blank ignorance." (Jeffreys) Semantics of classical logic is based on possible worlds –Set of possible worlds defined by language, domain, and axioms –In propositional logic, possible worlds assign truth values to atoms (e.g., R → T; W → T; E → F) Probability theory –Set of possible worlds is called the sample space –Probability measure maps subsets to real numbers –Probability axioms are a natural extension of classical propositional logic to likelihood A BN combines propositional logic with probability

9 Other Factored Representations Markov network: factorization specified by undirected graph –More natural for domains without natural causal direction –Joint distribution factorizes as P(x) = (1/Z) ∏_C φ_C(x_1C, …, x_k_C C), where C indexes cliques in the graph, x_iC is the i-th variable in clique C, k_C is the size of clique C, and Z is a normalization constant Chain graph: factorization specified by graph with both directed and undirected edges Representations to exploit context-specific independence –Probability trees –Tree-structured parameterization for local distributions in a BN

10 Conditional Random Fields Bayesian networks are generative models –Represent joint probability over plans and observations –Realistic dependence models often yield intractable inference Conditional (or discriminative) model directly represents probability of plans given observations –Can allow some dependencies to be relaxed CRFs are discriminative –Undirected graph represents local dependencies –Potential function represents strength of dependence A CRF is a family of MRFs (a mapping from observations to potentials)
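
The general form of a conditional random field, in the spirit of Lafferty, McCallum and Pereira (2001); this is a sketch of the standard definition rather than the slide's own equation:

```latex
p(\mathbf{y} \mid \mathbf{x}) \;=\; \frac{1}{Z(\mathbf{x})}\,
  \prod_{C} \psi_C(\mathbf{y}_C, \mathbf{x}),
\qquad
Z(\mathbf{x}) \;=\; \sum_{\mathbf{y}'} \prod_{C} \psi_C(\mathbf{y}'_C, \mathbf{x})
```

The partition function Z depends on the observations x, which is what makes the model discriminative: only the conditional distribution over plan labels y is represented.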

11 Inference in Graphical Models Exact inference –E.g., Belief propagation, junction tree, bucket elimination, symbolic probabilistic inference, cutset conditioning –Exploit graph structure / factorization to simplify computation –Infeasible for complex problems Approximate (deterministic) –E.g., Loopy BP, variational Bayes Approximate (stochastic) –E.g., Gibbs sampling, Metropolis-Hastings sampling, likelihood weighting Combinations –E.g., Bidyuk and Dechter (2007) – cutset sampling

12 Belief Propagation for Singly Connected BNs Goal: compute the probability distribution of random variable B given evidence (assume B itself is not known) Key idea: the impact of belief in B from evidence "above" B and evidence "below" B can be processed separately Justification: B d-separates the "above" random variables from the "below" random variables [Figure: example network with evidence random variables "above" and "below" B] The figure depicts the updating process for one node; the algorithm simultaneously updates beliefs for all the nodes. Loopy BP applies BP to a network with loops and often results in a good approximation

13 Likelihood Weighting (for BNs)
1. Proceed through non-evidence variables in an order consistent with the partial ordering induced by the graph –Sample each variable according to its local probability distribution –Calculate a weight proportional to Pr(evidence | sampled values)
2. Repeat Step 1 until done
3. Estimate Pr(Variable=value) by weighted sample frequency
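
A compact Python sketch of likelihood weighting on a hypothetical three-node chain A → B → C (the structure and all probabilities are invented for illustration):

```python
import random

# Hypothetical chain BN: A -> B -> C, all variables Boolean.
pA = 0.3                                   # Pr(A=True)
pB_given_A = {True: 0.8, False: 0.2}       # Pr(B=True | A)
pC_given_B = {True: 0.9, False: 0.1}       # Pr(C=True | B)

def likelihood_weighting(n_samples=100_000, evidence_c=True):
    """Estimate Pr(A=True | C=evidence_c) by likelihood weighting."""
    num = den = 0.0
    for _ in range(n_samples):
        # Sample non-evidence variables in topological order.
        a = random.random() < pA
        b = random.random() < pB_given_A[a]
        # Weight by the likelihood of the evidence given the sampled values.
        p_c_true = pC_given_B[b]
        w = p_c_true if evidence_c else 1.0 - p_c_true
        den += w
        if a:
            num += w
    return num / den

print(likelihood_weighting())
```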

14 Junction Tree Algorithm
1. Compile the BN into a junction tree –Tree of clusters of nodes –Has the JT property: a variable belonging to 2 clusters must belong to all clusters along the path connecting them –Becomes part of the knowledge representation –Changes only if the graph changes
2. Use a local message-passing algorithm to propagate beliefs in the junction tree
3. A query on any node, or any set of nodes in the same cluster, can be computed from the cluster joint distribution
[Figure: example junction tree of clusters]

15 Gibbs Sampling
1. Initialize –Evidence variables assigned to observed values –Arbitrary value for other variables
2. Sample non-evidence nodes one at a time: –Sample with probability Pr(variable | Markov blanket) –Replace with newly sampled value
3. Repeat Step 2 until done
4. Estimate Pr(Variable=value) by sample frequency
Markov blanket –In a BN: parents, children, co-parents –In an MN: neighbors A variable is conditionally independent of the rest of the network given its Markov blanket
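
A Gibbs sampling sketch on the same hypothetical chain A → B → C with evidence C = True; the Markov-blanket conditionals are computed in closed form here because the network is tiny, and all numbers are illustrative placeholders.

```python
import random

pA = 0.3
pB_given_A = {True: 0.8, False: 0.2}
pC_given_B = {True: 0.9, False: 0.1}

def p_a_given_blanket(b):
    """Pr(A=True | B=b), proportional to Pr(A) Pr(B=b | A)."""
    t = pA * (pB_given_A[True] if b else 1 - pB_given_A[True])
    f = (1 - pA) * (pB_given_A[False] if b else 1 - pB_given_A[False])
    return t / (t + f)

def p_b_given_blanket(a, c):
    """Pr(B=True | A=a, C=c), proportional to Pr(B|A) Pr(C|B)."""
    t = pB_given_A[a] * (pC_given_B[True] if c else 1 - pC_given_B[True])
    f = (1 - pB_given_A[a]) * (pC_given_B[False] if c else 1 - pC_given_B[False])
    return t / (t + f)

def gibbs(n_iter=100_000, burn_in=1_000, evidence_c=True):
    a, b = True, True                 # arbitrary initialization of non-evidence nodes
    count_a = kept = 0
    for i in range(n_iter):
        a = random.random() < p_a_given_blanket(b)
        b = random.random() < p_b_given_blanket(a, evidence_c)
        if i >= burn_in:
            count_a += a
            kept += 1
    return count_a / kept

print(gibbs())   # estimate of Pr(A=True | C=True)
```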

16 Cutset Sampling (for BNs) Find a loop cutset Initialize cutset variables Do until done –Propagate beliefs on non-cutset variables –Do a Gibbs iteration on the cutset Estimate P(Variable=value) by averaging probability over samples This is a kind of "Rao-Blackwellization" –Reduce the variance of a Monte Carlo estimator by replacing a sampling step with an exact computation with the same expected value

17 Variational Inference Method for approximating posterior distribution of unobserved variables given observed variables Approximation finds distribution in family with simpler functional form (e.g., remove some arcs in graph) by minimizing a measure of distance from true posterior Estimation via “variational EM” –Alternate between “expectation” and “maximization” steps –Converges to local minimum of distance function –Yields lower bound for marginal likelihood Often faster but less accurate than MC
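
A standard statement of the variational objective (the general form, not the slide's own notation): choose the approximating distribution q from a tractable family Q by minimizing KL divergence to the true posterior, which is equivalent to maximizing a lower bound on the marginal likelihood.

```latex
q^{*} \;=\; \arg\min_{q \in \mathcal{Q}} \, \mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x)\right),
\qquad
\log p(x) \;\ge\; \mathbb{E}_{q}\!\left[\log p(x, z)\right] \;-\; \mathbb{E}_{q}\!\left[\log q(z)\right]
```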

18 Extending Expressive Power of BNs Propositional logic + probability is insufficiently expressive for the requirements of plan recognition (Charniak and Goldman, 1993) –Repeated structure –Multiple interrelated entities (e.g., plans, actors, actions) –Type hierarchy and inheritance –Unbounded number of potentially relevant variables Some formalisms with greater expressive power: –PBN (Plan Recognition BN) –PRM (Probabilistic Relational Models) –OOBN (Object-Oriented Bayesian Networks) –MEBN (Multi-Entity BN) –Plates –BALP (Bayesian Abductive Logic Programs)

19 Example: Maritime Domain Awareness Entities, attributes and relations

20 MDA Probabilistic Ontology Built in UnBBayes-MEBN

21 MDA SSBN Screenshot of situation-specific BN in UnBBayes-MEBN (open-source tool for building & reasoning with PR-OWL ontologies)

22 Protégé Plugin for UnBBayes

23 Drag-and-Drop Mapping

24 Markov Logic Networks First-order knowledge base with a weight attached to each formula or clause KB + individual constants → ground Markov network containing a node for each ground atom and a feature for each grounding of a formula in the KB Compact language for specifying large Markov networks
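
The probability an MLN assigns to a possible world, in the form given by Richardson and Domingos (2006):

```latex
P(X = x) \;=\; \frac{1}{Z}\,\exp\!\Big(\sum_{i} w_i\, n_i(x)\Big)
```

Here n_i(x) is the number of true groundings of formula i in world x, w_i is its weight, and Z normalizes over all possible worlds.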

25 MLN Example (Richardson and Domingos, 2006)

26 CRFs for Chat Recognition (Hsu, Lian and Jih, 2011) The subscript i indexes pairs of individuals: Y_i^t represents the chatting activity of pair i, and X_i^t represents observed acoustic features Dependence structure: –Within-pair temporal dependence –Between-pair concurrent dependence Can be represented as an MLN

27 Possible and Probable FO Worlds In first-order logic, a possible world (aka "structure") assigns: –Each constant symbol to a domain element (e.g., go3 → obj23) –Each n-ary function symbol to a function on n-tuples of domain elements (e.g., (go-stp pln1) → obj23) –Each n-ary relation symbol to a set of n-tuples of domain elements (e.g., inst → {(obj23, go-), (obj78, liquor-store), (obj78, store), …}) A first-order probabilistic logic assigns a probability measure to first-order structures –This is called "measure model" semantics (Gaifman, 1964)

28 FOL + Probability: Issues Probability zero ≠ unsatisfiable –E.g., every possible value of a continuous distribution has probability zero FOL is undecidable; FOL + probability is not even semi-decidable –Example: IID sequence of coin tosses, 0 < P(H) < 1 Given any finite sequence of prior tosses, both H and T are possible We cannot disprove any non-extreme probability distribution from a finite sequence of tosses Wrong solution: "We will prevent you from expressing this query because we cannot tractably compute the answer." Better solution: "Represent the problem you really want to solve, and then figure out a way to approximate the answer." –Think carefully about what the real problem is!

29 Knowledge Based Model Construction A KBMC system contains: –A base representation that represents goals, plans, actions, actors, observables, constraints, etc. –A model construction procedure that maps a context and/or query into a target model At problem-solving time –Construct a problem-specific Bayesian network –Process queries on the constructed model using a general-purpose BN algorithm Advantages of an expressive representation –Understandability –Maintainability –Knowledge reuse –Exploit repeated structure (representation, inference, learning) –Construct only as much of the model as is needed for the query
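
A minimal sketch of the construction step: ground only the fragment of the network needed to answer a query. The template format, variable syntax ("?p", "?t"), and helper names are hypothetical illustrations, not the conventions of any particular KBMC system.

```python
KB = {
    # template name -> (parent templates, placeholder for local distribution)
    "Goal(?p)":       ([], "prior over goals"),
    "Action(?p, ?t)": (["Goal(?p)"], "Pr(action | goal)"),
    "Obs(?p, ?t)":    (["Action(?p, ?t)"], "Pr(observation | action)"),
}

def ground(template, bindings):
    """Instantiate a template with concrete entities, e.g. Obs(anne, 3)."""
    name = template
    for var, val in bindings.items():
        name = name.replace(var, str(val))
    return name

def construct_model(query_template, bindings, network=None):
    """Recursively add the query node and its ancestors to a problem-specific BN."""
    network = {} if network is None else network
    node = ground(query_template, bindings)
    if node in network:
        return network
    parents, dist = KB[query_template]
    network[node] = ([ground(p, bindings) for p in parents], dist)
    for parent_template in parents:
        construct_model(parent_template, bindings, network)
    return network

# Build only the part of the model needed for a query about one observation.
print(construct_model("Obs(?p, ?t)", {"?p": "anne", "?t": 3}))
```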

30 Hypothesis Management Constructed BN rapidly becomes intractable, especially in presence of existence and association uncertainty What do we really need to represent? Heuristics help to avoid constructing (or prune) very unlikely hypotheses (or variables with very weak impact on conclusions) –E.g., from only “John went to the airport” do not nominate hypothesis that John intends to set off a bomb –But a security system needs to be on the alert for prospective bombers!

31 Lifted Inference Constructed BN (propositionalized theory) typically contains repeated structure Applying standard BN inference often results in many repetitions of the identical computation Lifted inference algorithms detect such repetitions –“Lift” problem from ground to first-order level –Perform computation only once Very active area of research (Braz, et al., 2005)

32 Learning = Inference … in theory, at least [Figure: plate model (i = 1,…,N; j = 1,…,M) for parameter learning of a local distribution]

33 Representing Temporal Evolution Plans evolve in time HMM / DBN / PDBN replicate variables describing a temporally evolving situation –Hidden Markov Model (HMM): unobservable evolving state + observable indicator –Dynamic Bayesian Network (DBN): factored representation of state / observable –Partially Dynamic Bayesian Network (PDBN): some variables not time-dependent

34 DBN Inference Any BN inference algorithm can be applied to a finite-horizon DBN Special-case inference algorithms exploit DBN structure –"Rollup" algorithm marginalizes out past hidden states given past observations to explicitly represent only a sliding window –Viterbi algorithm finds the most probable values of hidden states given observations –Forward-backward algorithm estimates marginal distributions for hidden states given observations Exact inference is generally intractable –Factored frontier algorithm approximates marginalization of past hidden state for intractable DBNs –Particle filter is a temporal variant of likelihood weighting with resampling Beware of static nodes!
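
A minimal forward-filtering ("rollup") recursion for an HMM, the simplest special case of the DBN algorithms above. The state names, transition matrix, and observation probabilities are invented placeholders.

```python
# Forward filtering for a two-state HMM: keep only the current belief state,
# marginalizing out past hidden states at every step.
states = ["shopping", "robbery"]
prior = {"shopping": 0.7, "robbery": 0.3}
trans = {"shopping": {"shopping": 0.9, "robbery": 0.1},
         "robbery":  {"shopping": 0.2, "robbery": 0.8}}
obs_model = {"shopping": {"enter_store": 0.6, "draw_weapon": 0.01},
             "robbery":  {"enter_store": 0.5, "draw_weapon": 0.4}}

def filter_step(belief, observation):
    """One predict-then-update step; returns the new belief over hidden states."""
    predicted = {s: sum(belief[p] * trans[p][s] for p in states) for s in states}
    unnorm = {s: predicted[s] * obs_model[s][observation] for s in states}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

belief = dict(prior)
for obs in ["enter_store", "enter_store", "draw_weapon"]:
    belief = filter_step(belief, obs)
    print(obs, belief)
```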

35 Resampling Particle Filter Maintains a sample of weighted particles –Each particle is a single realization of all non-evidence nodes –A particle is weighted by the likelihood of the observation given the particle –Particles are resampled with probability proportional to weight [Figure: initialization → likelihood weighting → resampling → evolution → likelihood weighting → …; from van der Merwe et al. (undated)]
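
A bootstrap particle filter sketch for a one-dimensional state; the motion and observation models are invented placeholders, and the structure mirrors likelihood weighting followed by resampling.

```python
import math
import random

def transition(x):
    """Hypothetical process model: random walk with process noise."""
    return x + random.gauss(0.0, 0.5)

def obs_likelihood(y, x):
    """Hypothetical observation model: y = x + Gaussian noise (sd = 1)."""
    return math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2 * math.pi)

def particle_filter(observations, n_particles=1000):
    particles = [random.gauss(0.0, 1.0) for _ in range(n_particles)]       # initialization
    for y in observations:
        particles = [transition(x) for x in particles]                      # evolution
        weights = [obs_likelihood(y, x) for x in particles]                 # likelihood weighting
        particles = random.choices(particles, weights=weights, k=n_particles)  # resampling
        yield sum(particles) / n_particles                                  # filtered mean

for t, estimate in enumerate(particle_filter([0.2, 0.5, 1.1, 1.0])):
    print(t, round(estimate, 3))
```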

36 Particle Impoverishment Particles with large weights are sampled more often, leading to low particle diversity This effect is counteracted by the "spreading" effects of process noise Impoverishment is very serious when: –Observations are extremely unlikely –Low "process noise" leads to long dwell times in widely separated basins of attraction "In fact, for the case of very small process noise, all particles will collapse to a single point within a few iterations." "If the process noise is zero, then using a particle filter is not entirely appropriate." (Arulampalam et al., 2002)

37 Particle Filter with Static Nodes PF cannot recover from impoverishment of a static node Some approaches: –Estimate a separate PF for each combination of static node values Only if the static node has a small state space –Regularized PF: artificial evolution of the static node Ad hoc; no justification for the amount of perturbation; information loss over time –Shrinkage (Liu & West) Combines ideas from artificial evolution & kernel smoothing Perturbation "shrinks" the static node for each particle toward the weighted sample mean –Perturbation holds the variance of the set of particles constant –Correlation in disturbances compensates for information loss –Resample-Move (Gilks & Berzuini) Metropolis-Hastings step corrects for particle impoverishment MH sampling of the static node involves the entire trajectory but is performed less frequently as runs become longer
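
The Liu and West shrinkage kernel as commonly stated (a sketch in generic notation, not the slides' own): each particle's static parameter is replaced by a draw from a kernel centered on a shrunk mean,

```latex
\theta^{(i)} \;\sim\; N\!\left(a\,\theta^{(i)} + (1-a)\,\bar{\theta},\;\; h^{2} V\right),
\qquad a^{2} + h^{2} = 1,
```

where θ̄ and V are the weighted mean and variance of the current particle set; shrinking toward θ̄ offsets the variance added by the perturbation, so the overall spread of the particles is held constant.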

38 Stochastic Grammars Motivation: find representation that is sufficiently expressive for plan recognition but more tractable than general DBN inference A stochastic grammar is a set of stochastic production rules for generating sequences of actions (terminal symbols in the grammar) Modularity of production rules yields factored joint distribution
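
A toy stochastic grammar sampler in Python; the plan structure, production rules, and rule probabilities are invented for illustration. The probability of a derivation is the product of the probabilities of the rules used, which is the factored joint distribution the slide refers to.

```python
import random

# Hypothetical stochastic grammar: nonterminals expand into sequences of
# symbols; lowercase terminals are observable actions.
rules = {
    "PLAN": [(0.6, ["SHOP"]), (0.4, ["ROB"])],
    "SHOP": [(1.0, ["goto_store", "PAY", "leave"])],
    "PAY":  [(0.7, ["pay_cash"]), (0.3, ["pay_card"])],
    "ROB":  [(1.0, ["goto_store", "draw_weapon", "leave"])],
}

def sample(symbol):
    """Recursively expand a symbol into a sequence of terminal actions."""
    if symbol not in rules:                  # terminal: an observable action
        return [symbol]
    probs, expansions = zip(*rules[symbol])
    chosen = random.choices(expansions, weights=probs, k=1)[0]
    return [action for s in chosen for action in sample(s)]

print(sample("PLAN"))    # e.g. ['goto_store', 'pay_cash', 'leave']
```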

39 Stochastic Grammar - Example Taken from Geib and Goldman (2009) Plans are represented as and/or tree with temporal constraints

40 Stochastic Grammar - Inference Parsing algorithms can be applied to compute restricted class of queries –If plans can be represented in a given formalism then that formalism’s inference algorithms can be applied to process queries –We are often interested in a broader class of queries than traditional parsing algorithms can handle (e.g., we usually have not observed all actions) Parse tree can be converted to DBN –Enables answering a broader class of queries –Can exploit structure of grammar to improve tractability of inference Special-purpose algorithms exploit grammar structure

41 Where Do We Stand? Contributions of probabilistic methods –Useful way of thinking about problems –Unified approach to reasoning, parameter learning, structure learning –Principled combination of KE with learning –Can learn from small, moderate and large samples –Many general-purpose exact and approximate algorithms with strong theoretical justification and practical success –Good results (better than previous state of the art) on many interesting problems Many challenging problems remain –Exact learning and inference are intractable –High-dimensional multi-modal distributions are just plain ugly All inference algorithms break down on the toughest cases –Asymptotics doesn’t mean much when the long run is millions of years! –With good engineering backed by solid theory, we will continue to make progress

42 Bibliography (1 of 2)
Arulampalam, M., Maskell, S., Gordon, N. and Clapp, T. A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking. IEEE Transactions on Signal Processing, 50, pp. 174–188, 2002.
Bidyuk, B. and Dechter, R. Cutset Sampling for Bayesian Networks. Journal of Artificial Intelligence Research, 28, pp. 1–48, 2007.
Braz, R., Amir, E. and Roth, D. Lifted First-Order Probabilistic Inference. Proceedings of the International Joint Conference on Artificial Intelligence, 2005.
Bui, H., Venkatesh, S. and West, G. Policy Recognition in the Abstract Hidden Markov Model. Journal of Artificial Intelligence Research, 17.
Charniak, E. and Goldman, R. A Bayesian Model of Plan Recognition. Artificial Intelligence, 64, pp. 53–79, 1993.
Charniak, E. and Goldman, R. A Probabilistic Model of Plan Recognition. Proceedings of the Ninth Conference on Artificial Intelligence.
Darwiche, A. Modeling and Reasoning with Bayesian Networks. Cambridge University Press, 2009.
Gaifman, H. Concerning Measures in First-Order Calculi. Israel Journal of Mathematics, 2, pp. 1–18, 1964.
Geib, C.W. and Goldman, R.P. A Probabilistic Plan Recognition Algorithm Based on Plan Tree Grammars. Artificial Intelligence, 173, pp. 1101–1132, 2009.
Gilks, W.R. and Berzuini, C. Following a Moving Target—Monte Carlo Inference for Dynamic Bayesian Models. Journal of the Royal Statistical Society B, 63, pp. 127–146.
Hsu, J., Lian, C. and Jih, W. Probabilistic Models for Concurrent Chatting Activity Recognition. ACM Transactions on Intelligent Systems and Technology, 2(1), 2011.
Jensen, F. Bayesian Networks and Decision Graphs (2nd edition). Springer.
Korb, K. and Nicholson, A. Bayesian Artificial Intelligence. Chapman and Hall.
Koller, D. and Friedman, N. Probabilistic Graphical Models. MIT Press, 2009.
Lafferty, J., McCallum, A. and Pereira, F. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. Proceedings of the 18th International Conference on Machine Learning, 2001.
Laskey, K.B. MEBN: A Language for First-Order Bayesian Knowledge Bases. Artificial Intelligence, 172(2-3), 2008.

43 Bibliography (2 of 2)
Liao, L., Patterson, D.J., Fox, D. and Kautz, H. Learning and Inferring Transportation Routines. Artificial Intelligence.
Liao, L., Fox, D. and Kautz, H. Hierarchical Conditional Random Fields for GPS-based Activity Recognition. In Springer Tracts in Advanced Robotics. Springer.
Liu, J. and West, M. Combined Parameter and State Estimation in Simulation-Based Filtering. In Sequential Monte Carlo Methods in Practice, A. Doucet, J.F.G. de Freitas, and N.J. Gordon, Eds. New York: Springer-Verlag.
Musso, C., Oudjane, N. and LeGland, F. Improving Regularised Particle Filters. In Sequential Monte Carlo Methods in Practice, A. Doucet, J.F.G. de Freitas, and N.J. Gordon, Eds. New York: Springer-Verlag.
Neapolitan, R. Learning Bayesian Networks. Prentice Hall.
Pearl, J. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.
Pynadath, D.V. and Wellman, M.P. Probabilistic State-Dependent Grammars for Plan Recognition. Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence.
Pynadath, D.V. and Wellman, M.P. Generalized Queries on Probabilistic Context-Free Grammars. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1), pp. 65–77.
Richardson, M. and Domingos, P. Markov Logic Networks. Machine Learning, 62, 2006.
Schmidt, C., Sridharan, N. and Goodson, J. The Plan Recognition Problem: An Intersection of Psychology and Artificial Intelligence. Artificial Intelligence, 11, pp. 45–83, 1978.
van der Merwe, R., Doucet, A., de Freitas, N. and Wan, E. The Unscented Particle Filter. Advances in Neural Information Processing Systems.
van der Merwe, R., Doucet, A., de Freitas, N. and Wan, E. (undated) The Unscented Particle Filter.
Wellman, M.P., Breese, J.S. and Goldman, R.P. From Knowledge Bases to Decision Models. The Knowledge Engineering Review, 7(1), pp. 35–53, 1992.