Representing Systems with Hidden State
Dorna Kashef Haghighi, Chris Hundt*, Prakash Panangaden, Joelle Pineau, and Doina Precup
School of Computer Science, McGill University


Representing Systems with Hidden State
Dorna Kashef Haghighi, Chris Hundt*, Prakash Panangaden, Joelle Pineau, and Doina Precup
School of Computer Science, McGill University (*now at UC Berkeley)
AAAI Fall Symposium Series, November 9, 2007

FSS 2007: Representing Systems with Hidden State

How should we represent systems with hidden state?
Partially Observable Markov Decision Processes (POMDPs):
–The system is in some "true" latent state.
–We perceive observations that depend probabilistically on the state.
–A very expressive model, good for state inference and planning, but:
–Very hard to learn from data.
–The hidden state may be artificial (e.g. dialogue management).
Predictive representations (e.g. PSRs, OOMs, TD-nets, diversity):
–State is defined as a sufficient statistic of the past that allows predicting the future.
–Good for learning, because state depends only on observable quantities.
Our goal: understand and unify different predictive representations.

Partially Observable Markov Decision Processes
–A set of states, S
–A set of actions, A
–A set of observations, O
–A transition function, T: S × A → Dist(S)
–An observation emission function, Ω: S × A → Dist(O)
For this discussion, we omit rewards (they may be considered part of the observation vector).
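The tuple above can be written down directly as a data structure. A minimal sketch in Python; the two-state instance and its numbers are purely illustrative, not the grid example used in these slides:

```python
from dataclasses import dataclass

# Minimal POMDP container matching the tuple on this slide
# (rewards omitted, as in the talk).
@dataclass
class POMDP:
    states: list        # S
    actions: list       # A
    observations: list  # O
    T: dict             # T[(s, a)] = {s_next: prob}, the transition function
    Omega: dict         # Omega[(s, a)] = {o: prob}, the emission function

# A tiny made-up two-state instance for illustration.
pomdp = POMDP(
    states=["s1", "s2"],
    actions=["go"],
    observations=["Red", "Blue"],
    T={("s1", "go"): {"s2": 1.0}, ("s2", "go"): {"s1": 1.0}},
    Omega={("s1", "go"): {"Blue": 1.0},
           ("s2", "go"): {"Red": 0.5, "Blue": 0.5}},
)
print(sum(pomdp.Omega[("s2", "go")].values()))  # each emission row sums to 1.0
```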

A simple example
Consider the following domain:
–S = {s1, s2, s3, s4}, A = {N, S, E, W}
–For simplicity, assume the transitions are deterministic.
–In each square, the agent observes the color of one of the adjacent walls, O = {Red, Blue}, with equal probability.
Question: What kinds of predictions can we make about the system?

A simple example: Future predictions
Consider the following predictions:
–If I am in state s1 and go North, I will certainly see Blue.
–If I go West then North, I will certainly see Blue.
–If I go East, I will see Red with probability 0.5.
–If I go East then North, I will see Red twice with probability 0.25.
The action sequences are experiments that we can perform on the system. For each experiment, we can verify the predicted observations from data.

Tests and Experiments
A test is a sequence of actions followed by an observation:
  t = a1 ... an o,  n ≥ 1
An experiment is a non-empty sequence of tests:
  e = t1 ... tm,  m ≥ 1
–Note that special cases of experiments are s-tests (Littman et al., 2002) and e-tests (Rudary & Singh, 2004).
A prediction for an experiment e starting in s ∈ S, denoted ⟨s | e⟩, is the conditional probability that by doing the actions of e, we will get the predicted observations.
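The prediction ⟨s | e⟩ is easy to compute when transitions are deterministic, as in the running example: follow the actions of each test, then multiply in the probability of its predicted observation. A sketch; the two-state model and its wall colours below are hypothetical, since the slides' actual grid layout is not reproduced here:

```python
# Hypothetical deterministic model: T maps (state, action) to the next
# state; OBS gives per-state observation probabilities.
T = {("s1", "N"): "s1", ("s1", "E"): "s2", ("s1", "W"): "s1",
     ("s2", "N"): "s2", ("s2", "E"): "s2", ("s2", "W"): "s1"}
OBS = {"s1": {"Blue": 1.0}, "s2": {"Red": 0.5, "Blue": 0.5}}

def predict(state, experiment):
    """Return <state | experiment>; an experiment is a list of tests,
    each test a pair (action string, predicted observation)."""
    prob = 1.0
    for actions, obs in experiment:
        for a in actions:
            state = T[(state, a)]            # follow the actions
        prob *= OBS[state].get(obs, 0.0)     # probability of the observation
    return prob

print(predict("s1", [("E", "Red")]))         # 0.5
print(predict("s1", [("N", "Blue")]))        # 1.0
```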

A simple example: Looking at predictions
Consider our predictions again:
–If I am in state s1 and go North, I will certainly see Blue: ⟨s1 | NB⟩ = 1
–If I go West then North, I will certainly see Blue: ⟨s | WNB⟩ = 1, ∀s ∈ S
Note that for any sequence of actions preceding the West action, the above prediction would still be the same.

Equivalence relations
Two experiments are equivalent if their predictions are the same for every state:
  e1 ~ e2 ⟺ ⟨s | e1⟩ = ⟨s | e2⟩, ∀s
Note: if two experiments always give the same results, they are redundant, and only one is necessary.
Two states are equivalent if they cannot be distinguished by any experiment:
  s1 ~ s2 ⟺ ⟨s1 | e⟩ = ⟨s2 | e⟩, ∀e
Note: equivalent states produce the same probability distribution over future trajectories, so they are redundant.
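Both relations reduce to comparing prediction values. A sketch on the same hypothetical two-state model used earlier (restated so the snippet stands alone); note that state equivalence in general quantifies over all experiments, so the code checks only a given finite sample:

```python
# Same hypothetical deterministic model as before.
T = {("s1", "N"): "s1", ("s1", "E"): "s2", ("s1", "W"): "s1",
     ("s2", "N"): "s2", ("s2", "E"): "s2", ("s2", "W"): "s1"}
OBS = {"s1": {"Blue": 1.0}, "s2": {"Red": 0.5, "Blue": 0.5}}
STATES = ["s1", "s2"]

def predict(state, experiment):
    prob = 1.0
    for actions, obs in experiment:
        for a in actions:
            state = T[(state, a)]
        prob *= OBS[state].get(obs, 0.0)
    return prob

def equivalent_experiments(e1, e2):
    # e1 ~ e2  iff  <s|e1> == <s|e2> for every state s
    return all(predict(s, e1) == predict(s, e2) for s in STATES)

def equivalent_states(s1, s2, experiments):
    # In general every experiment must agree; we check a finite sample.
    return all(predict(s1, e) == predict(s2, e) for e in experiments)

# In this model W always leads to s1, so prepending actions before W
# changes nothing -- the two experiments fall in one equivalence class:
print(equivalent_experiments([("WN", "Blue")], [("EWN", "Blue")]))  # True
```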

A simple example: Equivalent predictions
Consider the following experiment: NRNR
–This is equivalent to: SRSR, NRSR, NNRSSSR, ...
–This is an infinite equivalence class, which we denote by a chosen exemplar, e.g. [NRNR].
–The predictions for this class: ⟨s1 | [NRNR]⟩ = 0, ⟨s2 | [NRNR]⟩ = 0.25

Dual perspectives
Forward view: given a certain state, what predictions can we make about the future?
–In classical AI, this view enables forward planning.
–It is centered around the notion of state.
Backward view: suppose we want a certain experiment to succeed; in what state should the system initially be?
–This view enables backward planning.
–It is centered around the experiments.

A simple example: Dual perspectives
Forward view:
Q: If we know that the system is in s1, what predictions can we make about the future?

A simple example: Dual perspectives
Backward view:
Q: Suppose we want the experiment NR to succeed; in what state should the system be?
A: If the system starts in either state s2 or s4, the test will succeed with probability 0.5.
We can associate with the experiment NR a vector of predictions of how likely it is to succeed from every state: [0 0.5 0 0.5]^T

The dual machine
The backward view can be implemented in a dual machine:
–States of the dual machine are equivalence classes of experiments, [e].
–Observations of the dual machine are states from the original machine.
–The emission function represents the prediction probability ⟨s | [e]⟩, ∀s ∈ S.
–The transition function is deterministic: [e] →a [ae] (taking action a leads from class [e] to class [ae]).
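One way to build dual states concretely is to enumerate tests up to a length bound and group them by their prediction vector over states; each group is one equivalence class [e], and the vector itself is the dual emission function. A sketch on the same hypothetical two-state model used earlier:

```python
from itertools import product

# Same hypothetical deterministic model as before.
T = {("s1", "N"): "s1", ("s1", "E"): "s2", ("s1", "W"): "s1",
     ("s2", "N"): "s2", ("s2", "E"): "s2", ("s2", "W"): "s1"}
OBS = {"s1": {"Blue": 1.0}, "s2": {"Red": 0.5, "Blue": 0.5}}
STATES = ["s1", "s2"]
ACTIONS = ["N", "E", "W"]
OBSERVATIONS = ["Red", "Blue"]

def predict(state, actions, obs):
    for a in actions:
        state = T[(state, a)]
    return OBS[state].get(obs, 0.0)

def dual_fragment(max_len=3):
    """One-observation fragment of the dual machine: map each distinct
    prediction vector (the dual emission) to an exemplar test."""
    classes = {}
    for n in range(1, max_len + 1):
        for seq in product(ACTIONS, repeat=n):
            for o in OBSERVATIONS:
                vec = tuple(predict(s, seq, o) for s in STATES)
                classes.setdefault(vec, "".join(seq) + o)
    return classes

for vec, exemplar in sorted(dual_fragment().items()):
    print(f"[{exemplar}]  sigma = {dict(zip(STATES, vec))}")
```

For this toy model the enumeration collapses to five classes, including an "impossible" class with emission 0 everywhere.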

A simple example: A fragment of the dual machine
[Figure: the original four-state grid machine, side by side with a fragment of its dual; the dual states [NR], [NB], [WR], [ER], [WB] have action-labelled transitions and emission values σ(s) ranging over 0, 0.5, and 1.]
This fragment of the dual machine captures experiments with one observation. E.g. [NR] →W [WR], because ⟨s | WNR⟩ = ⟨s | WR⟩, ∀s. There are separate fragments for experiments with 2 observations, 3 observations, etc.

Notes on the dual machine
The dual provides, for each experiment, the set of states from which the experiment succeeds.
–Note that the emission function is not normalized.
–Given an initial state distribution, we can get proper probabilities Pr(s | [e]).
Experiments with different numbers of observations usually end up in disconnected components.
Arcs represent temporal-difference relations, similar to those in TD-nets (Sutton & Tanner, 2005).
–This is consistent with previous observations (Rudary & Singh, 2004) that e-tests yield TD-relationships and s-tests do not.

Can we do this again?
In the dual, we get a proper machine, with states, actions, transitions, and emissions. Can we think about experiments on the dual machine?
–Repeat the previous transformations on the dual machine.
–Consider classes of equivalent experiments.
–Reverse the roles of experiments and states.
What do we obtain?

The double dual machine
States of the double dual machine are bundles of predictions for all possible experiments: equivalence classes of states, [s], and of states reached after an action sequence ω, [sω].
–Equivalence classes of the type [sω] can be viewed as homing sequences (Even-Dar et al., 2005).
The double dual assigns the same probability to any experiment as the original machine, so they are equivalent machines.
The double dual is always a deterministic system! (But it can be much larger than the original machine.)

A simple example: The double dual machine
[Figure: the original grid machine, the dual fragment from the previous slide, and the double dual: a two-state deterministic machine with states S1 and S2, where W leads to S1, E leads to S2, and N, S are self-loops; each state emits a bundle of predictions, e.g. for S1: σ(NR) = 0, σ(NB) = 1, σ(ER) = 0.5, σ(WB) = 1, σ(WR) = 0, and for S2: σ(NR) = 0.5, σ(NB) = 0.5, σ(ER) = 0.5, σ(WB) = 1, σ(WR) = 0.]
Equivalent states are eliminated. There are two simple homing sequences: action W forces the system into s1, and action E forces the system into s2.

Conjecture: Different representations are useful for different tasks
Learn the double dual:
–Advantage: it is deterministic.
–Problem: in general, the double dual is an infinite representation. (In our example, it is compact due to the deterministic transitions in the original.)
–Focus on accurately predicting only the results of some experiments.
Plan with the dual:
–For a given experiment, the dual tells us its probability of success from every state.
–Given an initial state distribution: search over experiments to find one with high prediction probability with respect to the goal criteria.
–Start with dual fragments with short experiments, then move to longer ones.

A simple learning algorithm
Consider the following non-deterministic automaton:
–A set of states, S
–A set of actions, A
–A set of observations, O
–A joint transition-emission relation, Δ ⊆ S × A × O × S
Can we learn this automaton (or an equivalent one) directly from data?

Merge-split algorithm
Define:
–Histories: h = {a1, o1, a2, o2, ..., am, om}
–The empty history: ε
Construct a "history" automaton, H. Algorithm:
–Start with one state, corresponding to the empty history: H = {ε}
–Consider all possible next states, h' = hao.
–The merge operation checks for an equivalent existing state: h' ~ h'' ⟺ h'↑ = h''↑, where h↑ is the set of all possible future trajectories. If one is found, we set the transition function accordingly: δ(h, ao) = h''.
–Otherwise, the split operation is applied: H = H ∪ {h'}, δ(h, ao) = h'.
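The loop above can be sketched as a breadth-first construction. The non-deterministic automaton below is hypothetical (not the flip automaton of the next slide), and as a finite stand-in for comparing the full future-trajectory sets h↑, two histories are merged when they leave the same set of possible current states:

```python
from collections import deque

# Hypothetical transition-emission relation Delta ⊆ S x A x O x S,
# stored as (state, action) -> list of (observation, next_state).
DELTA = {("p", "a"): [("0", "p"), ("1", "q")],
         ("p", "b"): [("0", "p")],
         ("q", "a"): [("1", "q")],
         ("q", "b"): [("0", "p")]}
ACTIONS, OBSERVATIONS = ("a", "b"), ("0", "1")

def merge_split(initial_states):
    start = frozenset(initial_states)   # state for the empty history
    H = {start}                         # states of the history automaton
    delta = {}                          # its deterministic transition fn
    queue = deque([start])
    while queue:
        h = queue.popleft()
        for act in ACTIONS:
            for obs in OBSERVATIONS:
                nxt = frozenset(s2 for s in h
                                for (o, s2) in DELTA.get((s, act), [])
                                if o == obs)
                if not nxt:
                    continue            # (act, obs) impossible after h
                delta[(h, act, obs)] = nxt   # merge: reuse nxt if known
                if nxt not in H:             # split: genuinely new state
                    H.add(nxt)
                    queue.append(nxt)
    return H, delta

H, delta = merge_split({"p", "q"})
print(len(H))  # 3
```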

Example
[Figure: the flip automaton (Holmes & Isbell '06), side by side with the learned automaton.]

Comments
Merge-split constructs a deterministic history automaton.
There is a finite number of equivalence classes of histories.
–Worst case: the size is exponential in the number of states of the original machine.
The automaton is well defined (i.e., it makes the same predictions as the original model), and it is the minimal such automaton.
Extending this to probabilistic machines is somewhat messy... but we are working on it.

Final discussion
It is interesting to consider the same dynamical system from different perspectives:
–There is a notion of duality between state and experiment.
–Such a notion of duality is not new, e.g. observability vs. controllability in systems theory.
There is a large body of existing work on learning automata, which I did not comment on [Rivest & Schapire '94; James & Singh '05; Holmes & Isbell '06; ...].
Many interesting questions remain:
–Can we develop a sound approximation theory for our duality?
–Can we extend this to continuous systems?
–Can we extend the learning algorithm to probabilistic systems?