1
CS498-EA Reasoning in AI, Lecture #23. Instructor: Eyal Amir. Fall Semester 2011
2
Last Time
- Time and uncertainty
- Inference: filtering, prediction, smoothing
- Hidden Markov Models (HMMs)
  – Model
  – Exact reasoning
- Dynamic Bayesian Networks
  – Model
  – Exact reasoning
3
Inference Tasks
- Filtering: belief state, i.e., the probability of the current state given the evidence so far
- Prediction: like filtering, but without the latest evidence (projecting forward)
- Smoothing: a better estimate of past states, using later evidence
- Most likely explanation: the state sequence that best explains the evidence
4
Filtering (forward algorithm)
- Predict, then Update (recursive step)
[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with evidence nodes E_{t-1}, E_t, E_{t+1}]
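A standard form of the predict and update steps, written out in the notation above:

  Predict:  P(X_{t+1} | e_{1:t}) = \sum_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})
  Update:   P(X_{t+1} | e_{1:t+1}) \propto P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})

The recursion keeps only the current belief state, so each step has the same cost regardless of how long the sequence is.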
5
Smoothing: the forward-backward algorithm
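A standard statement of the forward-backward decomposition, in the same notation:

  P(X_k | e_{1:t}) \propto P(X_k | e_{1:k}) \, P(e_{k+1:t} | X_k),   for 1 \le k \le t,

where the first factor is the forward (filtering) message and the second is the backward message, computed by a recursion running from t down to k.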
6
Most Likely Explanation
- Finding the most likely state path, known as the Viterbi algorithm
[Figure: HMM chain X_{t-1} → X_t → X_{t+1} with evidence nodes E_{t-1}, E_t, E_{t+1}]
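A standard form of the Viterbi recursion over this chain: replace the sum over predecessor states in the filtering recursion with a maximum,

  m_{t+1}(X_{t+1}) = P(e_{t+1} | X_{t+1}) \max_{x_t} P(X_{t+1} | x_t) \, m_t(x_t),

and recover the most likely path by backtracking the maximizing x_t at each step.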
7
Today
- Dynamic Bayesian Networks
  – Exact inference
  – Approximate inference
8
Dynamic Bayesian Network
- A DBN is specified as a two-time-slice BN (2-TBN), using the first-order Markov assumption
[Figure: a standard BN replicated over Time 0 and Time 1]
9
Dynamic Bayesian Network
Basic idea:
- Copy the state and evidence variables for each time step
- X_t: set of unobservable (hidden) variables (e.g., Pos, Vel)
- E_t: set of observable (evidence) variables (e.g., Sens.A, Sens.B)
Notice: time is discrete
10
Example
11
Inference in DBNs
- Unroll: replicate the network over all time steps and run inference in the resulting (static) BN
- Not efficient: the cost grows with the length of the sequence
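A minimal sketch of unrolling (the function name and the edge encoding are hypothetical, not from the lecture): it makes explicit that the unrolled network has one copy of every state and evidence variable per time step, so exact inference on it grows with the sequence length.

def unroll(state_vars, evidence_vars, intra_edges, inter_edges, T):
    """Return node and edge lists of the BN obtained by unrolling T slices of a 2-TBN."""
    nodes, edges = [], []
    for t in range(T):
        nodes += [f"{v}_{t}" for v in state_vars + evidence_vars]
        # edges within slice t (e.g. Pos_t -> SensA_t)
        edges += [(f"{u}_{t}", f"{v}_{t}") for (u, v) in intra_edges]
        # edges from slice t-1 into slice t (e.g. Pos_{t-1} -> Pos_t)
        if t > 0:
            edges += [(f"{u}_{t-1}", f"{v}_{t}") for (u, v) in inter_edges]
    return nodes, edges

# Example with the Pos/Vel state and Sens.A/Sens.B evidence variables from the slide.
nodes, edges = unroll(["Pos", "Vel"], ["SensA", "SensB"],
                      intra_edges=[("Pos", "SensA"), ("Vel", "SensB")],
                      inter_edges=[("Pos", "Pos"), ("Vel", "Pos"), ("Vel", "Vel")],
                      T=5)
# len(nodes) and len(edges) grow linearly with T, and so does the unrolled inference problem.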
12
Exact Inference in DBNs
Variable elimination:
- Add slice t+1, then sum out slice t using variable elimination
- No conditional independence remains after a few steps
[Figure: variables x_1 ... x_4 unrolled over slices 0 to 3]
13
Exact Inference in DBNs
Variable elimination:
- Add slice t+1, then sum out slice t using variable elimination
[Figure: a chain of slices, each containing states s1 ... s5, unrolled over four time steps]
14
Variable Elimination
[Figure: s1 has been summed out; s2, s3, s4, s5 remain in each slice]
15
Variable Elimination
[Figure: s2 has also been summed out; s3, s4, s5 remain]
16
Variable Elimination
[Figure: s3 has also been summed out; only s4 and s5 remain]
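A minimal sketch of the "add slice t+1, sum out slice t" step, assuming a single discrete state variable per slice and hypothetical names (eliminate_slice, trans, obs_likelihood are illustrative, not from the lecture); with several variables per slice the same idea applies, but the factor left after summing out couples all of them, which is what the figures above illustrate.

import numpy as np

def eliminate_slice(belief, trans, obs_likelihood):
    """One elimination step: multiply in the factors of slice t+1, then sum out slice t.

    belief:         P(X_t | e_{1:t}), a vector over the states of slice t
    trans:          trans[i, j] = P(X_{t+1} = j | X_t = i)
    obs_likelihood: P(e_{t+1} | X_{t+1} = j), a vector over the states of slice t+1
    """
    predicted = belief @ trans              # sum out the slice-t variable
    updated = predicted * obs_likelihood    # multiply in the new evidence factor
    return updated / updated.sum()          # normalize

# Tiny usage example with made-up numbers.
belief = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.2, 0.8]])
obs = np.array([0.7, 0.3])
belief = eliminate_slice(belief, trans, obs)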
17
DBN Representation: DelC
[Figure: 2-TBN with state variables T, L, CR, RHC, RHM, M at time t and time t+1]

f_CR(L_t, CR_t, RHC_t, CR_{t+1}):

  L  CR  RHC | CR_{t+1}=T  CR_{t+1}=F
  O  T   T   |    0.2         0.8
  E  T   T   |    1.0         0.0
  O  F   T   |    0.0         1.0
  E  F   T   |    0.0         1.0
  O  T   F   |    1.0         0.1
  E  T   F   |    1.0         0.0
  O  F   F   |    0.0         1.0
  E  F   F   |    0.0         1.0

f_T(T_t, T_{t+1}):

  T | T_{t+1}=T  T_{t+1}=F
  T |   0.91        0.09
  F |   0.0         1.0

f_RHM(RHM_t, RHM_{t+1}):

  RHM | RHM_{t+1}=T  RHM_{t+1}=F
  T   |    1.0          0.0
  F   |    0.0          1.0
18
Benefits of DBN Representation

  Pr(Rm_{t+1}, M_{t+1}, T_{t+1}, L_{t+1}, C_{t+1}, Rc_{t+1} | Rm_t, M_t, T_t, L_t, C_t, Rc_t)
    = f_Rm(Rm_t, Rm_{t+1}) * f_M(M_t, M_{t+1}) * f_T(T_t, T_{t+1})
      * f_L(L_t, L_{t+1}) * f_Cr(L_t, Cr_t, Rc_t, Cr_{t+1}) * f_Rc(Rc_t, Rc_{t+1})

- Only a few parameters, vs. 25,440 for an explicit transition matrix
- Removes the global exponential dependence
[Figure: the same 2-TBN, alongside a flat 160 x 160 transition matrix over states s_1 ... s_160]
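A quick check of the 25,440 figure, assuming (as the matrix above suggests) 160 joint states: an explicit transition matrix has 160 × 160 = 25,600 entries, of which 160 × (160 − 1) = 25,440 are free parameters because each row must sum to 1. The factored 2-TBN needs only the rows of the six local factors f_Rm, f_M, f_T, f_L, f_Cr, f_Rc, a few dozen numbers in total.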
19
DBN Myth
- A Bayesian network is a decomposed structure for representing the full joint distribution
- Does that imply an equally easy decomposition for the belief state? No!
20
Tractable, approximate representation
- Exact inference in DBNs is intractable
- Need an approximation
  – Maintain an approximate belief state
  – E.g., assume Gaussian processes
- Boyen-Koller approximation: factored belief state
21
Idea
- Use a decomposable representation for the belief state (pre-assume some independence)
22
Problem
- What about the approximation errors? They might accumulate and grow without bound…
23
Contraction property
Main properties of the B-K approximation:
- Under reasonable assumptions about the stochasticity of the process, every state transition contracts the distance between the exact and the approximate distribution by a constant factor
- Since approximation errors from previous steps decrease exponentially, the overall error remains bounded indefinitely
24
Basic framework
Definition 1:
- Prior belief state
- Posterior belief state
- Monitoring task
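In [BK98]-style notation (S^{(t)} for the state and r^{(t)} for the observation at time t, assumed here), these read:

  Prior belief state at time t:     P(S^{(t)} | r^{(1)}, ..., r^{(t-1)})   (before the observation at t)
  Posterior belief state at time t: P(S^{(t)} | r^{(1)}, ..., r^{(t)})     (after the observation at t)

The monitoring task is to maintain the posterior belief state incrementally as each new observation arrives.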
25
Simple contraction
- Distance measure: relative entropy (KL divergence) between the actual and the approximate belief state
- Contraction due to the observation O
- Contraction due to the transition T (can we do better?)
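Written out under the same assumptions (φ the exact and ψ the approximate belief state over states s):

  D(φ || ψ) = \sum_s φ[s] \ln( φ[s] / ψ[s] )

Conditioning both distributions on the same observation does not increase this distance in expectation, and pushing both through the stochastic transition cannot increase it either (a data-processing argument); the mixing-rate result on the next slide strengthens the transition step into a strict contraction.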
26
Simple contraction (cont.)
- Definition: minimal mixing rate
- Theorem 3 (the single-process contraction theorem): for a process Q, anterior distributions φ and ψ, and ulterior distributions φ' and ψ', the bound sketched below holds
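A standard statement of both, following [BK98]: the minimal mixing rate of a stochastic process Q is

  γ_Q = min_{i_1, i_2} \sum_j min( Q(i_1 → j), Q(i_2 → j) ),

i.e., the probability mass that any two starting states are guaranteed to share after one transition, and the single-process contraction theorem then gives

  D(φ' || ψ') ≤ (1 − γ_Q) D(φ || ψ).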
27
Simple contraction (cont.): proof intuition
28
Compound processes
- The mixing rate can be very small for large processes
- The trick is to assume some independence among subprocesses and to factor the DBN along these subprocesses
- Fully independent subprocesses (Theorem 5 of [BK98]): for L independent subprocesses T_1, ..., T_L, let γ_l be the mixing rate of T_l and let γ = min_l γ_l. Let φ and ψ be distributions over S_1^{(t)}, ..., S_L^{(t)}, and assume that ψ renders the S_l^{(t)} marginally independent. Then the contraction carries over to the compound process with rate at least γ
29
Compound processes (cont.)
- Conditionally independent subprocesses (Theorem 6 of [BK98]): for L subprocesses T_1, ..., T_L, assume each process depends on at most r others and influences at most q others. Let γ_l be the mixing rate of T_l and let γ = min_l γ_l. Let φ and ψ be distributions over S_1^{(t)}, ..., S_L^{(t)}, and assume that ψ renders the S_l^{(t)} marginally independent. Then a weaker contraction bound still holds, with a rate that degrades with q and r
30
Efficient, approximate monitoring
- If each approximation step incurs an error bounded by ε, then the total error remains bounded: old errors are contracted away at least as fast as new ones are introduced (see the sketch below)
- Conditioning on observations may introduce momentary errors, but the expected error still contracts
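The geometric-series argument behind this, under the contraction assumption above (contraction factor 1 − γ per step, projection error at most ε per step):

  total error ≤ ε + (1 − γ) ε + (1 − γ)^2 ε + ... ≤ ε / γ,

so the accumulated error never exceeds ε / γ, no matter how long the process runs.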
31
Approximate DBN monitoring
Algorithm (based on standard clique tree inference):
1. Construct a clique tree from the 2-TBN.
2. Initialize the clique tree with the conditional probabilities from the CPTs of the DBN.
3. For each time step:
   a. Create a working copy Y of the tree. Create σ^{(t+1)}.
   b. For each subprocess l, incorporate the marginal σ^{(t)}[X_l^{(t)}] into the appropriate factor in Y.
   c. Incorporate the evidence r^{(t+1)} in Y.
   d. Calibrate the potentials in Y.
   e. For each l, query Y for the marginal over X_l^{(t+1)} and store it in σ^{(t+1)}.
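A runnable toy sketch of the loop above, assuming two binary subprocesses and made-up numbers (all names here are illustrative, not from [BK98]); the clique-tree machinery is replaced by explicitly forming the joint distribution, which is only feasible for a toy example but makes the exact-step / projection-step split visible.

import numpy as np

# Toy Boyen-Koller monitoring with two binary subprocesses (illustrative numbers).
rng = np.random.default_rng(0)
trans = rng.random((2, 2, 2, 2))                  # trans[x1, x2, x1', x2'] = P(x1', x2' | x1, x2)
trans /= trans.sum(axis=(2, 3), keepdims=True)    # each conditional distribution sums to 1
obs1 = np.array([0.8, 0.3])                       # P(e1 | X1' = 0), P(e1 | X1' = 1)
obs2 = np.array([0.4, 0.9])                       # P(e2 | X2' = 0), P(e2 | X2' = 1)

# Factored belief state sigma: one marginal per subprocess.
sigma = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]

for step in range(3):
    # Exact step: recombine the factored belief, push it through the joint transition,
    # and condition on the new evidence (the same evidence is reused to keep the example short).
    joint = np.einsum("i,j,ijkl->kl", sigma[0], sigma[1], trans)
    joint *= obs1[:, None] * obs2[None, :]
    joint /= joint.sum()
    # Approximation / projection step: keep only the per-subprocess marginals.
    sigma = [joint.sum(axis=1), joint.sum(axis=0)]

print(sigma)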
32
Solution: the BK algorithm
- With mixing and a bounded projection error, the total error stays bounded
[Figure: alternating exact step and approximation/marginalization step, breaking the belief state into smaller clusters]
33
Boyen-Koller Approximation
- An example of variational inference with DBNs
- Compute the posterior for time t from the (factored) state estimate at time t-1
  – Assume the posterior has a factored form
- The error is bounded