Iterative Join Graph Propagation

Presentation transcript:

Iterative Join Graph Propagation
Rina Dechter, Kalev Kask, Robert Mateescu
UAI 2002, University of California - Irvine
I am going to talk about Iterative Join Graph Propagation, which is joint work by Rina Dechter, Kalev Kask and myself.

What is IJGP?
IJGP is an approximate algorithm for belief updating in Bayesian networks
IJGP is a version of join-tree clustering that is both anytime and iterative
IJGP applies message passing along a join-graph, rather than a join-tree
Empirical evaluation shows that IJGP is almost always superior to other approximate schemes (IBP, MC)

Probabilistic reasoning using Bayesian networks is known to be a hard problem. Although exact algorithms are available for this task, they are usually infeasible in practice due to the large size and connectivity of real-life networks. Approximate schemes are therefore of crucial importance. We propose a new algorithm, IJGP (Iterative Join Graph Propagation), which applies message passing along a join-graph. It is both anytime and iterative, giving the user an adjustable tradeoff between accuracy and time. Our empirical evaluation was extremely encouraging: IJGP outperformed other existing approximate algorithms on several classes of problems.

Outline
IBP - Iterative Belief Propagation
MC - Mini-Clustering
IJGP - Iterative Join Graph Propagation
Empirical evaluation
Conclusion

Our new algorithm, IJGP, builds on two existing approximate schemes: IBP, which is an iterative algorithm, and the recently introduced Mini-Clustering, which is an anytime algorithm. We will quickly review the main characteristics of these two algorithms and then present IJGP.

Iterative Belief Propagation - IBP
Belief propagation is exact for poly-trees
IBP: apply BP iteratively to cyclic networks
No guarantees of convergence
Works well for many coding networks
[Figure: a small loopy network with parents U1, U2, U3 and children X1, X2; one step updates BEL(U1)]
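As an illustration of the update the slide describes, here is a minimal sketch (my own, not the authors' code) of parallel loopy sum-product BP on a pairwise model; the dictionary-based representation and the function name are assumptions made for the example. On a poly-tree this recovers exact marginals; on a cyclic network the same loop simply iterates, with no convergence guarantee.

```python
import numpy as np

def loopy_bp(unary, pairwise, edges, n_iter=20):
    """Parallel sum-product belief propagation on a pairwise model.
    unary: {var: 1-D potential}; pairwise: {(u, v): 2-D potential
    indexed [state_u, state_v]}; edges: list of (u, v) pairs."""
    msgs = {}
    for (u, v) in edges:                       # messages in both directions
        msgs[(u, v)] = np.ones(len(unary[v])) / len(unary[v])
        msgs[(v, u)] = np.ones(len(unary[u])) / len(unary[u])
    for _ in range(n_iter):
        new = {}
        for (u, v) in msgs:
            prod = unary[u].copy()             # u's local potential ...
            for (s, t) in msgs:                # ... times incoming messages,
                if t == u and s != v:          # excluding the one from v
                    prod = prod * msgs[(s, t)]
            pot = pairwise.get((u, v))
            if pot is None:
                pot = pairwise[(v, u)].T       # orient the potential u -> v
            m = pot.T @ prod                   # sum out u's states
            new[(u, v)] = m / m.sum()          # normalize for stability
        msgs = new
    beliefs = {}
    for x in unary:                            # belief = potential * incoming
        b = unary[x].copy()
        for (s, t) in msgs:
            if t == x:
                b = b * msgs[(s, t)]
        beliefs[x] = b / b.sum()
    return beliefs
```

On a two-variable chain (a tree) the returned beliefs match the exact marginals after one pass.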

Mini-Clustering - MC
[Figure: a cluster tree with clusters 1: ABC {p(a), p(b|a), p(c|a,b)}, 2: BCDF {p(d|b), h(1,2)(b,c), p(f|c,d)}, 3: BEF {p(e|b,f)}, 4: EFG {p(g|e,f)}; with i=3, cluster 2 is split into mini-clusters {p(d|b), h(1,2)(b,c)} and {p(f|c,d)}; sep(2,3) = {B,F}, elim(2,3) = {C,D}]

CTE vs. MC
[Figure: side-by-side trees over clusters 1: ABC, 2: BCDF, 3: BEF, 4: EFG with separators BC, BF, EF; left: Cluster Tree Elimination, right: Mini-Clustering with i=3]

IJGP - Motivation
IBP is applied to a loopy network iteratively; it is not an anytime algorithm, but when it converges, it converges very fast
MC applies bounded inference along a tree decomposition; it is an anytime algorithm controlled by the i-bound
IJGP combines the iterative feature of IBP with the anytime feature of MC

IJGP - The basic idea
Apply Cluster Tree Elimination to any join-graph
Commit to graphs that are minimal I-maps
Avoid cycles as long as I-mapness is not violated
Result: use minimal arc-labeled join-graphs

IJGP - Example
[Figure: a) the belief network over variables A-J; b) the join-graph IBP works on, with clusters ABC, ABDE, BCE, CDEF, FGH, FGI, GHIJ and labeled arcs]

Arc-minimal join-graph
[Figure: the join-graph before (left) and after (right) removing redundant arcs to make it arc-minimal]

Minimal arc-labeled join-graph
[Figure: the arc-minimal join-graph (left) and the same graph after shrinking arc labels to make it minimal arc-labeled (right)]

Join-graph decompositions
[Figure: a) minimal arc-labeled join-graph; b) join-graph obtained by collapsing nodes of graph a); c) minimal arc-labeled join-graph of b)]

Tree decomposition
[Figure: a) minimal arc-labeled join-graph; b) tree decomposition obtained by further collapsing nodes, with clusters ABCDE, CDEF, FGHI, GHIJ]

Join-graphs
[Figure: four decompositions of the same problem, from the join-graph used by IBP to a tree decomposition; moving toward the tree decomposition gives more accuracy, moving away gives less complexity]

Minimal arc-labeled decomposition
[Figure: a) fragment of an arc-labeled join-graph; b) the same fragment after shrinking labels to make it a minimal arc-labeled join-graph]
Use a DFS algorithm to eliminate cycles relative to each variable
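The label-shrinking step can be sketched as follows. The slide mentions a DFS-based procedure; this illustrative variant instead uses union-find to keep, for each variable, only a spanning forest of the arcs labeled with it, which achieves the same goal of eliminating cycles relative to each variable (the data layout and function name are my own assumptions).

```python
def minimize_arc_labels(arc_labels):
    """Make arc labels minimal: for every variable, the arcs carrying
    it must form a tree, so keep the variable only on arcs that join
    previously disconnected components.
    arc_labels: {(cluster_u, cluster_v): set of variables}."""
    out = {arc: set(lab) for arc, lab in arc_labels.items()}
    variables = set().union(*out.values()) if out else set()

    def find(c, parent):
        while c in parent:           # follow parent links to the root
            c = parent[c]
        return c

    for x in sorted(variables):
        parent = {}                  # fresh union-find per variable
        for (u, v), lab in out.items():
            if x in lab:
                ru, rv = find(u, parent), find(v, parent)
                if ru == rv:
                    lab.discard(x)   # this arc would close a cycle for x
                else:
                    parent[ru] = rv  # union the two components
    return out
```

For example, if three clusters form a cycle and all three arcs carry variable A, exactly one arc loses A, leaving a tree of A-labeled arcs.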

Message propagation
[Figure: cluster 1 = ABCDE, containing p(a), p(c), p(b|a,c), p(d|a,b,e), p(e|b,c) and the incoming message h(3,1)(b,c), sends message h(1,2) to cluster 2 = CDEF]
Minimal arc-labeled: sep(1,2) = {D,E}, elim(1,2) = {A,B,C}
Non-minimal arc-labeled: sep(1,2) = {C,D,E}, elim(1,2) = {A,B}

Bounded decompositions
We want arc-labeled decompositions such that:
the cluster size (internal width) is bounded by i (the accuracy parameter)
the width of the decomposition as a graph (external width) is as small as possible
Possible approaches to building decompositions:
partition-based algorithms, inspired by the mini-bucket decomposition
grouping-based algorithms

Partition-based algorithms
[Figure: a) schematic mini-bucket(i), i=3, over variables A-G:
G: (GFE)
E: (EBF) (EF)
F: (FCD) (BF)
D: (DB) (CD)
C: (CAB) (CB)
B: (BA) (AB) (B)
A: (A)
b) the corresponding arc-labeled join-graph decomposition, with clusters GFE: P(G|F,E); EBF: P(E|B,F); FCD: P(F|C,D); CDB: P(D|B); CAB: P(C|A,B); BA: P(B|A); A: P(A)]
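The partition step of a mini-bucket(i) scheme can be sketched as a greedy first-fit pass over one bucket's functions; the representation and function name here are illustrative assumptions, not the paper's implementation.

```python
def mini_buckets(bucket, i_bound):
    """Greedy first-fit partition of one bucket's functions into
    mini-buckets whose joint scope has at most i_bound variables.
    bucket: list of (scope, name) pairs, where scope is a set."""
    minis = []                       # each entry: [joint_scope, names]
    for scope, name in bucket:
        for mini in minis:
            if len(mini[0] | scope) <= i_bound:
                mini[0] |= scope     # fits: merge into this mini-bucket
                mini[1].append(name)
                break
        else:                        # no mini-bucket can absorb it
            minis.append([set(scope), [name]])
    return minis
```

With i=3, bucket F from the schematic, holding P(F|C,D) and a message over {B,F}, splits into two mini-buckets because their joint scope {B,C,D,F} exceeds the bound, matching the (FCD) (BF) split on the slide.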

IJGP properties
IJGP(i) applies BP to a minimal arc-labeled join-graph whose cluster size is bounded by i
On join-trees, IJGP finds exact beliefs
IJGP is a Generalized Belief Propagation algorithm (Yedidia, Freeman, Weiss 2001)
Complexity of one iteration:
time: O(deg·(n+N)·d^(i+1))
space: O(N·d)

Empirical evaluation
Algorithms: Exact, IBP, MC, IJGP
Measures: Absolute error, Relative error, Kullback-Leibler (KL) distance, Bit Error Rate, Time
Networks (all variables are binary): Random networks, Grid networks (MxM), CPCS 54, 360, 422, Coding networks
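As a sketch of how the first three accuracy measures might be computed from exact and approximate marginal vectors (the slide does not specify the exact averaging conventions used in the experiments, so this is only illustrative):

```python
import numpy as np

def error_measures(exact, approx, eps=1e-12):
    """Absolute error, relative error, and KL distance between an
    exact and an approximate vector of marginal probabilities
    (eps guards against division by zero and log of zero)."""
    abs_err = float(np.mean(np.abs(exact - approx)))
    rel_err = float(np.mean(np.abs(exact - approx) / (exact + eps)))
    kl = float(np.sum(exact * np.log((exact + eps) / (approx + eps))))
    return abs_err, rel_err, kl
```

All three measures are zero when the approximate marginals coincide with the exact ones, and grow as the approximation degrades.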

Random networks - KL at convergence
[Charts for evidence=0 and evidence=5]

Random networks - KL vs. iterations evidence=0 evidence=5

Random networks - Time

Grid 81 – KL distance at convergence

Grid 81 – KL distance vs. iterations

Grid 81 – Time vs. i-bound

Coding networks
N=400, P=4, 500 instances, 30 iterations, w*=43

Coding networks - BER sigma=.22 sigma=.32 sigma=.51 sigma=.65

Coding networks - Time

CPCS 422 – KL distance evidence=0 evidence=30

CPCS 422 – KL vs. iterations evidence=0 evidence=30

CPCS 422 – Relative error evidence=30

CPCS 422 – Time vs. i-bound evidence=0

Conclusion
IJGP borrows the iterative feature from IBP and the anytime virtue of bounded inference from MC
The empirical evaluation showed the potential of IJGP, which improves with iterations and, most of the time, with the i-bound, and scales up to large networks
IJGP is almost always superior, often by a wide margin, to both IBP and MC
Based on all our experiments, we believe IJGP provides a practical breakthrough for the task of belief updating