On the Power of Belief Propagation: A Constraint Propagation Perspective
Rina Dechter, Bozhena Bidyuk, Robert Mateescu, Emma Rollon
Distributed Belief Propagation
How many people?
Distributed Belief Propagation: causal support and diagnostic support
Belief Propagation in Polytrees
BP on Loopy Graphs
- Pearl (1988): use of BP on loopy networks
- McEliece et al. (1998): IBP's success on coding networks
- Lots of research into convergence ... and accuracy (?), but:
  - Why does IBP work well for coding networks?
  - Can we characterize other good problem classes?
  - Can we have any guarantees on accuracy (even if it converges)?
Arc-consistency
- Sound
- Incomplete
- Always converges (polynomial time)
Example over variables A, B, C, D: A < B, A < D, D < C, B = C (see the sketch below).
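As a concrete illustration, here is a minimal Python sketch of an arc-consistency (AC-1-style) revise loop on the example constraints above; the domains {1, 2, 3} and the loop structure are assumptions made only for this illustration.

```python
# Minimal arc-consistency sketch for the example A < B, A < D, D < C, B = C.
# The domains {1, 2, 3} are an assumption for illustration only.
domains = {v: {1, 2, 3} for v in "ABCD"}
constraints = [
    ("A", "B", lambda a, b: a < b),
    ("A", "D", lambda a, d: a < d),
    ("D", "C", lambda d, c: d < c),
    ("B", "C", lambda b, c: b == c),
]

def revise(x, y, rel):
    """Remove values of x with no support in y under rel; report whether anything changed."""
    removed = {vx for vx in domains[x] if not any(rel(vx, vy) for vy in domains[y])}
    domains[x] -= removed
    return bool(removed)

# AC-1-style loop: revise both directions of every arc until a fixed point is reached.
changed = True
while changed:
    changed = False
    for x, y, rel in constraints:
        changed |= revise(x, y, rel)
        changed |= revise(y, x, lambda vy, vx, r=rel: r(vx, vy))

print(domains)  # arc-consistent domains: sound, incomplete, reached in polynomial time
```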
Flattening the Bayesian Network
[Figure: a belief network with CPTs P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|D,F), and the corresponding flat constraint network over clusters A, AB, AC, ABD, BCF, DFG. Each CPT is flattened into the relation containing exactly its nonzero-probability tuples.]
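To make the flattening step concrete, here is a minimal Python sketch that flattens a single CPT into its flat relation (the set of tuples with nonzero probability); the particular CPT numbers are invented for illustration and are not the ones in the slide's tables.

```python
from itertools import product

def flatten_cpt(cpt):
    """Flatten a CPT into a constraint relation: keep exactly the tuples with nonzero probability."""
    return {assignment for assignment, prob in cpt.items() if prob > 0}

# Hypothetical CPT P(B | A) over A, B in {1, 2, 3}; the zeros are what define the flat relation.
domain = (1, 2, 3)
p_b_given_a = {(a, b): (0.5 if b != a else 0.0) for a, b in product(domain, repeat=2)}

flat_R_AB = flatten_cpt(p_b_given_a)
print(sorted(flat_R_AB))  # R_AB = {(a, b) : P(b | a) > 0}
```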
Belief Zero Propagation = Arc-Consistency
[Figure: updating the belief Bel(A,B) with incoming messages h(A), h(B) on the original network, next to updating the relation R(A,B) on the flat network. The entries that IBP drives to zero in the updated belief correspond exactly to the tuples that arc-consistency removes from the updated relation; see the sketch below.]
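A minimal numerical sketch of the correspondence stated above: multiplying a CPT by the incoming messages zeroes out exactly the pairs that joining the flat relation with the flattened messages would remove. The CPT and message values below are invented for illustration.

```python
import numpy as np

# Hypothetical 3-valued variables A and B with CPT P(B | A); all numbers invented for illustration.
P_BA = np.array([[0.0, 0.7, 0.3],
                 [0.5, 0.0, 0.5],
                 [0.4, 0.6, 0.0]])   # rows indexed by A, columns by B

h_A = np.array([1.0, 0.0, 1.0])      # incoming message on A (the zero rules out A = 2)
h_B = np.array([1.0, 1.0, 0.0])      # incoming message on B (the zero rules out B = 3)

# Updated (unnormalized) belief over (A, B): an entry is zero exactly when the corresponding
# tuple is absent from the join of the flat relation with the flattened messages.
bel_AB = P_BA * h_A[:, None] * h_B[None, :]

flat_relation = P_BA > 0
flat_messages = (h_A > 0)[:, None] & (h_B > 0)[None, :]
assert np.array_equal(bel_AB > 0, flat_relation & flat_messages)
print(bel_AB)
```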
Flat Network - Example
[Figure: the flat relations obtained from P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|D,F), arranged on the dual join-graph with clusters A, AB, AC, ABD, BCF, DFG.]
IBP Example – Iterations 1 through 5
[Figures: five successive iterations of IBP on the flat network; each iteration prunes further tuples from the relations, and the final iteration also reports the resulting beliefs over A, B, C, D, F, G.]
IBP – Inference Power for Zero Beliefs
Theorem: Iterative BP performs arc-consistency on the flat network.
- Soundness: inference of zero beliefs by IBP converges, and all inferred zero beliefs are correct.
- Incompleteness: Iterative BP is as weak and as strong as arc-consistency.
Continuity hypothesis: since IBP is sound for zero beliefs, is it also accurate for extreme (near-zero) beliefs? Tested empirically.
Experimental Results
We investigated empirically whether the results for zero beliefs extend to ε-small beliefs (ε > 0).
- Algorithms: IBP, IJGP
- Measures: exact/IJGP histograms, recall absolute error, precision absolute error
- Network types with determinism: coding, linkage analysis*, grids*
- Network types without determinism: two-layer noisy-OR*, CPCS54, CPCS360
* Instances from the UAI08 competition
Networks with Determinism: Coding (N=200, 1000 instances, w*=15)
Nets w/o Determinism: bn2o (w* = 24, 27, 26)
Nets with Determinism: Linkage
[Figures: percentage vs. absolute-error plots for pedigree1 (w* = 21) and pedigree37 (w* = 30), showing the exact histogram, the IJGP histogram, recall absolute error, and precision absolute error, for i-bound = 3 and i-bound = 7.]
Nets with Determinism: Grids (size, % determinism), w* = 22
Nets w/o Determinism: CPCS (CPCS360: 5 instances, w*=20; CPCS54: 100 instances, w*=15)
The Cutset Phenomenon & Irrelevant Nodes
- Observed variables break the flow of inference: IBP is exact when the evidence variables form a cycle-cutset.
- Unobserved variables with no observed descendants send zero information to their parents; such nodes are irrelevant (see the sketch below).
- In a network without evidence, IBP converges in one top-down iteration.
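A one-screen sketch of the "zero information" point above: the diagnostic (λ) message an unobserved leaf sends to its parent is a vector of ones, so it cannot change the parent's belief. The CPT numbers are invented for illustration.

```python
import numpy as np

# Hypothetical CPT P(Y | X) for an unobserved leaf Y with no observed descendants; numbers invented.
P_Y_given_X = np.array([[0.2, 0.8],
                        [0.6, 0.4]])   # rows indexed by X, columns by Y

# Diagnostic (lambda) message from the unobserved leaf Y to its parent X:
# lambda_Y(x) = sum_y P(y | x) = 1 for every x, i.e. it carries no information.
lambda_msg = P_Y_given_X.sum(axis=1)
print(lambda_msg)   # [1. 1.]  ->  Y is irrelevant to X's belief
```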
Nodes with extreme support
Observed variables with extreme priors or extreme support can nearly cut the flow of information, making them behave almost like evidence (cutset) nodes; a small numerical sketch follows.
[Figure: a small network with nodes A, B1, B2, C1, C2, D and evidence nodes E_B, E_C, E_D; plot of average error vs. priors.]
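A tiny numerical sketch of the "nearly-cut" effect on a hypothetical chain X -> Y: with an extreme prior on X, the downstream belief on Y is almost identical to the belief obtained by treating X as observed. All numbers are invented for illustration.

```python
import numpy as np

eps = 1e-4
P_X = np.array([1 - eps, eps])         # extreme prior on X (invented)
P_YX = np.array([[0.9, 0.1],
                 [0.2, 0.8]])           # hypothetical CPT P(Y | X)

bel_Y_extreme  = P_X @ P_YX             # P(Y) under the extreme prior
bel_Y_observed = P_YX[0]                # P(Y | X = 0), i.e. X treated as evidence

# The gap shrinks with eps: an extremely supported node nearly cuts the flow of information.
print(np.abs(bel_Y_extreme - bel_Y_observed).max())
```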
Conclusion: For Networks with Determinism
- IBP converges and is sound for zero beliefs.
- IBP's power to infer zeros is exactly as weak or as strong as arc-consistency.
- However, inference of extreme (non-zero) beliefs can be wrong.
- Cutset property (Bidyuk and Dechter, 2000): evidence and inferred singletons act like a cutset; if the zeros form a cycle-cutset, all beliefs are exact.
- Extensions to epsilon-cutsets were supported empirically.
Inference on Trees is Easy and Distributed
- Belief updating (sum-product)
- MPE (max-product)
- CSP consistency (projection-join)
- #CSP (sum-product)
Example tree factorization: P(X,Y,Z,T,R,L,M) = P(X) P(Y|X) P(Z|X) P(T|Y) P(R|Y) P(L|Z) P(M|Z) (see the sketch below).
Trees are processed in linear time and memory; the same holds for acyclic graphical models.
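Below is a minimal sum-product sketch on the tree factorization above, with binary variables and invented CPTs, computing P(X | T = 1) by local messages and checking it against a brute-force marginalization; unobserved leaves contribute all-ones λ messages and are omitted.

```python
import numpy as np

# Tree from the slide: root X with children Y, Z; T, R are children of Y; L, M are children of Z.
# Joint: P(X,Y,Z,T,R,L,M) = P(X) P(Y|X) P(Z|X) P(T|Y) P(R|Y) P(L|Z) P(M|Z).
# All variables are binary and all numbers below are invented for illustration.
rng = np.random.default_rng(0)

def random_cpt():
    """A random 2x2 CPT: rows indexed by the parent value, each row sums to one."""
    t = rng.random((2, 2))
    return t / t.sum(axis=1, keepdims=True)

P_X = np.array([0.3, 0.7])
P_YX, P_ZX, P_TY, P_RY, P_LZ, P_MZ = (random_cpt() for _ in range(6))

# Suppose T = 1 is observed.  Unobserved leaves (R, L, M) and the unobserved subtree under Z
# send all-ones lambda messages, so only the T -> Y -> X chain carries diagnostic support.
lam_T  = np.array([0.0, 1.0])          # evidence indicator on T
lam_TY = P_TY @ lam_T                  # message T -> Y:  sum_t P(t | y) * lam_T(t)
lam_YX = P_YX @ lam_TY                 # message Y -> X:  sum_y P(y | x) * lam_TY(y)

bel_X = P_X * lam_YX                   # combine causal support (prior) and diagnostic support
bel_X /= bel_X.sum()

# Brute-force check of P(X | T = 1): marginalize the relevant part of the joint directly.
joint_XT = np.einsum("x,xy,yt->xt", P_X, P_YX, P_TY)
assert np.allclose(bel_X, joint_XT[:, 1] / joint_XT[:, 1].sum())
print(bel_X)
```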
Inference on Poly-Trees is Easy and Distributed