UNIVERSITY OF SOUTH CAROLINA
Department of Computer Science and Engineering

A Comparison of Lauritzen-Spiegelhalter, Hugin, and Shenoy-Shafer Architectures for Computing Marginals of Probability Distributions

Scott Langevin, Chris Streett, Alicia Ruvinsky

History

Computing marginals of multivariate discrete probability distributions in uncertain reasoning:
1986 – Pearl's architecture
– Singly connected Bayes nets
– Singly connected (a.k.a. polytree): there exists at most one path between any two nodes
1988 – Lauritzen and Spiegelhalter create the LS architecture
1990 – Jensen et al. modify LS, creating Hugin
1990 – Inspired by this previous work, Shenoy and Shafer propose a framework using join trees to produce marginals
1997 – Shenoy refines the Shenoy-Shafer architecture with binary join trees
– This refinement will be referred to as the Shenoy-Shafer (SS) architecture

Chest Clinic Problem

Dyspnoea (D) may be caused by Tuberculosis (T), Lung Cancer (L), or Bronchitis (B)
A recent visit to Asia (A) can increase the chance of T
Smoking (S) increases the chance of L and B
An X-ray (X) does not discriminate between L and T
Either (E) tuberculosis or lung cancer can result in a positive X-ray
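For reference, here is the structure of the network as a small Python mapping; the variable abbreviations follow the slide, and treating every variable as binary is the usual convention for this example:

```python
# Structure of the Chest Clinic (Asia) Bayesian network described above.
# Each variable maps to the list of its parents; all variables are binary.
parents = {
    "A": [],          # visit to Asia
    "S": [],          # smoking
    "T": ["A"],       # tuberculosis
    "L": ["S"],       # lung cancer
    "B": ["S"],       # bronchitis
    "E": ["T", "L"],  # tuberculosis or lung cancer
    "X": ["E"],       # positive X-ray
    "D": ["E", "B"],  # dyspnoea
}
```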

Axioms for Local Computation

Three axioms enable efficient local computation of marginals of the joint valuation:
1. Order of deletion does not matter
– Suppose σ is a valuation for s, and suppose X1, X2 ∈ s. Then (σ↓(s − {X1}))↓(s − {X1, X2}) = (σ↓(s − {X2}))↓(s − {X1, X2})
2. Commutativity and associativity of combination
– Suppose ρ, σ, and τ are valuations for r, s, and t, respectively. Then ρ × σ = σ × ρ, and ρ × (σ × τ) = (ρ × σ) × τ
3. Distributivity of marginalization over combination
– Suppose ρ and σ are valuations for r and s, respectively, suppose X ∈ s, and suppose X ∉ r. Then (ρ × σ)↓((r ∪ s) − {X}) = ρ × (σ↓(s − {X}))
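A quick numerical check of axiom 3, treating potentials as numpy arrays over binary variables; the shapes and random tables here are assumptions chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# rho is a valuation for r = {Y}; sigma is a valuation for s = {Y, X}.
rho = rng.random(2)          # indexed by Y
sigma = rng.random((2, 2))   # indexed by (Y, X); X is in s but not in r

# Left side: combine first, then marginalize X out of the product.
lhs = (rho[:, None] * sigma).sum(axis=1)

# Right side: marginalize X out of sigma alone, then combine.
rhs = rho * sigma.sum(axis=1)

# Distributivity of marginalization over combination.
assert np.allclose(lhs, rhs)
```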

Lauritzen-Spiegelhalter Architecture – Junction Trees

First construct a junction tree for the BN
– A join tree in which each node is a clique
Associate each potential KV with the smallest clique that contains {V} ∪ Pa(V). If a clique is assigned more than one potential, associate their combination (pointwise product) with the clique (see the sketch below)
Evidence is modeled as potentials, each associated with the smallest clique that includes the potential's domain
Pick the node with the largest state space in the junction tree to be the root
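A minimal sketch of the assignment step, assuming cliques are frozensets of variable names and each CPT is tagged with its family {V} ∪ Pa(V); all names here are hypothetical:

```python
def assign_potentials(cliques, potentials):
    """Attach each CPT to the smallest clique covering its family.

    cliques: list of frozensets of variable names.
    potentials: dict mapping variable V -> (family frozenset, table).
    Returns a dict clique -> list of tables, to be combined by
    pointwise multiplication when a clique receives several.
    """
    assignment = {c: [] for c in cliques}
    for v, (family, table) in potentials.items():
        covering = [c for c in cliques if family <= c]
        assignment[min(covering, key=len)].append(table)
    return assignment
```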

LS Junction Tree for Chest Clinic Problem

(Figure: junction tree with cliques {A,T}, {T,L,E}, {E,X}, {S,L,B}, {L,E,B}, and {E,B,D})

Lauritzen-Spiegelhalter Architecture – Calculating Marginals

Two phases: an inward pass and an outward pass
Both involve sending messages, which are potentials, to neighboring nodes
Inward pass
– Each node sends a message to its inward neighbor after it has received messages from all of its outward neighbors; a node with no outward neighbors (a leaf) sends its message right away
– To send, a node computes the message by marginalizing its current potential to its intersection with the inward neighbor. The message is sent to the inward neighbor, and the sender's current potential is divided by the message
– On receiving a message from an outward neighbor, a node multiplies its current potential by the message
– The inward pass ends when the root has received a message from all of its outward neighbors

Lauritzen-Spiegelhalter – Inward Pass

(Figure: clique ci sends a message to its inward neighbor cj)
Before: ci holds Xi', cj holds Xj
After: Xi'' = Xi' / Xi'↓(ci ∩ cj) and Xj' = Xj × Xi'↓(ci ∩ cj)
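To make the update concrete, here is a minimal numpy sketch of one inward step on the {A,T}–{T,L,E} edge of the tree above; random tables stand in for real potentials, and 0/0 is treated as 0, the usual convention for LS division:

```python
import numpy as np

rng = np.random.default_rng(1)

# ci = {A, T} with axes (A, T); cj = {T, L, E} with axes (T, L, E).
chi_i = rng.random((2, 2))
chi_j = rng.random((2, 2, 2))

# Separator ci ∩ cj = {T}: marginalize chi_i over A.
msg = chi_i.sum(axis=0)                      # array over T

# Sender divides its own potential by the message (0/0 guarded as 0).
chi_i = np.divide(chi_i, msg[None, :],
                  out=np.zeros_like(chi_i), where=msg[None, :] != 0)

# Receiver multiplies the message in (broadcast over its L and E axes).
chi_j = chi_j * msg[:, None, None]
```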

Lauritzen-Spiegelhalter Architecture – Calculating Marginals

Outward pass
– Each node sends a message to its outward neighbors after it has received the message from its inward neighbor; the root, having no inward neighbor, sends its messages right away
– To send, a node computes the message by marginalizing its current potential to its intersection with the outward neighbor, and sends it to that neighbor
– On receiving the message from its inward neighbor, a node multiplies its current potential by the message
– The outward pass ends when all leaves have received messages from their inward neighbors
Final step
– At the end of the outward pass, each clique's potential is the marginal of the posterior distribution for that clique
– To compute the marginal of the posterior for each variable in the BN, find the smallest-domain clique containing the variable and marginalize its potential to that variable (see the sketch below)
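A minimal sketch of that final lookup; marginalize is a caller-supplied operation on potentials, and the names are illustrative:

```python
def ls_variable_marginal(v, cliques, potentials, marginalize):
    """LS final step: marginalize v out of the smallest clique containing it.

    cliques: frozensets of variable names; potentials maps each clique
    to its calibrated table; marginalize(table, vars) projects onto vars.
    """
    node = min((c for c in cliques if v in c), key=len)
    return marginalize(potentials[node], frozenset({v}))
```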

Lauritzen-Spiegelhalter – Outward Pass

(Figure: clique cj sends a message back to its outward neighbor ci)
Before: ci holds Xi'', cj holds Xj'''
After: Xi''' = Xi'' × Xj'''↓(ci ∩ cj); Xj''' is unchanged
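The matching outward step, reusing chi_i and chi_j from the inward-pass sketch above; as the figure shows, there is no division on the outward pass:

```python
# Continues the inward-pass sketch: ci = {A,T}, cj = {T,L,E}, separator {T}.
msg_out = chi_j.sum(axis=(1, 2))    # marginalize cj's potential over L and E
chi_i = chi_i * msg_out[None, :]    # ci multiplies the message in; cj unchanged
```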

Hugin Architecture – Junction Trees + Separators

Similar to the LS method, but with a computational enhancement: separators
Construct the junction tree as in LS, but introduce a separator node between each pair of adjacent cliques; the domain of a separator is the intersection of the two cliques (see the sketch below)
The separator stores the potential on the intersection of the two cliques
Pick any clique node to be the root
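A one-line sketch of the separator construction, assuming the junction tree is given as a list of clique-pair edges (frozensets; names illustrative):

```python
def build_separators(edges):
    """Insert a separator between each pair of adjacent cliques.

    edges: iterable of (ci, cj) frozenset pairs from the junction tree.
    Returns a dict mapping each edge to its separator domain ci ∩ cj.
    """
    return {(ci, cj): ci & cj for ci, cj in edges}
```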

Hugin Junction Tree for Chest Clinic Problem

(Figure: cliques {A,T}, {T,L,E}, {E,X}, {S,L,B}, {L,E,B}, and {E,B,D}, joined through separators {T}, {L,B}, {E,L}, {E}, and {E,B})

Hugin Architecture – Calculating Marginals

Two phases: an inward pass and an outward pass
Both involve sending messages, which are potentials, to neighboring nodes
Inward pass
– Same as LS, except that the sender does not divide its current potential by the message; instead, the message is saved in the separator
Outward pass
– The separator divides the incoming message by its saved potential, and the result is multiplied into the potential of the receiving node
Final step
– At the end of the outward pass, each clique's and separator's potential is the marginal of the posterior for that node's domain
– To compute the marginal of the posterior for each variable in the BN, first find the smallest-domain separator containing the variable and marginalize its potential. If no such separator exists, proceed as in LS: find the smallest-domain clique containing the variable and marginalize (see the sketch below)
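A minimal sketch of the Hugin lookup, mirroring the LS version earlier; all names are illustrative:

```python
def hugin_variable_marginal(v, cliques, separators, potentials, marginalize):
    """Hugin final step: prefer the smallest separator containing v,
    falling back to the smallest clique (the LS rule) when none exists.

    potentials maps each clique/separator frozenset to its calibrated table.
    """
    nodes = [s for s in separators if v in s] or [c for c in cliques if v in c]
    return marginalize(potentials[min(nodes, key=len)], frozenset({v}))
```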

Hugin – Inward Pass

(Figure: clique ci sends a message to its inward neighbor cj through their separator)
Before: ci holds Xi', cj holds Xj
After: Xi'' = Xi' (the sender is unchanged); the separator stores Xi''↓(ci ∩ cj); Xj' = Xj × Xi''↓(ci ∩ cj)

Hugin – Outward Pass

(Figure: clique cj sends a message back to its outward neighbor ci through their separator)
Before: ci holds Xi'', cj holds Xj''', the separator holds Xi''↓(ci ∩ cj)
After: the separator holds Xj'''↓(ci ∩ cj); Xi''' = Xi'' × (Xj'''↓(ci ∩ cj) / Xi''↓(ci ∩ cj)); Xj''' is unchanged
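Both Hugin passes on the same hypothetical {A,T}–{T,L,E} edge as a numpy sketch; the separator stores the inward message, and the single division happens there on the way back out:

```python
import numpy as np

rng = np.random.default_rng(2)

# ci = {A, T} with axes (A, T); cj = {T, L, E} with axes (T, L, E).
chi_i = rng.random((2, 2))
chi_j = rng.random((2, 2, 2))

# Inward pass: ci keeps its potential; the separator stores the message.
sep = chi_i.sum(axis=0)                # chi_i marginalized onto {T}
chi_j = chi_j * sep[:, None, None]     # receiver multiplies the message in

# Outward pass: divide cj's new separator marginal by the stored one,
# then multiply the ratio into ci (0/0 guarded as 0).
new_sep = chi_j.sum(axis=(1, 2))
ratio = np.divide(new_sep, sep, out=np.zeros_like(new_sep), where=sep != 0)
chi_i = chi_i * ratio[None, :]
sep = new_sep                          # separator now holds its marginal
```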

Shenoy-Shafer Architecture – Binary Join Trees

Setup: first, arrange the elements of the hypergraph (generated from the domains of the potentials) into a binary join tree
– Binary join tree: a join tree in which no node has more than three neighbors, so that all combinations are done in pairs, i.e. functions are combined two at a time
– All singleton subsets appear in the binary join tree (each singleton variable is attached at the node with the smallest subset containing it)
Associate each potential with a node whose subset corresponds to its domain (see figure)

Binary Join Tree for Chest Clinic Problem (figure)

Shenoy-Shafer Architecture – Calculating Marginals

Each node that is to compute a marginal requests a message from each of its neighbors
– A node r that receives a message request from a neighbor s will in turn request messages from its other neighbors
– Upon receiving messages from its other neighbors, r combines them all with its own potential and marginalizes the result to r ∩ s (leaves send their reply right away)
Formally, the message from r to s is
μr→s = ( ×{ μt→r | t ∈ N(r) \ {s} } × αr )↓(r ∩ s)
where μr→s is the message from r to s, N(r) is the set of neighbors of r, and αr is the probability potential associated with node r
When the marginal-computing node has received all of its replies, it computes its marginal
– It combines all incoming messages with its own probability potential and reports the result:
φ↓r = ×{ μt→r | t ∈ N(r) } × αr (where φ denotes the joint potential)
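A generic Python sketch of this request/reply recursion; nodes are frozensets of variables, combine and marginalize are caller-supplied valuation operations, all names are illustrative, and a real implementation would cache each directed message rather than recompute it:

```python
def ss_message(r, s, neighbors, alpha, combine, marginalize):
    """Shenoy-Shafer message from node r to its neighbor s.

    neighbors: dict node -> set of neighboring nodes.
    alpha: dict node -> probability potential associated with the node.
    """
    pot = alpha[r]
    for t in neighbors[r] - {s}:   # request replies from all other neighbors
        pot = combine(pot, ss_message(t, r, neighbors, alpha,
                                      combine, marginalize))
    return marginalize(pot, r & s)  # project onto the separator r ∩ s

def ss_marginal(r, neighbors, alpha, combine, marginalize):
    """Marginal at r: combine all incoming messages with r's own potential."""
    pot = alpha[r]
    for t in neighbors[r]:
        pot = combine(pot, ss_message(t, r, neighbors, alpha,
                                      combine, marginalize))
    return pot
```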

Important SS Architecture Storage Differences

No division operations
Input potentials remain unchanged during the propagation process
The marginal of the joint probability for a variable is computed at the corresponding singleton-variable node of the binary join tree

(Figure: two neighboring nodes r and s; each stores its own potential (αr, αs) and its computed marginal (φ↓r, φ↓s), while the edge between them carries the two messages μr→s and μs→r)

Comparing LS and Hugin

Hugin is more computationally efficient than LS
– Hugin has fewer additions and fewer divisions (the two architectures are equal on multiplications)
– Computation of marginals is always done from the separator, which exploits its smaller domain size
– Marginals of single variables: in LS, find the smallest-domain clique containing the variable; in Hugin, search through the separators as well as the cliques
– Example: using figure 2, find P(T) and P(X) under each architecture
LS is more storage efficient than Hugin
– LS does not store separator potentials
In the interest of computational efficiency, a comparison of Hugin and SS will be explored next.

Comparing Hugin and SS

SS is more computationally efficient than Hugin
– SS has no divisions (and, on average, fewer computations)
– The efficiency advantage of SS grows with larger state spaces
– Calculating the probability of a singleton variable adds expense in Hugin; in SS these marginals are always calculated and available at the singleton nodes
SS is more flexible than Hugin (as well as LS)
– Due to its lack of division
CONJECTURE: Hugin is more storage efficient than SS
– To be investigated
– SS uses a larger data structure (the binary join tree), and hence most likely uses more space

Comparison Data for Test Cases