Variable and Value Ordering for MPE Search Sajjad Siddiqi and Jinbo Huang.

Most Probable Explanation (MPE)
N: a Bayesian network over variables X = {X, Y, Z}; evidence e = {X=x}.
The MPE is the most probable instantiation of all variables that is compatible with the evidence: the row of Pr(X, Y, Z) consistent with X=x that has maximum probability.
[Figure: a three-node network over X, Y, Z next to its joint probability table; entries such as 0.05, 0.3, 0.1, 0.2 are shown and the maximum compatible row is highlighted.]
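In code, MPE over an explicit joint table is just a maximization. A minimal sketch; the table below is illustrative (the slide's own table did not survive extraction):

```python
# Toy joint distribution Pr(Y, Z) under evidence X = x (illustrative
# numbers, not the table from the slide).
joint = {
    ("y1", "z1"): 0.05, ("y1", "z2"): 0.30,
    ("y2", "z1"): 0.20, ("y2", "z2"): 0.10,
}

def mpe(table):
    """Return the most probable row of an explicit joint table."""
    return max(table.items(), key=lambda kv: kv[1])

assignment, p = mpe(joint)  # the maximum row and its probability
```

Enumerating the table like this is exponential in the number of variables, which is exactly why the talk turns to inference and branch-and-bound next.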

Exact MPE by Inference
Variable Elimination (Bucket Elimination): exponential in the treewidth of the elimination order.
Compilation to Decomposable Negation Normal Form (DNNF): exploits local structure, so treewidth is not necessarily the limiting factor.
Both methods can run out of either time or memory.

Exact MPE by Searching
[Figure: a search tree over X, Y, Z; the root branches on the values of X, then on the values of Y, with the values of Z at the leaves.]

Depth-First Search: exponential in the number of variables X.
Depth-First Branch-and-Bound Search:
- Computes an upper bound on any extension of the current assignment.
- Backtracks when the upper bound <= the current solution.
- Reduces the effective complexity of the search.
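The loop above can be sketched as follows; `upper_bound` and `extend_prob` are hypothetical stand-ins for whatever bound computation (mini-buckets or a split network) and model evaluation are plugged in:

```python
def dfbnb(variables, domains, upper_bound, extend_prob):
    """Depth-first branch-and-bound for MPE.

    upper_bound(partial) -> upper bound on any completion (assumed given).
    extend_prob(full)    -> probability of a complete assignment.
    """
    best = {"p": 0.0, "assignment": None}

    def search(i, assignment):
        if i == len(variables):  # leaf: candidate MPE solution
            p = extend_prob(assignment)
            if p > best["p"]:
                best["p"], best["assignment"] = p, dict(assignment)
            return
        var = variables[i]
        for val in domains[var]:
            assignment[var] = val
            # Prune unless the bound can still beat the incumbent.
            if upper_bound(assignment) > best["p"]:
                search(i + 1, assignment)
            del assignment[var]

    search(0, {})
    return best["assignment"], best["p"]
```

With an exact bound the search degenerates to greedy descent; the interesting trade-off, covered next, is how cheaply a good bound can be computed.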

Exact MPE by B-n-B Search
[Figure: the same search tree; with current solution = 0.2, a branch is pruned whenever its upper bound <= 0.2.]

Computing Bounds: Mini-Buckets
Ignores certain dependencies among variables:
- The new network is easier to solve.
- The relaxation can only increase the MPE value, so its solution is an upper bound.
Splits a bucket into two or more mini-buckets, chosen to generate tighter bounds.
Mini-buckets is a special case of node splitting.

Node Splitting
Y1 and Y2 are clones of Y; N' is the fully split network.
[Figure: network N over variables Y, X, Z, Q, R, and the split network N' obtained by replacing Q with clones Q1, Q2 and Y with clones Y1, Y2; split variables = {Q, Y}.]

Node Splitting
e: an instantiation of variables X in N.
ê: the compatible assignment to their clones in N'.
e.g. if e = {Y=y}, then ê = {Y1=y, Y2=y}.
Then MPE_p(N, e) <= β · MPE_p(N', e, ê),
where β = the total number of instantiations of the clone variables.
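The bound can be checked on a toy example. A sketch under stated assumptions: the network and CPTs below are illustrative, Y is split so that each of its two children reads its own clone, and each clone is a root with a uniform CPT 1/|dom(Y)|:

```python
from itertools import product

# Toy network N: Y -> A, Y -> B (all binary); illustrative CPTs.
pY = {0: 0.6, 1: 0.4}
pA = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # pA[(a, y)]
pB = {(0, 0): 0.1, (1, 0): 0.9, (0, 1): 0.5, (1, 1): 0.5}  # pB[(b, y)]

# MPE of the original network N (no evidence).
mpe_N = max(pY[y] * pA[(a, y)] * pB[(b, y)]
            for y, a, b in product((0, 1), repeat=3))

# Split network N': A reads clone Y1, B reads clone Y2; each clone is
# a root with uniform CPT 0.5, so the dependency between A and B via Y
# is dropped.
mpe_Nsplit = max(pY[y] * 0.5 * 0.5 * pA[(a, y1)] * pB[(b, y2)]
                 for y, y1, y2, a, b in product((0, 1), repeat=5))

beta = 4  # number of instantiations of the two binary clones
assert mpe_N <= beta * mpe_Nsplit  # the upper-bound relation holds
```

The split network maximizes each child's CPT independently of Y, which is why its (scaled) MPE can only overestimate the true one.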

Computing Bounds: Node Splitting (Choi et al. 2007)
The split network is easier to solve; its MPE gives the bound.
Search is performed only over the split variables instead of all variables.
Focuses on good network relaxations, trying to reduce the number of splits.

B-n-B Search for MPE
[Figure: a search tree over split variables Y and Q; assigning Y=y also sets its clones Y1=y and Y2=y.]
MPE(N', {X=x}) >= MPE(N', {X=x, Y=y, Y1=y, Y2=y}) >= MPE(N', {X=x, Y=y, Y1=y, Y2=y, Q=q, Q1=q, Q2=q}); the first two are bounds, the last is an exact solution.
β = 4 for two split variables with binary domains.

B-n-B Search for MPE
Leaves of the search tree give candidate MPE solutions.
Elsewhere we get upper bounds to prune the search.
A branch is pruned if its bound <= the current solution.

Choice of Variables to Split
Reduce the number of split variables:
- Heuristic based on the reduction in the size of jointree cliques and separators.
Split enough variables to bring the treewidth below a threshold at which the network becomes easy to solve.

Variable and Value Ordering
Reduce the search space using an efficient variable and value ordering.
(Choi et al. 2007) do not address this and use a neutral heuristic.
Several heuristics are analyzed and their strengths combined to produce an effective heuristic, which scales up the technique.

Entropy-based Ordering
[Figure: the search tree over split variables Y and Q.]
Computation: compute Pr(Y=y), Pr(Y=¬y) and Pr(Q=q), Pr(Q=¬q), and from them entropy(Y) and entropy(Q).
Do the same for the clones and take average probabilities:
Pr(Y=y) = [Pr(Y=y) + Pr(Y1=y) + Pr(Y2=y)] / 3, where Y1, Y2 are the clones of Y.

Entropy-based Ordering
Favor the instantiations that are more likely to be MPEs:
- Prefer Y over Q if entropy(Y) < entropy(Q).
- Prefer Y=y over Y=¬y if Pr(Y=y) > Pr(Y=¬y).
Static and dynamic versions.

Entropy-based Ordering
Probabilities computed using DNNF: evaluation and differentiation of the arithmetic circuit (AC).
Experiments:
- The static heuristic is significantly faster than the neutral one.
- The dynamic heuristic is generally too expensive to compute, and slower overall.
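A minimal sketch of the static entropy-based ordering, assuming the (clone-averaged) marginals have already been computed, e.g. from the compiled arithmetic circuit; the numbers and variable names are illustrative:

```python
import math

# Clone-averaged marginals for two split variables (illustrative).
marginals = {
    "Y": {"y": 0.9, "not_y": 0.1},
    "Q": {"q": 0.55, "not_q": 0.45},
}

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Variables: ascending entropy (the most "decided" variable first).
var_order = sorted(marginals, key=lambda v: entropy(marginals[v]))

# Values: descending probability (the most likely value first).
val_order = {v: sorted(d, key=d.get, reverse=True)
             for v, d in marginals.items()}

print(var_order)       # ['Y', 'Q']  (Y has the lower entropy)
print(val_order["Y"])  # ['y', 'not_y']
```

A low-entropy variable is nearly decided, so branching on it first with its most likely value tends to reach a strong incumbent solution early, which in turn makes the bound prune more.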

Nogood Learning
g = {X=x, Y=y, Z=z} is a nogood if MPE_p(N', g, ĝ) <= the current solution.
Nogoods are shortened: let g' = g \ {Y=y}; if MPE_p(N', g', ĝ') <= the current solution, then replace g by g'.
[Figure: a search path assigning X=x, Y=y, Z=z with bounds 1.5, 1.3, 1.2, 0.5 and current solution = 1.0.]

Nogood-based Ordering
Scores: S(X=x) = number of occurrences of X=x in nogoods; S(X) = [S(X=x) + S(X=¬x)] / 2 (binary variables).
Dynamic ordering: prefer higher scores.
Impractical: overhead of the repeated bound computations during nogood learning.

Score-based Ordering
A more effective approach based on nogoods.
Scores of variables/values tell how quickly a nogood can be obtained (backtrack early).
[Figure: a search path with bounds 1.5, 1.3, 1.2, 0.5 along assignments X=x, Y=y, Z=z.]
S(X=x) += 1.5 - 1.3 = 0.2
S(Y=y) += 1.3 - 1.2 = 0.1
S(Z=z) += 1.2 - 0.5 = 0.7
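The score update can be sketched as follows; `credit` and the path representation are hypothetical names, and the bounds are the ones from the example path, where each assignment is credited with the drop in bound down to the next node:

```python
from collections import defaultdict

scores = defaultdict(float)

def credit(assignments, bounds):
    """assignments: [(var, val), ...] down one search path.
    bounds[i] is the bound at the node where assignments[i] was made;
    bounds[-1] is the bound one level deeper, where the path ended.
    Values whose assignment caused a large drop in the bound (and so
    led to an early backtrack) accumulate high scores."""
    for i, (var, val) in enumerate(assignments):
        scores[(var, val)] += bounds[i] - bounds[i + 1]

# The path from the slide: bounds 1.5, 1.3, 1.2, 0.5 along X=x, Y=y, Z=z.
credit([("X", "x"), ("Y", "y"), ("Z", "z")], [1.5, 1.3, 1.2, 0.5])

print({k: round(v, 3) for k, v in scores.items()})
# {('X', 'x'): 0.2, ('Y', 'y'): 0.1, ('Z', 'z'): 0.7}
```

Preferring high-scoring variables and values then steers the search toward assignments that tighten the bound fastest, without the bound recomputations that full nogood learning requires.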

Improved Heuristic Periodically reinitialize scores (focus on recent past). Use static entropy-based order as the initial order of variables/values.

Experimental Setup
Intel Core Duo 2.4 GHz and AMD Athlon 64 X2 Dual Core 4600+, both with 4 GB of RAM, running Linux.
A memory limit of 1 GB on each MPE query.
The c2d DNNF compiler [Darwiche, 2004; 2005].
Trivial seed of 0 as the initial MPE solution to start the search.
Network variables are split until the treewidth is <= 10.

Comparing search spaces on grid networks

Comparing search time on grid networks

Comparing nogood learning and score-based DVO on grid networks

Results on grid networks, 25 queries per network

Random networks, 20 queries per network

Networks for genetic linkage analysis, which are among the hardest networks. Only SC-DVO succeeded.

Comparison with SamIam on grid networks

Comparison with (Marinescu & Dechter, 2007) on grid networks (SMBBF: static mini-bucket best-first).
Parameter i=20, where i controls the size of the mini-buckets.
We also tried a few cases from the random and genetic linkage analysis networks, which SMBBF could not solve (4 random networks of sizes 100, 110, 120, and 130, and pedigree13 from the genetic linkage analysis networks).

Conclusion
A novel and efficient heuristic for dynamic variable ordering for computing the MPE in Bayesian networks.
A significant improvement in time and space over less sophisticated heuristics and other MPE tools.
Many hard network instances solved for the first time.