Exploiting Tree-Decomposition in Search: The AND/OR Paradigm

Slides:



Advertisements
Similar presentations
Verification/constraints workshop, 2006 From AND/OR Search to AND/OR BDDs Rina Dechter Information and Computer Science, UC-Irvine, and Radcliffe Institue.
Advertisements

Tree Clustering for Constraint Networks 1 Chris Reeson Advanced Constraint Processing Fall 2009 By Rina Dechter & Judea Pearl Artificial Intelligence,
Variational Methods for Graphical Models Micheal I. Jordan Zoubin Ghahramani Tommi S. Jaakkola Lawrence K. Saul Presented by: Afsaneh Shirazi.
ICS-271:Notes 5: 1 Lecture 5: Constraint Satisfaction Problems ICS 271 Fall 2008.
Lauritzen-Spiegelhalter Algorithm
Anagh Lal Tuesday, April 08, Chapter 9 – Tree Decomposition Methods- Part II Anagh Lal CSCE Advanced Constraint Processing.
MURI Progress Report, June 2001 Advances in Approximate and Hybrid Reasoning for Decision Making Under Uncertainty Rina Dechter UC- Irvine Collaborators:
Constraint Optimization Presentation by Nathan Stender Chapter 13 of Constraint Processing by Rina Dechter 3/25/20131Constraint Optimization.
Recursive Conditioning Adnan Darwiche Computer Science Department UCLA.
Probabilistic networks Inference and Other Problems Hans L. Bodlaender Utrecht University.
Overview of Inference Algorithms for Bayesian Networks Wei Sun, PhD Assistant Research Professor SEOR Dept. & C4I Center George Mason University, 2009.
Introduction Combining two frameworks
1 Hybrid of search and inference: time- space tradeoffs chapter 10 ICS-275 Spring 2007.
SampleSearch: A scheme that searches for Consistent Samples Vibhav Gogate and Rina Dechter University of California, Irvine USA.
M. HardojoFriday, February 14, 2003 Directional Consistency Dechter, Chapter 4 1.Section 4.4: Width vs. Local Consistency Width-1 problems: DAC Width-2.
Stochastic greedy local search Chapter 7 ICS-275 Spring 2007.
Ryan Kinworthy 2/26/20031 Chapter 7- Local Search part 2 Ryan Kinworthy CSCE Advanced Constraint Processing.
1 Bayesian Networks Chapter ; 14.4 CS 63 Adapted from slides by Tim Finin and Marie desJardins. Some material borrowed from Lise Getoor.
AND/OR Search for Mixed Networks #CSP Robert Mateescu ICS280 Spring Current Topics in Graphical Models Professor Rina Dechter.
1 Variable Elimination Graphical Models – Carlos Guestrin Carnegie Mellon University October 11 th, 2006 Readings: K&F: 8.1, 8.2, 8.3,
Probabilistic Networks Chapter 14 of Dechter’s CP textbook Speaker: Daniel Geschwender April 1, 2013 April 1&3, 2013DanielG--Probabilistic Networks1.
Two Approximate Algorithms for Belief Updating Mini-Clustering - MC Robert Mateescu, Rina Dechter, Kalev Kask. "Tree Approximation for Belief Updating",
Daniel Kroening and Ofer Strichman 1 Decision Procedures An Algorithmic Point of View BDDs.
Stochastic greedy local search Chapter 7 ICS-275 Spring 2009.
Exact Inference in Bayes Nets. Notation U: set of nodes in a graph X i : random variable associated with node i π i : parents of node i Joint probability:
Join-graph based cost-shifting Alexander Ihler, Natalia Flerova, Rina Dechter and Lars Otten University of California Irvine Introduction Mini-Bucket Elimination.
Foundations of Constraint Processing, Spring 2009 Structure-Based Methods: An Introduction 1 Foundations of Constraint Processing CSCE421/821, Spring 2009.
1 CMSC 671 Fall 2001 Class #20 – Thursday, November 8.
Exact Inference 275b.
Lecture 7: Constrained Conditional Models
Constraint Programming
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
Computer Science cpsc322, Lecture 13
Structure-Based Methods Foundations of Constraint Processing
Consistency Methods for Temporal Reasoning
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
The minimum cost flow problem
Backtracking And Branch And Bound
Lecture 7 Constraint Satisfaction Problems
Constraint Optimization And counting, and enumeration 275 class
Constraint Optimization And counting, and enumeration 275 class
Irina Rish IBM T.J.Watson Research Center
CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12
Constraint Propagation
Computer Science cpsc322, Lecture 13
Structure-Based Methods Foundations of Constraint Processing
Anytime Anyspace AND/OR Best-first Search for Bounding Marginal MAP
From AND/OR Search to AND/OR BDDs
Readings: K&F: 15.1, 15.2, 15.3, 15.4, 15.5 K&F: 7 (overview of inference) K&F: 8.1, 8.2 (Variable Elimination) Structure Learning in BNs 3: (the good,
CAP 5636 – Advanced Artificial Intelligence
Structure-Based Methods Foundations of Constraint Processing
Constraint Satisfaction Problems
Structure-Based Methods Foundations of Constraint Processing
Backtracking search: look-back
Chapter 11 Limitations of Algorithm Power
Class #19 – Tuesday, November 3
CS 188: Artificial Intelligence
Chapter 5: General search strategies: Look-ahead
Advanced consistency methods Chapter 8
Context-specific independence
CS 188: Artificial Intelligence Spring 2006
Variable Elimination Graphical Models – Carlos Guestrin
Directional consistency Chapter 4
Structure-Based Methods Foundations of Constraint Processing
An Introduction Structure-Based Methods
Minimax strategies, alpha beta pruning
Structure-Based Methods Foundations of Constraint Processing
Iterative Join Graph Propagation
Consistency algorithms
Minimax Trees: Utility Evaluation, Tree Evaluation, Pruning
Presentation transcript:

Exploiting Tree-Decomposition in Search: The AND/OR Paradigm Rina Dechter Bren School of ICS University of California, Irvine We all know for quite sometime that graph parameters such as tree-width play a central role in query processing algorithm in graphical models. The algorithms that are tree-width sensitve are inference-type algorithms such as : varibale-elimination and tree –clustering. They all exploit the decomposition of The problem structure into a tree of subproblems. But so far tree-decomposition did not play the same role in search. Various improved methods can be interpreted as Being graph-structure sensitive but there is no unified approach. IN this talk I will present a unifying scheme for accounting for tree-decomposition in search that is applicable to any graphical model. Collaborators: Kalev Kask Radu Marinescu Robert Mateescu Vibhav Gogate

Outline Background: Search and inference in Graphical models AND/OR search trees Pseudo spanning trees, dfs trees AND/OR search graphs Tree-width and path-width bounds Relationship to algorithmic and compilation schemes. Vienna, Dec 2004

Constraint networks The main task: 1 2 3 A B 1 2 3 A A C 1 2 3 B C The main task: Determine if the problem has a solution (an assignment that satisfies all the relations); If yes, find one or all of them. D F B C F 1 2 3 G A B D 1 2 3 D F G 1 2 3 Vienna, Dec 2004

Probabilistic Networks P(D|C,B) P(B|S) P(S) P(X|C,S) P(C|S) Smoking lung Cancer Bronchitis CPD: C B D=0 D=1 0 0 0.1 0.9 0 1 0.7 0.3 1 0 0.8 0.2 1 1 0.9 0.1 X-ray Dyspnoea Belief networks provide a formalism for reasoning under uncertainty. A belief network is defined by a directed Acyclic graph over nodes representing variable of interest (e.g., temperature of a devise, the gender of a patient, His smoking habits and his illnesses and symptoms). The arcs signify the existence of direct influences between Linked variables, and the strength of those influences are quantified by conditional probabilities. P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B) P(S|d)= ? MPE: argmax P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B) Vienna, Dec 2004

Graphical models A D B C E F A graphical model (X,D,C): X = {X1,…Xn} variables D = {D1, … Dn} domains C = {F1,…,Ft} functions (constraints, CPTS, cnfs) Path width Vienna, Dec 2004

Graphical models A D B C E F A graphical model (X,D,C): X = {X1,…Xn} variables D = {D1, … Dn} domains C = {F1,…,Ft} functions (constraints, CPTS, cnfs) Operators: combine and project: Path width MPE: maxX j Pj CSP: X j Cj Max-CSP: minX j Fj Belief updating: X-y j Pi All these tasks are NP-hard  identify special cases  approximate Vienna, Dec 2004

Solution Techniques Search: Conditioning Inference: Elimination Incomplete Simulated Annealing Complete Dfs search, Branch and bound, A* Gradient Descent Incomplete It is convenient to view solution algorithms for automated reasoning problems, as being of two primary types: search, like branch and bound search, or inference: like variable elimination, tree-clustering, resolution, etc. They have complementary computational properties: search is time exponential but linear space Inference is known to be time and space exponential in the tree-width of its graph. HYBRIDS: Much of the current research is focused on hybrides of search and inference. In fact, we can have a spectrum of paramererized search+inference algorithms, whose two extreme are pure search or pure inference. Non-systematic approximations can be distignuished along the same categories: they either approximate search, Or inference or a hybrid. In the past year, our research focused on a variety of approximation algorithm for inference and for search. Our main focus this year was presenting an alternative to the traditional search space for graphical model, the AND/OR search space. This notion is fundamental. It is applicable to any graphical model task, and immediately improve any traversal algorithm in exponential manner. Any algorithm that traverses the traditional search space is guaranteed to be inferior. Indeed, Many of the known advanced algorithms in constraint processing and probabilistic inference, can, in fact be viewed as traversing the AND/OR space. A well known recent example is recursive conditioning. Our framework provides a simple exposition for such algorithms, which otherwise look quite complex. Thus, if there is any one thing you want to carry from our presentation is that the OR search space for graphical model Is obsolete. No one should use it or view search through it from now on. Finally, when incorporating AND/OR search with caching along the hybrid of search and inference they reach the complexity of pure inference. Namely they are Exponential in the tree-width. This is indeed feasible only if we search the AND/OR space. If we search the OR space, No amount of caching of nodes can move us the tree-width complexity. The most we can get is path-width complexity. . Local Consistency Complete Unit Resolution mini-bucket(i) Adaptive Consistency Tree Clustering Dynamic Programming Resolution Inference: Elimination Vienna, Dec 2004

Solution Techniques Search: Conditioning Inference: Elimination Time: exp(n) Space: linear Search: Conditioning Incomplete Simulated Annealing Complete Gradient Descent Time: exp(w*) Space:exp(w*) Hybrids Incomplete Our main focus this year was presenting an alternative to the traditional search space for graphical model, the AND/OR search space. This notion is fundamental. It is applicable to any graphical model task, and immediately improve any traversal algorithm in exponential manner. Any algorithm that traverses the traditional search space is guaranteed to be inferior. Indeed, Many of the known advanced algorithms in constraint processing and probabilistic inference, can, in fact be viewed as traversing the AND/OR space. A well known recent example is recursive conditioning. Our framework provides a simple exposition for such algorithms, which otherwise look quite complex. Thus, if there is any one thing you want to carry from our presentation is that the OR search space for graphical model Is obsolete. No one should use it or view search through it from now on. Finally, when incorporating AND/OR search with caching along the hybrid of search and inference they reach the complexity of pure inference. Namely they are Exponential in the tree-width. This is indeed feasible only if we search the AND/OR space. If we search the OR space, No amount of caching of nodes can move us the tree-width complexity. The most we can get is path-width complexity. . Local Consistency Complete Unit Resolution mini-bucket(i) Adaptive Consistency Tree Clustering Dynamic Programming Resolution Inference: Elimination Vienna, Dec 2004

Solution Techniques AND/OR search Search: Conditioning Time: exp(w* log n) Space: linear Search: Conditioning Incomplete Simulated Annealing Complete Gradient Descent Time: exp(w*) Space:exp(w*) Hybrids: AND-OR(i) Incomplete Our main focus this year was presenting an alternative to the traditional search space for graphical model, the AND/OR search space. This notion is fundamental. It is applicable to any graphical model task, and immediately improve any traversal algorithm in exponential manner. Any algorithm that traverses the traditional search space is guaranteed to be inferior. Indeed, Many of the known advanced algorithms in constraint processing and probabilistic inference, can, in fact be viewed as traversing the AND/OR space. A well known recent example is recursive conditioning. Our framework provides a simple exposition for such algorithms, which otherwise look quite complex. Thus, if there is any one thing you want to carry from our presentation is that the OR search space for graphical model Is obsolete. No one should use it or view search through it from now on. Finally, when incorporating AND/OR search with caching along the hybrid of search and inference they reach the complexity of pure inference. Namely they are Exponential in the tree-width. This is indeed feasible only if we search the AND/OR space. If we search the OR space, No amount of caching of nodes can move us the tree-width complexity. The most we can get is path-width complexity. . Local Consistency Space: exp(i) Time: exp(C_i) Complete Unit Resolution mini-bucket(i) Adaptive Consistency Tree Clustering Dynamic Programming Resolution Inference: Elimination Vienna, Dec 2004

Bucket Elimination Adaptive Consistency (Dechter & Pearl, 1987) || RDCB || RACB || RAB RA E A D C B || RDB || RDBE , RCBE || RE Vienna, Dec 2004

Outline Background: Search and inference in Graphical models AND/OR search trees Pseudo spanning trees, dfs trees AND/OR search graphs Tree-width and path-width bounds Relationship to algorithmic and compilation schemes. Vienna, Dec 2004

OR search space Ordering: A B E C D F A F B C E D Vienna, Dec 2004 A B 1 E C F D B A 1 1 Defined by a variable ordering, Fixed but may be dynamic 1 1 1 Vienna, Dec 2004

AND/OR search space Primal graph DFS tree A D B C E F A A D B C E F A AND 1 B OR Backbone tree that will determine the structure of the and/or same as ordering determines for OR Captures independence Stop at level 6, independence between problems When full tree is generated, show again the primal graph. We used dfs tree to drive the and/or AND 1 E OR C AND 1 OR D F AND 1 Vienna, Dec 2004

OR vs AND/OR AND/OR OR Vienna, Dec 2004 A D B C E F A D B C E F E C F 1 AND/OR OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F AND 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E 1 C F D B A E 1 C F D B A Show the dfs tree for the and/or Try to make more clear. Maybe just show the shaded one, skip green and yellow. We capture the ability to skip the shaded trees Only orange nodes are important, the blue are bookkeepping. Overall the size is exponential in the depth of the dfs tree. Show dfs tree OR 1 1 1 Vienna, Dec 2004 1

AND/OR vs. OR AND/OR AND/OR size: exp(4), OR size exp(6) OR B C E F A D B C E F A OR AND 1 AND/OR OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F AND 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 AND/OR size: exp(4), OR size exp(6) E 1 C F D B A E 1 C F D B A E 1 C F D B A Show the dfs tree for the and/or Try to make more clear. Maybe just show the shaded one, skip green and yellow. We capture the ability to skip the shaded trees Only orange nodes are important, the blue are bookkeepping. Overall the size is exponential in the depth of the dfs tree. Show dfs tree OR Vienna, Dec 2004

AND/OR vs. OR AND/OR OR Vienna, Dec 2004 No-goods (A=1,B=1) (B=0,C=0) F A D B C E F A OR AND 1 AND/OR OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F AND 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 E 1 C F D B A E 1 C F D B A E 1 C F D B A Show the dfs tree for the and/or Try to make more clear. Maybe just show the shaded one, skip green and yellow. We capture the ability to skip the shaded trees Only orange nodes are important, the blue are bookkeepping. Overall the size is exponential in the depth of the dfs tree. Show dfs tree OR Vienna, Dec 2004

AND/OR vs. OR AND/OR OR Vienna, Dec 2004 (A=1,B=1) (B=0,C=0) A D B C E F A D B C E F OR A AND 1 AND/OR OR B B AND 1 OR E C E C E C AND 1 1 1 1 1 1 OR D F D F D F D F AND 1 1 1 1 1 1 1 1 A A A Show the dfs tree for the and/or Try to make more clear. Maybe just show the shaded one, skip green and yellow. We capture the ability to skip the shaded trees Only orange nodes are important, the blue are bookkeepping. Overall the size is exponential in the depth of the dfs tree. Show dfs tree 1 1 1 OR B B B 1 1 1 E E E 1 1 1 1 1 1 1 1 1 C C C 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 D D D 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Vienna, Dec 2004 F F F 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

OR space vs. AND/OR space width height OR space AND/OR space time(sec.) nodes backtracks AND nodes OR nodes 5 10 3.154 2,097,150 1,048,575 0.03 10,494 5,247 4 9 3.135 0.01 5,102 2,551 3.124 8,926 4,463 3.125 0.02 7,806 3,903 13 3.104 0.1 36,510 18,255 8,254 4,127 6 6,318 3,159 7,134 3,567 3.114 0.121 37,374 18,687 7,326 3,663 Vienna, Dec 2004

From DFS trees to pseudo-trees (Freuder 85, Bayardo 95) 1 4 1 6 2 3 2 7 5 3 (a) Graph 4 7 1 1 2 7 3 5 5 3 5 4 2 7 6 6 4 6 (b) DFS tree depth=3 (c) pseudo- tree depth=2 (d) Chain depth=6 Vienna, Dec 2004

From DFS trees to Pseudo-trees OR 1 AND a b c OR 2 7 AND a b c a b c OR 3 3 3 5 5 5 DFS tree depth = 3 AND a b c a b c a b c a b c a b c a b c OR 4 4 4 4 4 4 4 4 4 6 6 6 6 6 6 6 6 6 AND a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a b c a 1 3 4 b c OR AND 2 5 7 6 pseudo- tree depth = 2 Vienna, Dec 2004

Finding min-depth backbone trees Finding min depth DFS, or pseudo tree is NP-complete, but: Given a tree-decomposition whose tree-width is w*, there exists a pseudo tree T of G whose depth, satisfies (Bayardo and Mirankar, 1996, bodlaender and Gilbert, 91): m <= w* log n, Vienna, Dec 2004

Generating pseudo-trees from Bucket trees ABE A ABC AB BDE AEF bucket-A bucket-E bucket-B bucket-C bucket-D bucket-F B (AB) B C (AC) (BC) C F A C B D E E (AE) (BE) (AE) E D (BD) (DE) D F (AF) (EF) F d: A B C E D F Bucket-tree based on d Induced graph Bucket-tree E A C B D F AND 1 B OR C E D F A Bucket-tree used as pseudo-tree AND/OR search tree Vienna, Dec 2004

Pseudo Trees vs. DFS Trees Model (DAG) w* Pseudo Tree avg. depth DFS Tree avg. depth (N=50, P=2) 9.54 16.82 36.03 (N=50, P=3) 16.1 23.34 40.6 (N=50, P=4) 20.91 28.31 43.19 (N=100, P=2) 18.3 27.59 72.36 (N=100, P=3) 30.97 41.12 80.47 (N=100, P=4) 40.27 50.53 86.54 N = number of nodes, P = number of parents. MIN-FILL ordering. 100 instances. Vienna, Dec 2004

AND/OR search tree for graphical models The AND/OR search tree of R relative to a spanning-tree, T, has: Alternating levels of: OR nodes (variables) and AND nodes (values) Successor function: The successors of OR nodes X are all its consistent values along its path The successors of AND <X,v> are all X child variables in T A solution is a consistent subtree Task: compute the value of the root node A D B C E F A D B C E F A B E OR AND C D 1 F And/or is well known in the literature. Here is our specific def for graphical models Define and/or for graphical models. Make it clear that we have a definition. Add a slide for cost of solution. Different task define different costs Make all specific for graphical models Move this after pseudo-trees. Make title And/or search tree for graphical models. Highlight a solution. For various tasks we associate different value functions, and want to compute the value of the root. Then have a slide about all the tasks. Linear space and w* log n. similar to RC and others Vienna, Dec 2004

AND/OR Search-tree properties (k = domain size, m = pseudo-tree depth AND/OR Search-tree properties (k = domain size, m = pseudo-tree depth. n = number of variables) Theorem: Any AND/OR search tree based on a pseudo-tree is sound and complete (expresses all and only solutions) Theorem: Size of AND/OR search tree is O(n km) Size of OR search tree is O(kn) Theorem: Size of AND/OR search tree can be bounded by O(exp(w* log n)) Related to: (Freuder 85; Dechter 90, Bayardo et. al. 96, Darwiche 1999, Bacchus 2003) When the pseudo-tree is a chain we get an OR space Vienna, Dec 2004

Tasks and value of nodes V( n) is the value of the tree T(n) for the task: Consistency: v(n) is 0 if T(n) inconsistent, 1 othewise. Counting: v(n) is number of solutions in T(n) Optimization: v(n) is the optimal solution in T(n) Belief updating: v(n), probability of evidence in T(n). Partition function: v(n) is the total probability in T(n). Goal: compute the value of the root node recursively using dfs search of the AND/OR tree. Theorem: Complexity of AO dfs search is Space: O(n) Time: O(n km) Time: O(exp(w* log n)) Vienna, Dec 2004

DFS algorithm (#CSP example) B C E F A D B C E F solution OR AND B 1 E C D F 2 4 5 6 11 A B 4 OR node: Marginalization operator (summation) E 2 C 2 AND node: Combination operator (product) Value of each node is the number of solutions below it … If we cache, we search the graph, so all the algorithms have these time and space complexities. Add another slide Linear space Full caching + anything in between 1 2 1 1 1 D 2 F 1 D 2 F 1 1 1 1 1 1 1 1 1 Value of node = number of solutions below it Vienna, Dec 2004

From Search Trees to Search Graphs Any two nodes that root identical subtrees/subgraphs can be merged Minimal AND/OR search graph: closure under merge of the AND/OR search tree Inconsistent subtrees can be pruned too. Some portions can be collapsed or reduced. We may have two nodes that are identical, then we can merge. Highlight: rooting the same tree. Maybe an example of and/or tree, show animation that we merge. Minimal is when we remove all redundancy Vienna, Dec 2004

AND/OR Tree Vienna, Dec 2004 A J A F B B C K E C E D D F G G H J H K 1 OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K AND OR AND OR AND Vienna, Dec 2004

An AND/OR graph Vienna, Dec 2004 A J A F B B C K E C E D D F G G H J H 1 OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F =============================================================== More information without talking about it? Do it on the original graph. Reason by looking at the induced graph and talk about independency. Context is dependent on separator set, which is bounded by w*, which yields the theorem We don’t need to generate the tree and then merge, but can do it while we search. We can generate minimal graph in time linear in its size (linear in the output) Context notion provides an algorithm which is linear in the size of the output. AND 1 1 OR G G J J AND 1 1 1 1 OR H H H H K K K K AND Vienna, Dec 2004 1 1 1 1 1 1 1 1

Context based caching Caching is possible when context is the same context = parent-separator set in induced pseudo-graph = current variable + parents connected to subtree below A D B C E F G H J K context(B) = {A, B} context(c) = {A,B,C} context(D) = {D} context(F) = {F} Vienna, Dec 2004

Induced-width of pseudo-trees The induced-width of a pseudo-tree is its induced-width along a dfs order that includes pseudo arcs. For pseudo-chains induced-width is path-width (yielding path-decomposition) Contexts are bounded by the induced-width of the pseudo tree. Min induced-pseudo-width=tree-width E A C B D F E A C B D F E A C B D F F A C B D E Good pseudo-tree Bad pseudo-tree A graph DFS order of both pseudo trees: d = A B C E D F Vienna, Dec 2004

Induced-width of pseudo-trees The induced-width of a pseudo-tree is its induced-width along a dfs order that includes pseudo arcs. For pseudo-chains induced-width is path-width (yielding path-decomposition) Contexts are bounded by the induced-width of the pseudo tree. Min induced-pseudo-width=tree-width E A C B D F E A C B D F E A C B D F F A C B D E Good pseudo-tree Bad pseudo-tree A graph DFS order of both pseudo trees: d = A B C E D F Vienna, Dec 2004

Caching context(D)={D} context(F)={F} Vienna, Dec 2004 A J A F B B C K OR A H K AND 1 OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K G H 1 1 G H J K 1 1 J K AND OR AND OR AND Vienna, Dec 2004

Caching context(D)={D} context(F)={F} Vienna, Dec 2004 A J A F B B C K OR A H K AND 1 OR B B AND 1 1 OR E C E C E C E C AND 1 1 1 1 1 1 1 1 OR D F D F D F D F D F D F D F D F =============================================================== More information without talking about it? Do it on the original graph. Reason by looking at the induced graph and talk about independency. Context is dependent on separator set, which is bounded by w*, which yields the theorem We don’t need to generate the tree and then merge, but can do it while we search. We can generate minimal graph in time linear in its size (linear in the output) Context notion provides an algorithm which is linear in the size of the output. AND 1 1 OR G G J J AND 1 1 1 1 OR H H H H K K K K AND Vienna, Dec 2004 1 1 1 1 1 1 1 1

Size of minimal AND/OR context graphs Theorem: Minimal AND/OR context graph is bounded exponentially by its pseudo-tree induced-width. The tree-width of a pseudo-chain is path-width (pw)  Minimal OR search graph is O(exp(pw*)).  Minimal AND/OR graph is O(exp(w*)) Always, w* ≤ pw*, but pw*<= w* log n Vienna, Dec 2004

OR tree vs. OR graph OR tree OR graph Vienna, Dec 2004 E C F D B A E C 1 C F D B A OR tree E 1 C F D B A 1 E C F D B A E 1 C F D B A E 1 C F D B A E 1 C F D B A E 1 C F D B A E 1 C F D B A OR graph Vienna, Dec 2004

AND/OR tree vs. AND/OR graph AND 1 B E C D F AND/OR tree A OR AND 1 B E C D F A OR AND 1 B E C D F A OR AND 1 B E C D F A OR AND 1 B E C D F AND/OR graph Vienna, Dec 2004

All four search spaces Vienna, Dec 2004 A A B B E E C C D D F F 1 1 E C F D B A Full OR search tree Context minimal OR search graph AND 1 B OR E C D F A OR AND 1 B E C D F Vienna, Dec 2004 Full AND/OR search tree Context minimal AND/OR search graph

All four search spaces Vienna, Dec 2004 A A B B E E C C D D F F 1 1 E C F D B A Full OR search tree Context minimal OR search graph Time-space AND 1 B OR E C D F A OR AND 1 B E C D F Vienna, Dec 2004 Full AND/OR search tree Context minimal AND/OR search graph

AND/OR vs OR Graphs Theorem: for balanced w-trees w*<=pw*<=m*<=w*log n (Bodlaendar et. Al, 1991) Theorem: for balanced w-trees 1) and-orTree-size = orGraph-size 2) Vienna, Dec 2004

Uniqueness of minimal AND/OR graph Theorem: Given a pseudo-tree, the minimal AND/OR search graph is unique for all graph-models that are consistent with that pseudo tree. Related to compilation schemes: Minimal OR – related to OBDDs (McMillan) Minimal AND/OR – related to tree-OBDDs (McMillan 94), AND/OR graphs related to d-DNNF (Darwiche et. Al. 2002) Vienna, Dec 2004

Searching AND/OR Graphs AO(i): searches depth-first, cache i-context i = the max size of a cache table (i.e. number of variables in a context) i=0 i=w* Space: O(n) Time: O(exp(w* log n)) Space: O(exp w*) Time: O(exp w*) AO(i) time complexity? Vienna, Dec 2004

AND/OR w-cutset 3-cutset 2-cutset 1-cutset Vienna, Dec 2004 A C B K G L D F H M J E A C B K G L D F H M J E 3-cutset A C B K G L D F H M J E 2-cutset A C B K G L D F H M J E 1-cutset Vienna, Dec 2004

AND/OR w-cutset grahpical model pseudo tree 1-cutset tree B K G L D F H M J E A C B K G L D F H M J E A C B K G L D F H M J E grahpical model pseudo tree 1-cutset tree Vienna, Dec 2004

Searching AND/OR Graphs AO(i): searches depth-first, cache i-context i = the max size of a cache table (i.e. number of variables in a context) i=0 i i=w* Space: O(n) Time: O(exp(w* log n)) Space: O(exp w*) Time: O(exp w*) Space: O(exp(i) ) Time: O(exp(m_i+i ) Vienna, Dec 2004

Overview Introduction and background for graphical models: inference and search Inference: Exact: Tree-clustering, variable elimination, Approximate: mini-bucket, belief propagation AND/OR search spaces for graphical models mixed networks Empirical evaluation over mixed networks, counting Vienna, Dec 2004

Mixed Networks Belief Network Constraint Network Moral mixed graph A A 1 F B C D=0 D=1 .2 .8 1 .1 .9 .3 .7 .5 B C We often have both probabilistic information and constraints. Our approach is to allow the two representation to co-exist, explicitlly. It has virtues for user interface and for computation. E D Vienna, Dec 2004

Using look-ahead – example (1) B C E F G H I K A D B C E F G H I K > All domains are {1,2,3,4} Vienna, Dec 2004

Using look-ahead CONSTRAINTS ONLY FORWARD CHECKING B C E F G H I K > 1 2 A C 3 4 B E D H G I F K Domains are {1,2,3,4} CONSTRAINTS ONLY 1 2 A C 3 B E D H G 4 I F K FORWARD CHECKING 1 A C B 2 E D 3 H 4 I F K G MAINTAINING ARC CONSISTENCY Vienna, Dec 2004

Experiments – mixed networks Parameters of the mixed networks: N = number of variables K = number of values per variable Belief network - (N,K,R,P): R = number of root nodes P = number of parents Measures: Time Number of nodes Number of dead-ends Constraint network - (N,K,C,S,t): C = number of constraints S = scope size of the constraints t = tightness (no. of disallowed tuples) Algorithms: AO-C constraint checking only AO-FC forward checking AO-RFC relational forward checking OR vs. AND/OR Spaces Vienna, Dec 2004

#CSP N40, K3, C50, P3, 20 inst, w*=13, depth=20 Time (seconds) AND/OR search is orders of magnitudes faster than OR search when problems are loose (consistent) AND/OR with full caching equivalent to BE: time and space O(exp w*) Vienna, Dec 2004

Mixed networks results (1) Caching helps more on loose problems Constraint propagation helps more on tight problems Vienna, Dec 2004

Conclusion AND/OR search spaces are a unifying framework for search or compilation applicable to any graphical models. With caching AND/OR is similar to inference (minimal graphs) AND/OR time and space bounds are equal to state of the art algorithms Empirical results AND/OR search spaces are always more effective than traditional OR spaces AND/OR allows a flexible tradeoff between space and time Graphical models should always use AND/OR search with embeded inference. Current work: Hybrid of inference and search: Heuristic generation and Branch and Bound Vienna, Dec 2004