Constraint Optimization, Counting, and Enumeration (275 Class)

Presentation transcript:

Chapter 13: Constraint Optimization, Counting, and Enumeration (275 Class)

Outline
- Introduction
- Optimization tasks for graphical models
- Solving optimization problems with inference and search
  - Inference: bucket elimination (dynamic programming); mini-bucket elimination
  - Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
  - Hybrids of search and inference: cutset decomposition; super-bucket scheme

Constraint Satisfaction. Example: map coloring. Variables: countries (A, B, C, etc.). Values: colors (e.g., red, green, yellow). Constraints: adjacent countries must be colored differently, given as tables of allowed pairs (e.g., for A and B: (red, green), (red, yellow), (green, red), (green, yellow), (yellow, green), (yellow, red)). [Figure: constraint graph over variables A-G.] Tasks: decide consistency; find one solution, all solutions, or count the solutions.
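To make these tasks concrete, here is a minimal backtracking solver that both finds and counts solutions. This is a sketch: the adjacency map below is a hypothetical stand-in, since the slide's actual map appears only as a figure.

```python
# Minimal backtracking solver for map coloring (hypothetical adjacencies).
COLORS = ["red", "green", "yellow"]
NEIGHBORS = {  # assumed adjacency; the slide's actual map is not recoverable
    "A": ["B", "C", "D"], "B": ["A", "C", "E"], "C": ["A", "B", "D", "E"],
    "D": ["A", "C", "F"], "E": ["B", "C", "G"], "F": ["D", "G"], "G": ["E", "F"],
}

def solve(assignment, variables):
    if not variables:                      # all countries colored: one solution
        return [dict(assignment)]
    var, rest = variables[0], variables[1:]
    solutions = []
    for color in COLORS:
        if all(assignment.get(n) != color for n in NEIGHBORS[var]):
            assignment[var] = color
            solutions.extend(solve(assignment, rest))  # gather all solutions
            del assignment[var]
    return solutions

all_sols = solve({}, list(NEIGHBORS))
print(len(all_sols), "solutions; one of them:", all_sols[0] if all_sols else None)
```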

Propositional Satisfiability. Example: φ = {(¬C), (A ∨ B ∨ C), (¬A ∨ B ∨ E), (¬B ∨ C ∨ D)}.
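A brute-force satisfiability check of this formula, as a sketch: clauses are encoded as lists of (variable, sign) literals, with sign False standing for negation.

```python
from itertools import product

# Each clause is a list of (variable, sign) pairs; sign False means negated.
PHI = [[("C", False)],
       [("A", True), ("B", True), ("C", True)],
       [("A", False), ("B", True), ("E", True)],
       [("B", False), ("C", True), ("D", True)]]
VARS = ["A", "B", "C", "D", "E"]

def satisfies(model):
    # A clause is satisfied if some literal matches the model's truth value.
    return all(any(model[v] == sign for v, sign in clause) for clause in PHI)

models = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=5)]
sat = [m for m in models if satisfies(m)]
print("satisfiable:", bool(sat), "| #models:", len(sat))
```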

Constraint Optimization Problems for Graphical Models. [Figure: cost table for f(A,B,D), which has scope {A,B,D}.] A finite COP is defined by a set of discrete variables X with their domains of values D and a set of cost functions defined on subsets of the variables. We also have a global cost function defined over all the variables, which has to be optimized (minimized or maximized). For the remainder of the talk we assume the global cost function is the sum of the individual cost functions, and the objective is to minimize it. On the right-hand side is the graphical representation of a COP instance, where nodes correspond to variables and an edge connects any two nodes that appear in the scope of the same function.

Constraint Optimization Problems for Graphical Models (cont.). Primal graph: variables become nodes; functions and constraints become arcs, so an edge connects any two variables that appear in the same scope. Example: f1(A,B,D), f2(D,F,G), f3(B,C,F) over variables A, B, C, D, F, G, with global cost F(a,b,c,d,f,g) = f1(a,b,d) + f2(d,f,g) + f3(b,c,f).
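A small sketch of this setup: the global cost F is just the sum of the local functions evaluated on their scopes, and a brute-force minimizer makes the task explicit. The numeric cost tables are placeholders, since the slide's values are not recoverable.

```python
from itertools import product
import random

random.seed(0)
DOMAIN = [0, 1]
# Hypothetical cost tables for f1(A,B,D), f2(D,F,G), f3(B,C,F).
f1 = {k: random.randint(1, 5) for k in product(DOMAIN, repeat=3)}
f2 = {k: random.randint(1, 5) for k in product(DOMAIN, repeat=3)}
f3 = {k: random.randint(1, 5) for k in product(DOMAIN, repeat=3)}

def F(a, b, c, d, f, g):
    # Global cost = sum of the local cost functions on their scopes.
    return f1[(a, b, d)] + f2[(d, f, g)] + f3[(b, c, f)]

# Brute-force minimization over all 2^6 assignments (exponential; inference
# and search aim to do better by exploiting the graph structure).
best = min(product(DOMAIN, repeat=6), key=lambda t: F(*t))
print("optimal assignment:", best, "cost:", F(*best))
```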

Constrained Optimization Example: power plant scheduling

Probabilistic Networks. Variables: Smoking (S), Bronchitis (B), Cancer (C), X-Ray (X), Dyspnoea (D), with conditional probability tables P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B). [Table: CPT for P(D|C,B), indexed by C and B with columns D=0 and D=1.] Joint distribution: P(S,C,B,X,D) = P(S) · P(C|S) · P(B|S) · P(X|C,S) · P(D|C,B).
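The factored joint can be evaluated directly from the CPT tables. The numbers below are illustrative only; the slide's CPT values are not fully recoverable.

```python
# Hypothetical CPTs for the Smoking network; all variables are binary (0/1).
P_S = {1: 0.3, 0: 0.7}
P_C_S = {(1, 1): 0.20, (0, 1): 0.80, (1, 0): 0.05, (0, 0): 0.95}   # P(C|S)
P_B_S = {(1, 1): 0.60, (0, 1): 0.40, (1, 0): 0.30, (0, 0): 0.70}   # P(B|S)
P_X_CS = {(x, c, s): (0.9 if x == c else 0.1)                      # P(X|C,S)
          for x in (0, 1) for c in (0, 1) for s in (0, 1)}
P_D_CB = {(d, c, b): (0.8 if d == max(c, b) else 0.2)              # P(D|C,B)
          for d in (0, 1) for c in (0, 1) for b in (0, 1)}

def joint(s, c, b, x, d):
    """P(S,C,B,X,D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)."""
    return (P_S[s] * P_C_S[(c, s)] * P_B_S[(b, s)]
            * P_X_CS[(x, c, s)] * P_D_CB[(d, c, b)])

print(joint(1, 1, 0, 1, 1))  # probability of one full configuration
```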

Outline
- Introduction
- Optimization tasks for graphical models
- Solving by inference and search
  - Inference: bucket elimination (dynamic programming), tree clustering; mini-bucket elimination, belief propagation
  - Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
  - Hybrids of search and inference: cutset decomposition; super-bucket scheme

Computing MPE (Variable Elimination). [Figure: the "moral" graph over A, B, C, D, E.]
MPE = max_{a,c,d,e,b} P(a) · P(b|a) · P(c|a) · P(d|b,a) · P(e|b,c) = max_{a,c,d,e} P(a) · P(c|a) · max_b [P(b|a) · P(d|b,a) · P(e|b,c)]:
the maximization over B is pushed past the functions that do not mention B, which is the basic step of variable elimination.

Algorithm elim-mpe (Dechter, 1996), an instance of non-serial dynamic programming (Bertelè and Brioschi, 1972). The elimination operator is maximization. Along the ordering A, E, D, C, B, each function is placed in the bucket of its highest-ordered variable:
bucket B: P(b|a), P(d|b,a), P(e|b,c)
bucket C: P(c|a)
bucket D:
bucket E: e = 0 (evidence)
bucket A: P(a)
Buckets are processed from B down to A: the bucket's functions are multiplied, its variable is maximized out, and the resulting function is placed in the bucket of its highest remaining variable; the MPE value is read off in bucket A.

Generating the MPE tuple: once all buckets have been processed, assign the variables in the order A, E, D, C, B; in each bucket, choose the value of the bucket's variable that maximizes the product of the functions in that bucket, given the previously assigned values.

Complexity of elim-mpe on this example: processing a bucket is exponential in the number of variables in its scope, so with the ordering above the cost is exp(W* = 4), where W* is the induced width of the ordering (the maximal clique size of the induced graph); bucket B, containing P(b|a), P(d|b,a), and P(e|b,c), generates a function over the four variables A, C, D, E.
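A compact sketch of elim-mpe over table-based functions with explicit domains: buckets are processed from the last variable in the ordering to the first, each bucket's functions are multiplied, and the bucket's variable is maximized out. Function and variable names here are illustrative, not the slide's exact data structures.

```python
from itertools import product

def max_out(var, funcs, domains):
    """Combine (scope, table) funcs by product and maximize out `var`."""
    scope = sorted({v for s, _ in funcs for v in s} - {var})
    table = {}
    for vals in product(*(domains[v] for v in scope)):
        asg = dict(zip(scope, vals))
        best = 0.0                      # values are nonnegative probabilities
        for x in domains[var]:
            asg[var] = x
            p = 1.0
            for s, t in funcs:
                p *= t[tuple(asg[v] for v in s)]
            best = max(best, p)
        table[vals] = best
    return tuple(scope), table

def elim_mpe(functions, ordering, domains):
    """Process buckets from the last variable in `ordering` to the first."""
    pool = list(functions)
    for var in reversed(ordering):
        bucket = [f for f in pool if var in f[0]]   # functions mentioning var
        pool = [f for f in pool if var not in f[0]]
        if bucket:
            pool.append(max_out(var, bucket, domains))
    mpe = 1.0
    for _, t in pool:                   # remaining functions have empty scope
        mpe *= t[()]
    return mpe

# Tiny example: MPE of P(A)P(B|A) = max_a P(a) max_b P(b|a) = 0.6 * 0.8 = 0.48
dom = {"A": [0, 1], "B": [0, 1]}
fs = [(("A",), {(0,): 0.4, (1,): 0.6}),
      (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})]
print(elim_mpe(fs, ["A", "B"], dom))
```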

Complexity of bucket elimination: bucket elimination is time O(r · exp(w*+1)) and space O(n · exp(w*)), where r is the number of functions, n the number of variables, and w* the induced width of the ordering. The effect of the ordering: different orderings of the same constraint graph induce different widths (e.g., the orderings A, B, C, D, E versus E, D, C, B, A in the figure). Finding an ordering of smallest induced width is NP-hard.
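The induced width of an ordering can be computed by simulating elimination on the primal graph: process nodes from last to first, count each node's earlier neighbors, and connect them. A sketch; the edge list below is the moral graph of the running example.

```python
def induced_width(edges, ordering):
    """Width of `ordering` on the induced graph built from `edges`."""
    pos = {v: i for i, v in enumerate(ordering)}
    adj = {v: set() for v in ordering}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    width = 0
    for v in reversed(ordering):             # eliminate from last to first
        earlier = {u for u in adj[v] if pos[u] < pos[v]}
        width = max(width, len(earlier))
        for a in earlier:                    # connect v's earlier neighbors
            for b in earlier:
                if a != b:
                    adj[a].add(b)
    return width

moral = [("A","B"),("A","C"),("A","D"),("B","C"),("B","D"),("B","E"),("C","E")]
print(induced_width(moral, ["A","E","D","C","B"]))  # 4, as on the slide
print(induced_width(moral, ["A","B","C","D","E"]))  # a better ordering gives 2
```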

Directional i-consistency. [Figure: the effect of enforcing adaptive consistency, directional path consistency (d-path), and directional arc consistency (d-arc) on a graph over B, C, D, E; stronger consistency levels add more induced edges.]

Mini-bucket approximation (MPE task): split a bucket into mini-buckets to bound complexity. Maximizing each mini-bucket separately can only increase the result, so max_x Π_i h_i ≤ Π_k (max_x Π_{i∈Q_k} h_i) for any partition {Q_k} of the bucket's functions; eliminating the mini-buckets independently therefore yields an upper bound.

Mini-Bucket Elimination (trace). Bucket B is split into the mini-buckets {P(B|A), P(D|A,B)} and {P(E|B,C)}, each maximized over B separately (max_B Π), producing hB(A,D) and hB(C,E). Processing continues: bucket C: P(C|A), hB(C,E) → hC(A,E); bucket D: hB(A,D) → hD(A); bucket E: E = 0, hC(A,E) → hE(A); bucket A: P(A), hD(A), hE(A). The computed value MPE* is an upper bound U on MPE; generating a solution from it yields a lower bound L.

MBE-MPE(i): algorithm approx-mpe (Dechter & Rish, 1997). Input: i, the maximum number of variables allowed in a mini-bucket. Output: [lower bound (the cost of a suboptimal solution), upper bound]. Example: approx-mpe(3) versus elim-mpe.

Properties of MBE(i):
- Complexity: O(r · exp(i)) time and O(exp(i)) space.
- Yields an upper bound and a lower bound.
- Accuracy: measured by the ratio of the upper and lower bounds (U/L).
- As i increases, both accuracy and complexity increase.
Possible uses of mini-bucket approximations: as anytime algorithms, and as heuristics in search. Other tasks: similar mini-bucket approximations exist for belief updating, MAP, and MEU (Dechter and Rish, 1997).
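A sketch of the splitting step itself: a greedy first-fit partition of a bucket's functions into mini-buckets whose combined scopes respect the i-bound. The first-fit strategy is one choice among several; the function format follows the elim-mpe sketch above.

```python
def partition_bucket(bucket, i_bound):
    """Greedy partition of (scope, table) functions into mini-buckets with
    at most i_bound variables each (first-fit; other strategies exist)."""
    minis = []   # list of [scope_union, functions] pairs
    for scope, table in bucket:
        placed = False
        for mb in minis:
            if len(mb[0] | set(scope)) <= i_bound:   # still within the bound
                mb[0] |= set(scope)
                mb[1].append((scope, table))
                placed = True
                break
        if not placed:
            minis.append([set(scope), [(scope, table)]])
    return minis

# Each mini-bucket is then eliminated separately (e.g., with max_out above),
# which yields an upper bound on the exact bucket message for MPE.
bucket_B = [(("A","B"), {}), (("A","B","D"), {}), (("B","C","E"), {})]
for mb in partition_bucket(bucket_B, i_bound=3):
    print(sorted(mb[0]), "->", len(mb[1]), "function(s)")
```

On the slide's bucket B this reproduces the split into {P(B|A), P(D|A,B)} and {P(E|B,C)}.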

Outline
- Introduction
- Optimization tasks for graphical models
- Solving by inference and search
  - Inference: bucket elimination (dynamic programming); mini-bucket elimination
  - Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
  - Hybrids of search and inference: cutset decomposition; super-bucket scheme

The Search Space. The objective function over variables A, B, C, D, E, F is the sum of pairwise cost functions f1(A,B), f2(A,C), f3(A,E), f4(A,F), f5(B,C), f6(B,D), f7(B,E), f8(C,D), f9(E,F). [Figure: the cost tables for f1-f9 and the OR search tree along the ordering A, E, C, B, F, D; the numeric entries are not reproduced here.]

The Search Space (cont.): arc costs. The cost of an arc labeled with an assignment X = x is calculated from the cost components: it is the sum of all cost functions whose scope becomes fully instantiated at X along the ordering. [Figure: the same search tree with arc costs attached.]
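A sketch of this rule; the cost tables below are hypothetical placeholders for the slide's unreadable values.

```python
def arc_cost(var, assignment, cost_functions):
    """Sum the cost functions whose scope is fully assigned only now that
    `var` has a value. `cost_functions` maps scope tuples to cost tables."""
    total = 0
    for scope, table in cost_functions.items():
        if var in scope and all(v in assignment for v in scope):
            total += table[tuple(assignment[v] for v in scope)]
    return total

# Hypothetical tables over binary domains.
fns = {("A", "B"): {(0, 0): 2, (0, 1): 1, (1, 0): 4, (1, 1): 3},
       ("B", "C"): {(0, 0): 1, (0, 1): 2, (1, 0): 4, (1, 1): 1}}
asg = {"A": 0}
asg["B"] = 1
print(arc_cost("B", asg, fns))  # f1(A=0,B=1) = 1; f2 waits until C is assigned
```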

The Value Function. The value of a node is the minimal cost of any solution below it. Values are computed bottom-up: a leaf has value 0, and an internal node's value is the minimum over its children of (arc cost + child value). [Figure: the search tree annotated with node values; the root value is 5.]

An Optimal Solution. Following, from the root, arcs whose (arc cost + child value) equals the node's value traces an optimal solution, here of cost 5. [Figure: the search tree with an optimal solution path highlighted.]

Basic Heuristic Search Schemes. A heuristic evaluation function f(x_p) computes a bound on the best extension of the partial assignment x_p and can be used to guide a heuristic search algorithm. We focus on:
1. Branch and Bound: uses f(x_p) to prune the depth-first search tree; requires only linear space.
2. Best-First Search: always expands the node with the best heuristic value f(x_p); needs lots of memory.

Classic Branch-and-Bound (OR search tree). For a node n, LB(n) = g(n) + h(n), where g(n) is the accumulated cost of the path from the root to n and h(n) is a heuristic lower bound on the cost below n; UB is the cost of the best solution found so far. Prune below n whenever LB(n) ≥ UB.
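A minimal depth-first branch-and-bound sketch following this scheme. The `step_cost` and `h` interfaces are assumptions of this sketch, supplied by the problem; mini-bucket heuristics (below) are one choice for h, which must be admissible (never overestimate remaining cost) for pruning to be safe.

```python
import math

def branch_and_bound(variables, domains, step_cost, h):
    """Depth-first BnB. step_cost(var, val, asg) is the arc cost of extending
    `asg` with var=val; h(asg, remaining) lower-bounds the completion cost."""
    best = {"ub": math.inf, "sol": None}

    def dfs(asg, g, remaining):
        if not remaining:                     # full assignment reached
            if g < best["ub"]:
                best["ub"], best["sol"] = g, dict(asg)
            return
        var = remaining[0]
        for val in domains[var]:
            cost = g + step_cost(var, val, asg)
            asg[var] = val
            if cost + h(asg, remaining[1:]) < best["ub"]:  # prune if LB >= UB
                dfs(asg, cost, remaining[1:])
            del asg[var]

    dfs({}, 0, list(variables))
    return best["ub"], best["sol"]
```

With h ≡ 0 pruning relies on accumulated cost alone; a stronger admissible h prunes much more of the tree.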

How to Generate Heuristics:
- the principle of relaxed models
- linear relaxation for integer programs
- mini-bucket elimination
- bounded directional consistency ideas

Generating Heuristics for Graphical Models (Kask and Dechter, 1999). Given a cost function C(a,b,c,d,e) = f(a) · f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b), define an evaluation function over a partial assignment as the cost of its best extension:
f*(a,e,d) = min_{b,c} C(a,b,c,d,e) = f(a) · min_{b,c} [f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)] = g(a,e,d) · H*(a,e,d).

Generating Heuristics (cont.)
H*(a,e,d) = min_{b,c} f(b,a) · f(c,a) · f(e,b,c) · f(d,a,b)
= min_c [ f(c,a) · min_b [ f(e,b,c) · f(b,a) · f(d,a,b) ] ]
≥ min_c [ f(c,a) · (min_b f(e,b,c)) · (min_b f(b,a) · f(d,a,b)) ]
= min_b [ f(b,a) · f(d,a,b) ] · min_c [ f(c,a) · min_b f(e,b,c) ]
= hB(d,a) · hC(e,a) = H(a,e,d)
Hence f(a,e,d) = g(a,e,d) · H(a,e,d) ≤ g(a,e,d) · H*(a,e,d) = f*(a,e,d): the evaluation function never overestimates the cost of the best extension. The heuristic function H is exactly what is compiled during the preprocessing stage of the Mini-Bucket algorithm.

Static MBE Heuristics. Given a partial assignment x_p, estimate the cost of the best extension to a full solution. The evaluation function f(x_p) can be computed using the functions recorded by the Mini-Bucket scheme. Example (belief network over A, B, C, D, E), with buckets B: P(E|B,C), P(D|A,B), P(B|A); C: P(C|A), hB(E,C); D: hB(D,A); E: hC(E,A); A: P(A), hE(A), hD(A). Then f(a,e,D) = g(a,e) · H(a,e,D) = P(a) · hB(D,a) · hC(e,a), and h is admissible.
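A minimal sketch of this evaluation, under the assumption that every recorded function (original CPT or mini-bucket h) is multiplied in as soon as the partial assignment instantiates its full scope:

```python
def evaluate(partial, g_functions, h_functions):
    """f(x_p) = g(x_p) * H(x_p): g from the original functions over assigned
    variables, H from mini-bucket h-functions bounding the cost of the rest."""
    def product(funcs):
        p = 1.0
        for scope, table in funcs:
            if all(v in partial for v in scope):       # scope fully assigned
                p *= table[tuple(partial[v] for v in scope)]
        return p
    return product(g_functions) * product(h_functions)

# On the slide's example, with x_p = (a, e, D):
#   f(a, e, D) = P(a) * h_B(D, a) * h_C(e, a)
```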

Heuristic Properties. The MB heuristic is monotone and admissible, and is retrieved in linear time. Important: heuristic strength varies with MB(i): a higher i-bound → more pre-processing → a stronger heuristic → less search. This allows a controlled trade-off between preprocessing and search.

Experimental Methodology.
Algorithms: BBMB(i) (Branch and Bound with MB(i)); BBFB(i) (Best-First with MB(i)); MBE(i).
Test networks: random coding (Bayesian); CPCS (Bayesian); random CSPs.
Measures of performance: accuracy given a fixed amount of time (how close the cost found is to the optimal solution); trade-off performance as a function of time.

Empirical evaluation of mini-bucket heuristics on Bayesian coding networks. [Plots: Random Coding, K=100, noise=0.32; Random Coding, K=100, noise=0.28.]

Max-CSP experiments (Kask and Dechter, 2000)

Dynamic MB Heuristics. Rather than pre-compiling them, the mini-bucket heuristics can be generated during search: dynamic mini-bucket heuristics run the Mini-Bucket algorithm to produce a bound for any node in the search space (a partial assignment along the given variable ordering).

Dynamic MB and MBTE Heuristics (Kask, Marinescu and Dechter, 2003). Rather than precompiling, compute the heuristics during search. Dynamic MB: dynamic mini-bucket heuristics use the Mini-Bucket algorithm to produce a bound for any node during search. Dynamic MBTE: heuristics can be computed simultaneously for all uninstantiated variables using mini-bucket-tree elimination (MBTE), an approximation scheme defined over cluster trees that outputs multiple bounds, one for each variable and value extension, at once.

Cluster Tree Elimination: example. [Figure: a cluster tree with clusters 1: ABC, 2: BCDF, 3: BEF, 4: EFG, connected by separators BC, BF, EF, for a problem graph over A-G.]

Mini-Clustering. Motivation: the time and space complexity of Cluster Tree Elimination depends on the induced width w* of the problem; when w* is big, the CTE algorithm becomes infeasible. The basic idea: reduce the size of the clusters (the exponent) by partitioning each cluster into mini-clusters with fewer variables. The accuracy parameter i is the maximum number of variables in a mini-cluster. The same idea was explored for variable elimination (Mini-Bucket).

Idea of Mini-Clustering Split a cluster into mini-clusters => bound complexity

Mini-Clustering: example. [Figure: the cluster tree (1: ABC, 2: BCDF, 3: BEF, 4: EFG) with clusters partitioned into mini-clusters; messages pass over the separators BC, BF, EF.]

Mini-Bucket Tree Elimination. [Figure: the bucket tree (1: ABC, 2: BCDF, 3: BEF, 4: EFG with separators BC, BF, EF) shown side by side with its mini-bucket version.]

Mini-Clustering: correctness and completeness. Algorithm MC(i) computes a bound (or an approximation) for each variable and each of its values. MBTE is the special case obtained when the clusters are the buckets of BTE.

Branch and Bound with Mini-Buckets.
- BB with static mini-bucket heuristics (s-BBMB): heuristic information is pre-compiled before search; static variable ordering; prunes the current variable.
- BB with dynamic mini-bucket heuristics (d-BBMB): heuristic information is assembled during search; static variable ordering; prunes the current variable.
- BB with dynamic mini-bucket-tree heuristics (BBBT): heuristic information is assembled during search; dynamic variable ordering; prunes all future variables.

Empirical Evaluation.
Algorithms: complete: BBBT, BBMB; incomplete: DLM, GLS, SLS, IJGP, IBP (coding).
Measures: time; accuracy (% exact); #backtracks; bit error rate (coding).
Benchmarks: coding networks; Bayesian Network Repository; grid networks (N-by-N); random noisy-OR networks; random networks.

Real-World Benchmarks. [Table: average accuracy and time; 30 samples, 10 observations, 30 seconds.]

Empirical Results: Max-CSP. Random binary problems <N, K, C, T>: N = number of variables, K = domain size, C = number of constraints, T = tightness. Task: Max-CSP.
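A sketch of the <N, K, C, T> random binary model: choose C distinct variable pairs and forbid T random value pairs in each; Max-CSP then minimizes the number of violated constraints. The exact tightness convention (count vs. fraction of forbidden tuples) is an assumption here.

```python
import random
from itertools import product

def random_binary_csp(N, K, C, T, seed=0):
    """Random binary Max-CSP instance: C constraints, each forbidding T pairs."""
    rng = random.Random(seed)
    pairs = rng.sample([(i, j) for i in range(N) for j in range(i + 1, N)], C)
    constraints = {}
    for i, j in pairs:
        all_tuples = list(product(range(K), repeat=2))
        constraints[(i, j)] = set(rng.sample(all_tuples, T))  # forbidden pairs
    return constraints

def violations(assignment, constraints):
    # Max-CSP objective: number of constraints whose forbidden pair is hit.
    return sum((assignment[i], assignment[j]) in forbidden
               for (i, j), forbidden in constraints.items())

cons = random_binary_csp(N=10, K=3, C=20, T=3)
asg = [0] * 10
print("violated constraints:", violations(asg, cons))
```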

BBBT(i) vs. BBMB(i). [Plots: BBBT(i) versus BBMB(i), N=100, for i = 2, 3, 4, 5, 6, 7.]

Searching the Graph: Caching Goods. Identical subproblems can be recognized by their context, the subset of ancestor variables whose assignment completely determines the subproblem below a variable; nodes that share a context are merged, turning the search tree into a graph. Here context(A) = [A], context(B) = [AB], context(C) = [ABC], context(D) = [ABD], context(E) = [AE], context(F) = [F]. [Figure: the context-minimal search graph over the cost functions f1-f9; numeric arc costs omitted.]

Searching the Graph: Caching Goods (cont.). Node values are computed on the context-minimal graph exactly as on the tree, but each distinct context is evaluated only once and its value is cached. [Figure: the search graph annotated with node values; the root value, 5, is the optimal cost.]
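A sketch of the caching mechanism, assuming a `contexts` map like the one on the slide; `subproblem_value` is a hypothetical callback that computes the value below a variable and recurses through `solve` so results are shared.

```python
def search_with_caching(variables, domains, contexts, subproblem_value):
    """Context-based caching (a sketch). `contexts[var]` lists the variables
    whose assigned values determine the subproblem rooted at `var`;
    `subproblem_value(var, asg, solve)` computes the value below `var`,
    recursing through `solve` so identical subproblems are computed once."""
    cache = {}

    def solve(var, asg):
        key = (var, tuple(asg[v] for v in contexts[var]))
        if key not in cache:            # identical subproblems merge here
            cache[key] = subproblem_value(var, asg, solve)
        return cache[key]

    return solve
```

With context(E) = [A, E], for example, all tree nodes for E that agree on A and E collapse into a single cached entry, as in the figure.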

Outline
- Introduction
- Optimization tasks for graphical models
- Solving by inference and search
  - Inference: bucket elimination (dynamic programming); mini-bucket elimination, belief propagation
  - Search: branch and bound and best-first; lower-bounding heuristics; AND/OR search spaces
  - Hybrids of search and inference: cutset decomposition; super-bucket scheme