MPE, MAP AND APPROXIMATIONS
Lecture 10: Statistical Methods in AI/ML
Vibhav Gogate, The University of Texas at Dallas
Readings: AD Chapter 10.

Recap: AND/OR Search
[Figure: a pairwise Markov network over variables A, B, C, D (domains {0,1} and {0,1,2}), its pseudo tree, and the corresponding AND/OR search tree and AND/OR search graph.]

Recap: Formula-based Probabilistic Inference
- Convert the graphical model into weighted logic.
- Perform AND/OR search in weighted logic with Boolean constraint propagation.
- Delete all formulas that are valid or unsatisfiable.
- In practice, complexity is often much lower than the worst-case complexity of exp(w*), where w* is the treewidth.
- Formula-based inference = AND/OR search over formula assignments rather than variable assignments.

What we will cover
- MPE = most probable explanation: the tuple with the highest probability in the joint distribution Pr(X | e).
- MAP = maximum a posteriori: given a subset of variables Y, the tuple with the highest probability in the distribution Pr(Y | e).
- Exact algorithms: variable elimination; depth-first branch and bound search.
- Approximations: upper bounds; local search.
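
To make the two queries precise, here is the standard formalization (notation mine, consistent with the slides):

```latex
% MPE: the most likely assignment to ALL unobserved variables X given evidence e
\mathrm{MPE}(e) \;=\; \operatorname*{argmax}_{x} \Pr(x \mid e)

% MAP: the most likely assignment to a SUBSET Y of the variables;
% the remaining (non-MAP) variables Z are summed out
\mathrm{MAP}(Y, e) \;=\; \operatorname*{argmax}_{y} \Pr(y \mid e)
                \;=\; \operatorname*{argmax}_{y} \sum_{z} \Pr(y, z \mid e)
```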

Running Example: Cheating in the UTD CS Population
- Variables: Sex (S), Cheating (C), Tests (T1 and T2), and Agreement (A).

Most Likely Instantiations
- A person takes the tests and the test administrator says that the two tests agree (A = true).
- What is the most likely group that the individual belongs to?
- Query: the most likely instantiation of Sex and Cheating given evidence A = true.
- Is this a MAP or an MPE problem?
- Answer: Sex = male and Cheating = no.

MPE is a special case of MAP
- The most likely instantiation of all variables given A = true: S = female, C = no, T1 = negative, and T2 = negative.
- The MPE projected onto the MAP variables does not yield the correct answer: S = female, C = no is incorrect; S = male, C = no is correct!
- We will therefore distinguish between MPE and MAP probabilities, and between MPE and MAP instantiations.
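
Why projection fails, in one line (notation mine): a single completion (y, z) can have the highest joint probability even though a different y has more total mass once z is summed out.

```latex
\Big( \operatorname*{argmax}_{y,z} \Pr(y, z \mid e) \Big)\Big|_{Y}
  \;\neq\; \operatorname*{argmax}_{y} \sum_{z} \Pr(y, z \mid e)
  \quad \text{in general.}
```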

Bucket Elimination for MPE
- Same schematic algorithm as before; replace the elimination (summation) operator by a maximization operator.
- To maximize S out of a factor: collect all instantiations that agree on all variables other than S, and return the maximum value among them (sketched in code below).

  S       C    Value            C    Value
  male    yes  0.05             yes  max(0.05, 0.01) = 0.05
  male    no   0.95      =>     no   max(0.95, 0.99) = 0.99
  female  yes  0.01
  female  no   0.99
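
A minimal sketch of the "maximize out a variable" operation on a table-based factor (the (scope, table) representation is my own, not the lecture's):

```python
# A factor is a pair (scope, table): scope is a tuple of variable names and
# table maps each tuple of values (one per scope variable) to a number.

def max_out(scope, table, var):
    """Maximize `var` out of a factor, returning a smaller factor."""
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    for values, p in table.items():
        key = values[:idx] + values[idx + 1:]            # drop var's value
        new_table[key] = max(new_table.get(key, 0.0), p)
    return new_scope, new_table

# The table from the slide: maximizing out S keeps, for each value of C,
# the best value over S.
scope = ("S", "C")
table = {("male", "yes"): 0.05, ("male", "no"): 0.95,
         ("female", "yes"): 0.01, ("female", "no"): 0.99}
print(max_out(scope, table, "S"))
# (('C',), {('yes',): 0.05, ('no',): 0.99})
```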

Bucket elimination: order (S, C, T1, T2)
[Figure: the buckets for S, C, T1 and T2 and the messages passed between them; the final value is the MPE probability.]

Bucket elimination: Recovering the MPE tuple
[Figure: the same buckets processed in reverse, assigning each variable its maximizing value to recover the MPE tuple.]

Bucket elimination: MPE vs. PE (Z)
- Maximization vs. summation.
- Complexity: the same for both. Time and space are exponential in the induced width w* of the given order: O(n exp(w* + 1)).

BE and Hidden Markov Models
- BE_MPE in the order S1, S2, S3, ... is equivalent to the Viterbi algorithm.
- BE_PE in the order S1, S2, S3, ... is equivalent to the forward algorithm.
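
The max/sum correspondence is easy to see in code. A minimal sketch for an HMM with transition matrix T, emission matrix E, and initial distribution pi (all names mine, not from the lecture); swapping max for sum turns Viterbi into the forward algorithm:

```python
def hmm_recursion(obs, pi, T, E, combine):
    """One-pass HMM recursion: combine=max gives Viterbi (MPE probability),
    combine=sum gives the forward algorithm (probability of evidence)."""
    n_states = len(pi)
    # alpha[s] = best (or total) probability of state paths ending in s
    alpha = [pi[s] * E[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [combine(alpha[r] * T[r][s] for r in range(n_states)) * E[s][o]
                 for s in range(n_states)]
    return combine(alpha)

# Toy 2-state HMM with observations encoded as 0/1.
pi = [0.6, 0.4]
T = [[0.7, 0.3], [0.4, 0.6]]
E = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 0]
print(hmm_recursion(obs, pi, T, E, max))  # Viterbi: MPE probability
print(hmm_recursion(obs, pi, T, E, sum))  # Forward: Pr(evidence)
```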

OR search for MPE
- At the leaf nodes, compute probabilities by taking the product of the factors.
- Select the path with the highest leaf probability.

Branch and Bound Search
- An oracle gives you an upper bound on the MPE value under each node.
- Prune nodes whose upper bound is smaller than the value of the current best MPE solution (see the sketch below).
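
A minimal depth-first branch-and-bound sketch over variable assignments. The upper-bound oracle is left abstract here (the lecture instantiates it with mini-bucket elimination below); all names are illustrative:

```python
def branch_and_bound(variables, domains, score, upper_bound):
    """Depth-first search for the MPE assignment.

    score(assignment):       product of factors for a FULL assignment
    upper_bound(assignment): oracle upper bound on the best completion
                             of a PARTIAL assignment
    """
    best = {"value": 0.0, "assignment": None}

    def dfs(assignment, depth):
        if depth == len(variables):              # leaf: full assignment
            v = score(assignment)
            if v > best["value"]:
                best["value"], best["assignment"] = v, dict(assignment)
            return
        var = variables[depth]
        for val in domains[var]:
            assignment[var] = val
            # Prune unless some completion could beat the incumbent.
            if upper_bound(assignment) > best["value"]:
                dfs(assignment, depth + 1)
            del assignment[var]

    dfs({}, 0)
    return best["value"], best["assignment"]
```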

Mini-Bucket Approximation: Idea
- Split a bucket into mini-buckets => bound the complexity.
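
The upper bound rests on a standard inequality: maximizing each mini-bucket separately can only overestimate the exact max-product (notation mine).

```latex
% Splitting X's bucket {f_1, ..., f_m} into mini-buckets Q_1, ..., Q_p:
\max_{X} \prod_{k=1}^{m} f_k
  \;\le\; \prod_{j=1}^{p} \Big( \max_{X} \prod_{f \in Q_j} f \Big)
```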

Mini-Bucket Elimination (max mini-bucket size = 3 variables)
[Figure: mini-bucket trace on the running example over S, C, T1, T2; the result is an upper bound on the MPE probability.]

Mini-buckets (i-bounds)
- A parameter i controls the size of (the number of variables in) each mini-bucket.
- The algorithm is exponential in i: O(n exp(i)). For example, i = 2 is quadratic, i = 3 is cubic, etc.
- The higher the i-bound, the better the upper bound.
- In practice, can use i-bounds as high as ...
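
A minimal sketch of the bucket-splitting step under an i-bound (greedy first-fit partitioning; the heuristic is my own choice, not necessarily the lecture's):

```python
def split_bucket(factors, i_bound):
    """Greedily partition a bucket's factors into mini-buckets whose
    combined scope has at most i_bound variables.

    factors: list of (scope, table) pairs; scope is a frozenset of variables.
    Returns a list of mini-buckets, each a list of factors.
    """
    scopes, buckets = [], []        # parallel lists: combined scope, factors
    for scope, table in factors:
        for k, s in enumerate(scopes):
            if len(s | scope) <= i_bound:   # factor fits in this mini-bucket
                scopes[k] = s | scope
                buckets[k].append((scope, table))
                break
        else:                               # no fit: open a new mini-bucket
            scopes.append(frozenset(scope))
            buckets.append([(scope, table)])
    return buckets
```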

Branch and Bound Search
- Oracle = MBE(i) at each search node.
- Prune nodes whose upper bound is smaller than the value of the current best MPE solution.

Computing MAP Probabilities: Bucket Elimination
- Given MAP variables M, we can compute the MAP probability using bucket elimination by first summing out all non-MAP variables, and then maximizing out the MAP variables (sketched below).
- By summing out the non-MAP variables we are effectively computing the joint marginal Pr(M, e) in factored form.
- By maximizing out the MAP variables M, we are effectively solving an MPE problem over the resulting marginal.
- The variable order used in BE_MAP is constrained: it requires the MAP variables M to appear last in the order.
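
A minimal sketch of the two-phase elimination, reusing max_out from above plus an analogous sum_out. For simplicity it operates on a single joint factor; real BE works bucket by bucket on the factored form (again a toy of my own, not the lecture's implementation):

```python
def sum_out(scope, table, var):
    """Sum `var` out of a table-based factor (same layout as max_out)."""
    idx = scope.index(var)
    new_scope = scope[:idx] + scope[idx + 1:]
    new_table = {}
    for values, p in table.items():
        key = values[:idx] + values[idx + 1:]
        new_table[key] = new_table.get(key, 0.0) + p
    return new_scope, new_table

def map_value(scope, joint, map_vars):
    """MAP probability: sum out the non-MAP variables first, then
    maximize out the MAP variables."""
    for var in [v for v in scope if v not in map_vars]:
        scope, joint = sum_out(scope, joint, var)
    for var in list(scope):
        scope, joint = max_out(scope, joint, var)
    return joint[()]        # scalar left after all eliminations
```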

MAP and constrained width
- Treewidth = 2; MAP variables M = {Y1, ..., Yn}.
- Any order in which the MAP variables M come last (as BE_MAP requires) has width greater than or equal to n.
- Hence BE_MPE is linear here while BE_MAP is exponential.

MAP and constrained width (continued)
[Figure: the width-2 network and the constrained elimination order illustrating the blow-up.]

MAP by Branch and Bound Search
- MAP can be solved using depth-first branch-and-bound search, just as we did for MPE.
- Algorithm BB_MAP resembles the one for computing MPE, with two exceptions:
  - Exception 1: the search space consists only of the MAP variables.
  - Exception 2: we use a version of MBE_MAP for computing the bounds, ordering all MAP variables after the non-MAP variables.

MAP by Local Search
- Given a network with n variables and an elimination order of width w, the complexity is O(r n exp(w + 1)), where r is the number of local search steps.
- Start with an initial random instantiation of the MAP variables.
- Neighbors of an instantiation m are the instantiations that result from changing the value of one variable in m.
- Score of a neighbor m: Pr(m, e).
- How do we compute Pr(m, e)? Bucket elimination.
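
A minimal stochastic hill-climbing sketch for MAP (a greedy variant; the lecture's exact algorithm may differ, and `score` stands in for a bucket-elimination call computing Pr(m, e)):

```python
import random

def map_local_search(map_vars, domains, score, steps=100):
    """Greedy local search over instantiations of the MAP variables.

    score(m): returns Pr(m, e); in the lecture this is computed by
    bucket elimination with the MAP variables m fixed as evidence.
    """
    # Random initial instantiation of the MAP variables.
    m = {v: random.choice(domains[v]) for v in map_vars}
    best = score(m)
    for _ in range(steps):
        improved = False
        for v in map_vars:                  # try all single-variable flips
            for val in domains[v]:
                if val == m[v]:
                    continue
                old, m[v] = m[v], val
                s = score(m)
                if s > best:                # keep the improving move
                    best, improved = s, True
                else:
                    m[v] = old              # undo
        if not improved:                    # local optimum reached
            break
    return best, m
```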

MAP: Local search algorithm
[Figure: pseudocode for the local search procedure.]

Recap
- Exact MPE and MAP: bucket elimination; branch and bound search.
- Approximations: mini-bucket elimination; branch and bound search; local search.