SampleSearch: A Scheme that Searches for Consistent Samples
Vibhav Gogate and Rina Dechter
University of California, Irvine, USA

Outline
- Background
  - Bayesian networks with zero probabilities
  - Importance sampling
  - The rejection problem
- The SampleSearch scheme
  - Algorithm
  - Sampling distribution and its approximation
- Experimental results

Bayesian Networks: Representation (Pearl, 1988)
[Figure: a five-node network over Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X) and Dyspnoea (D), with CPTs P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B)]
Factorization: P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)
(A) Probability of evidence: P(smoking=no, dyspnoea=yes) = ?
(B) Belief updating: P(lung cancer=yes | smoking=no, dyspnoea=yes) = ?

Complexity
- Belief updating
  - NP-hard when zeros are present
  - General case when all CPTs are positive: not known
  - Relative approximation: randomized polynomial-time algorithm when all CPTs are positive (Dagum and Luby, 1997)
- Probability of evidence
  - NP-hard when zeros are present
  - Relative approximation: randomized polynomial-time algorithm when all CPTs are positive and 1/P(e) is polynomial (Karp, Dagum and Luby, 1993)

Importance Sampling (Rubinstein ’81)
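[The equations on this slide did not survive the transcript. As a hedged reconstruction, the standard unnormalized importance-sampling estimator for the probability of evidence, writing f(x) = P(x, e) for the unnormalized target and Q for the proposal, is:]

```latex
% Estimate Z = P(e) = \sum_x f(x) by drawing x^1, ..., x^N i.i.d. from Q,
% where Q(x) > 0 whenever f(x) > 0:
\[
  \widehat{Z}_N = \frac{1}{N} \sum_{k=1}^{N} w(x^k),
  \qquad
  w(x^k) = \frac{f(x^k)}{Q(x^k)} .
\]
% Unbiased, since E_Q[f(x)/Q(x)] = \sum_x f(x) = Z.
```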

Importance Sampling for Belief Updating
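[Again the slide body is missing from the transcript; for belief updating, the usual normalized (ratio) estimator built from the same weighted samples is presumably what this slide showed:]

```latex
% Posterior marginal of X_i = x_i given evidence e, from weighted samples
% with weights w^k = f(x^k)/Q(x^k):
\[
  \widehat{P}(x_i \mid e) =
  \frac{\sum_{k=1}^{N} w^k \,\mathbf{1}\{x^k_i = x_i\}}
       {\sum_{k=1}^{N} w^k},
\]
% asymptotically unbiased as N grows.
```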

Generating i.i.d. samples from Q
Q(A,B,C) = Q(A) · Q(B|A) · Q(C|A,B)
Q(A) = (0.8, 0.2)
Q(B|A) = (0.4, 0.6, 0.2, 0.8)
Q(C|A,B) = Q(C) = (0.2, 0.8)
[Figure: the full binary assignment tree over A, B, C rooted at Root, one branch per value]
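[A minimal sketch (ours) of the ancestral sampling this slide illustrates, using the slide's parameters and reading the flattened Q(B|A) table row-wise:]

```python
import random

# The slide's proposal Q(A,B,C) = Q(A) * Q(B|A) * Q(C), with its parameters.
Q_A = [0.8, 0.2]                       # Q(A=0), Q(A=1)
Q_B_given_A = {0: [0.4, 0.6],          # Q(B=0|A=0), Q(B=1|A=0)
               1: [0.2, 0.8]}          # Q(B=0|A=1), Q(B=1|A=1)
Q_C = [0.2, 0.8]                       # Q(C=0), Q(C=1); independent of A, B

def draw(dist):
    """Draw an index from a discrete distribution given as a weight list."""
    return random.choices(range(len(dist)), weights=dist)[0]

def sample_Q():
    """One i.i.d. sample from Q, drawn ancestrally in the order A, B, C."""
    a = draw(Q_A)
    b = draw(Q_B_given_A[a])
    c = draw(Q_C)
    return a, b, c

print([sample_Q() for _ in range(5)])
```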

Rejection Problem
- Importance sampling requires: f(x) > 0 => Q(x) > 0. Conversely, Q(x) can be > 0 even where f(x) = 0.
- So if the probability of sampling a solution, ∑_{x : f(x) > 0} Q(x), is very small, a large number of sampled assignments will have zero weight.
- Extreme case: every sample has zero weight and our approximation equals zero.

Rejection Problem
[Figure: the assignment tree over A, B, C; all blue leaves are solutions, i.e. f(x) > 0, and all red leaves are non-solutions, i.e. f(x) = 0]

Constraint Networks (Dechter, 2003)
Example: map coloring
- Variables: countries (A, B, C, etc.)
- Values: colors (red, green, blue)
- Constraints: adjacent countries must take different colors
- A solution is an assignment that satisfies all constraints
[Figure: the map-coloring constraint graph over countries A through G]

Constraint networks to model "zeros"
[Figure: a constraint graph over A, B, C, D, F, G, alongside the CPT P(C|A)]
- The zeros of P(C|A) become constraints: A=0, C=0 not allowed; A=1, C=1 not allowed; equivalently, A ≠ C.
- Why constraints? If a partial sample violates a constraint, then f(X=x) = 0 for every full extension x of that partial sample.
- For every full assignment X=x: a solution implies f(X=x) > 0 and a non-solution implies f(X=x) = 0.
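[A small illustration (ours, not from the slides) of reading a CPT's zero entries off as forbidden partial assignments, i.e. nogoods; only the positions of the zeros come from the slide, the positive entries below are placeholders:]

```python
# P(C|A) with zeros at (A=0,C=0) and (A=1,C=1), as on the slide; the
# positive entries are placeholder values.
P_C_given_A = {(0, 0): 0.0, (0, 1): 1.0,
               (1, 0): 1.0, (1, 1): 0.0}

# Each zero-probability entry becomes a nogood, i.e. a constraint.
nogoods = [{"A": a, "C": c} for (a, c), p in P_C_given_A.items() if p == 0.0]
print(nogoods)  # [{'A': 0, 'C': 0}, {'A': 1, 'C': 1}]  -- equivalently A != C
```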

Using Constraints
Constraints: A ≠ B, A ≠ C
[Figure: the full assignment tree over A, B, C]

Using Constraints
Constraints: A ≠ B, A ≠ C
[Figure: the tree with violating branches pruned; e.g. B=0 under A=0 is marked "constraint A ≠ B violated"]

Outline
- Background
  - Bayesian networks
  - Importance sampling
  - The rejection problem
- The SampleSearch scheme
  - Algorithm
  - Sampling distribution and its approximation
- Experimental results

Algorithm SampleSearch
Constraints: A ≠ B, A ≠ C
[Figure: the assignment tree over A, B, C; sampling starts from the Root]

Algorithm SampleSearch
Constraints: A ≠ B, A ≠ C
[Figure: the first sample is drawn down the tree, one variable at a time]

Algorithm SampleSearch
Constraints: A ≠ B, A ≠ C
[Figure: the violating value is pruned and sampling resumes from the deepest consistent partial assignment]

Algorithm SampleSearch
Constraints: A ≠ B, A ≠ C
[Figure: whenever a sampled value violates a constraint it is pruned and sampling backtracks; this repeats until a solution, i.e. an assignment with f(x) > 0, is found]
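[Pulling the walkthrough together, a minimal sketch (ours, assuming the chronological-backtracking variant these slides depict; helper names are hypothetical): sample each variable from Q, reject and renormalize values that violate a constraint, and backtrack when a variable's values are exhausted, until a full solution with f(x) > 0 is found.]

```python
import random

def sample_search(variables, Q, consistent):
    """One consistent sample (a solution, f(x) > 0) via SampleSearch.

    variables:  ordered list of variable names
    Q:          Q[i](partial) -> {value: prob}, proposal for variables[i]
                given the partial assignment sampled so far
    consistent: partial-assignment dict -> bool, the constraint check
    """
    assignment = {}
    pruned = [set() for _ in variables]   # values proven to be dead ends
    i = 0
    while i < len(variables):
        var = variables[i]
        probs = {v: p for v, p in Q[i](assignment).items()
                 if v not in pruned[i] and p > 0}
        if not probs:                     # all values dead: backtrack
            pruned[i].clear()
            i -= 1
            if i < 0:
                raise ValueError("no solution exists")
            pruned[i].add(assignment.pop(variables[i]))
            continue
        total = sum(probs.values())       # renormalize over surviving values
        value = random.choices(list(probs),
                               weights=[p / total for p in probs.values()])[0]
        assignment[var] = value
        if consistent(assignment):
            i += 1                        # extend the sample
        else:
            pruned[i].add(value)          # reject this value and resample
            del assignment[var]
    return assignment

# The slides' running example: binary A, B, C with constraints A != B, A != C.
variables = ["A", "B", "C"]
Q = [lambda part: {0: 0.8, 1: 0.2},
     lambda part: {0: 0.4, 1: 0.6} if part["A"] == 0 else {0: 0.2, 1: 0.8},
     lambda part: {0: 0.2, 1: 0.8}]

def consistent(part):
    return not (("B" in part and part["A"] == part["B"]) or
                ("C" in part and part["A"] == part["C"]))

print(sample_search(variables, Q, consistent))   # always a solution
```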

Generate more Samples
Constraints: A ≠ B, A ≠ C
[Figure: the full assignment tree; a fresh sample starts again from the Root]

Generate more Samples
Constraints: A ≠ B, A ≠ C
[Figure: a second sample traced down the tree]

Traces of SampleSearch
Constraints: A ≠ B, A ≠ C
[Figure: four possible traces of SampleSearch through the tree; each run ends at a solution but may visit different dead ends along the way]

The Sampling Distribution Q^R of SampleSearch
[Figure: the pruned assignment tree under the constraints A ≠ B, A ≠ C]
- What is the probability of generating A=0? Q^R(A=0) = 0.8. Why? SampleSearch is systematic: both values of A extend to a solution, so once A=0 is sampled it is never retracted.
- What is the probability of generating B=1? Q^R(B=1|A=0) = 1. Why? Again, SampleSearch is systematic: B=1 is the only value consistent with A=0.
- What is the probability of generating B=0? Simple: Q^R(B=0|A=0) = 0. All samples generated by SampleSearch are solutions.
- Did we generate samples from Q? No! We sampled from the backtrack-free distribution Q^R.
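[Stated generally (our restatement of what the slide calls the backtrack-free distribution): at each variable, Q^R renormalizes Q over only those values that can be extended to a solution.]

```latex
% Let S_i denote the set of values x_i such that the partial assignment
% (x_1, ..., x_{i-1}, x_i) can be extended to a full solution (f > 0).
\[
  Q^R(x_i \mid x_1,\ldots,x_{i-1}) =
  \begin{cases}
    \dfrac{Q(x_i \mid x_1,\ldots,x_{i-1})}
          {\sum_{x_i' \in S_i} Q(x_i' \mid x_1,\ldots,x_{i-1})}
      & \text{if } x_i \in S_i, \\[1.5ex]
    0 & \text{otherwise.}
  \end{cases}
\]
% On the example: both values of A extend to a solution, so Q^R(A) = Q(A);
% given A=0 only B=1 does, so Q^R(B=1 | A=0) = 1.
```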

Computing Q^R
- Invoke an oracle or a complete search procedure O(n) times per sample.
[Figure: the sampled path to a solution; each off-path sibling branch is marked "?" because an oracle call is needed to decide whether it leads to a solution]
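[A sketch (ours) of that computation; `extendable` stands in for the oracle or complete search procedure, and the Q interface matches the earlier SampleSearch sketch. The importance weight of the sample is then w(x) = f(x) / Q^R(x).]

```python
def qr_probability(sample, variables, Q, extendable):
    """Compute Q^R(sample) for a solution produced by SampleSearch.

    extendable(partial) -> bool is the oracle: True iff the partial
    assignment can be completed to a solution.  Each variable along the
    sample costs one oracle call per candidate value -- O(n) calls in all
    for binary domains, as the slide notes.
    """
    prob = 1.0
    partial = {}
    for i, var in enumerate(variables):
        probs = Q[i](partial)
        # Renormalization mass: the Q-probability of the extendable values.
        mass = sum(p for v, p in probs.items()
                   if extendable({**partial, var: v}))
        prob *= probs[sample[var]] / mass
        partial[var] = sample[var]
    return prob
```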

Approximation A^R of Q^R
[Figure: the sampled path with off-path branches labeled either "Hole: don't know" or "No solutions here"]
- IF the off-path branch is a hole (its status is unknown) THEN A^R = Q
- IF the off-path branch is known to contain no solutions THEN A^R = 1

Approximation A^R of Q^R
[Figure: several traces over the same sample; holes are marked "?", and the per-variable factors (e.g. 0.8, 1) differ from trace to trace]
Problem: because different traces assign the same sample different approximate probabilities, we can't guarantee convergence.

Guaranteeing convergence in the limit
- Store all traces generated so far.
[Figure: the combined tree of stored traces, with the remaining holes marked "?"]
Approximation A^R_N:
- IF the branch is still a hole THEN A^R_N = Q
- IF there are no solutions on the other branch THEN A^R_N = 1

Improving Naive SampleSearch
- Handling non-binary domains: see the paper; the proof is complicated.
- Better search strategy: can use any state-of-the-art CSP/SAT solver, e.g. MiniSat (Sörensson et al., 2006); all theorems and results still hold.
- Better importance function: use the output of generalized belief propagation to compute the initial importance function Q (Gogate and Dechter, 2005).

Experimental Results
- Previous algorithms:
  - Likelihood weighting (LW): proposal = prior
  - IJGP-sampling (IJGP-S) (Gogate and Dechter, 2005): proposal = output of generalized belief propagation
- Adding SampleSearch:
  - SampleSearch with LW (S+LW)
  - SampleSearch with IJGP-sampling (S+IJGP-S)

Linkage networks BN_69, BN_73 and BN_76
[Figures: results plots for these three linkage instances, comparing the four schemes above; the plots themselves are not preserved in the transcript]

Conclusions
- Belief networks with zero probabilities lead to the rejection problem in importance sampling.
- We presented SampleSearch, a scheme that works with any importance sampling method to circumvent the rejection problem.
- The sampling distribution of SampleSearch is the backtrack-free distribution Q^R.
  - Q^R is expensive to compute exactly.
  - An approximation of Q^R based on storing all traces yields an asymptotically unbiased estimator.
- Empirically, when a substantial number of zero probabilities are present, SampleSearch-based schemes dominate their pure-sampling counterparts.