Inference

Overview
- The MC-SAT algorithm
- Knowledge-based model construction
- Lazy inference
- Lifted inference

MCMC: Gibbs Sampling
state ← random truth assignment
for i ← 1 to num-samples do
    for each variable x
        sample x according to P(x | neighbors(x))
        state ← state with new value of x
P(F) ← fraction of states in which F is true
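
A minimal Python sketch of the Gibbs loop above. The helpers conditional_prob(x, state) (the probability that x is true given its Markov blanket) and query(state) are assumed for illustration; they are not part of the slides.

import random

def gibbs(variables, conditional_prob, query, num_samples):
    # state: random truth assignment to all ground atoms
    state = {x: random.random() < 0.5 for x in variables}
    hits = 0
    for _ in range(num_samples):
        for x in variables:
            p_true = conditional_prob(x, state)      # P(x | neighbors(x))
            state[x] = random.random() < p_true      # resample x given its Markov blanket
        hits += 1 if query(state) else 0             # does formula F hold in this state?
    return hits / num_samples                        # fraction of states in which F is true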

But … Insufficient for Logic
Problem:
- Deterministic dependencies break MCMC
- Near-deterministic ones make it very slow
Solution: combine MCMC and WalkSAT → the MC-SAT algorithm

The MC-SAT Algorithm
- MC-SAT = MCMC + SAT
- MCMC: slice sampling with an auxiliary variable for each clause
- SAT: wraps around SampleSAT (a uniform sampler) to sample from highly non-uniform distributions
- Sound: satisfies ergodicity & detailed balance
- Efficient: orders of magnitude faster than Gibbs and other MCMC algorithms

Auxiliary-Variable Methods
Main ideas:
- Use auxiliary variables to capture dependencies
- Turn difficult sampling into uniform sampling
Given a distribution P(x), define a joint f(x,u) whose marginal over u is P(x); sample from f(x,u), then discard u

Slice Sampling [Damien et al. 1999]
[Figure: one slice-sampling step under the curve P(x). From the current point x(k), sample u(k) uniformly from [0, P(x(k))]; the "slice" is the set of points x with P(x) ≥ u(k); x(k+1) is then sampled uniformly from the slice.]
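
A toy Python sketch of one-dimensional slice sampling, assuming an unnormalized density p with known bounded support [lo, hi]; real implementations use stepping-out and shrinkage [Neal 2003] instead of the naive rejection loop below.

import random

def slice_step(x, p, lo, hi):
    # Auxiliary variable: a uniform height under the curve at the current point
    u = random.uniform(0.0, p(x))
    # Sample uniformly from the slice {x' : p(x') >= u} by rejection on [lo, hi]
    while True:
        x_new = random.uniform(lo, hi)
        if p(x_new) >= u:
            return x_new

def slice_sample(p, x0, lo, hi, n):
    xs, x = [], x0
    for _ in range(n):
        x = slice_step(x, p, lo, hi)
        xs.append(x)
    return xs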

Slice Sampling
- Identifying the slice may be difficult
- Introduce an auxiliary variable u_i for each potential Φ_i

The MC-SAT Algorithm
- Approximate inference for Markov logic
- Use slice sampling in MCMC
- Auxiliary variable u_i for each clause C_i:
  - C_i unsatisfied: 0 ≤ u_i ≤ 1, so exp(w_i f_i(x)) ≥ u_i for any next state x
  - C_i satisfied: 0 ≤ u_i ≤ exp(w_i); with probability 1 – exp(–w_i), the next state x must satisfy C_i to ensure that exp(w_i f_i(x)) ≥ u_i
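
Filling in the step behind the 1 – exp(–w_i) probability (a standard derivation, not verbatim from the slide): from the current state x(k), sample u_i ~ Uniform[0, exp(w_i f_i(x(k)))] for every clause C_i, then draw x(k+1) uniformly from { x : exp(w_i f_i(x)) ≥ u_i for all i }. If C_i is satisfied in x(k), then u_i ~ Uniform[0, exp(w_i)], so P(u_i > 1) = (exp(w_i) – 1) / exp(w_i) = 1 – exp(–w_i); whenever u_i > 1, only states in which C_i remains satisfied can meet exp(w_i f_i(x)) ≥ u_i.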

The MC-SAT Algorithm
- Select a random subset M of the satisfied clauses
- Larger w_i ⇒ C_i more likely to be selected
- Hard clause (w_i → ∞): always selected
- Slice = states that satisfy all clauses in M
- Sample uniformly from these

The MC-SAT Algorithm
X(0) ← a random solution satisfying all hard clauses
for k ← 1 to num_samples
    M ← Ø
    forall C_i satisfied by X(k–1)
        with probability 1 – exp(–w_i) add C_i to M
    endfor
    X(k) ← a uniformly random solution satisfying M
endfor
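
A compact Python sketch of this loop. uniform_sat_sample(subset) stands in for SampleSAT (it should return an assignment satisfying every clause in subset) and is_satisfied(clause, x) tests one clause; both are assumed helpers, not given in the slides.

import math
import random

def mc_sat(clauses, weights, hard, uniform_sat_sample, is_satisfied, num_samples):
    # X(0): a random solution satisfying all hard clauses
    x = uniform_sat_sample([clauses[i] for i in hard])
    samples = []
    for _ in range(num_samples):
        m = []
        for i, c in enumerate(clauses):
            if is_satisfied(c, x):
                keep = 1.0 if i in hard else 1.0 - math.exp(-weights[i])
                if random.random() < keep:        # add C_i with prob 1 - exp(-w_i)
                    m.append(c)
        x = uniform_sat_sample(m)                  # X(k): uniform over states satisfying M
        samples.append(x)
    return samples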

The MC-SAT Algorithm
- Sound: satisfies ergodicity and detailed balance (assuming we have a perfect uniform sampler)
- Approximately uniform sampler [Wei et al. 2004]: SampleSAT = WalkSAT + simulated annealing
- WalkSAT: finds a solution very efficiently, but may be highly non-uniform
- Simulated annealing: uniform sampling as temperature → 0, but very slow to reach a solution
- Trade off uniformity vs. efficiency by tuning the probability of WalkSAT steps vs. simulated-annealing steps

Combinatorial Explosion
Problem: if there are n constants and the highest clause arity is c, the ground network requires O(n^c) memory (and inference time grows in proportion)
Solutions:
- Knowledge-based model construction
- Lazy inference
- Lifted inference

Knowledge-Based Model Construction
Basic idea: most of the ground network may be unnecessary, because the evidence renders the query independent of it
Assumption: evidence is a conjunction of ground atoms
Knowledge-based model construction (KBMC):
- First construct the minimum subset of the network needed to answer the query (a generalization of KBMC)
- Then apply MC-SAT (or another inference algorithm)

Ground Network Construction
network ← Ø
queue ← query nodes
repeat
    node ← front(queue)
    remove node from queue
    add node to network
    if node not in evidence then
        add neighbors(node) to queue
until queue = Ø
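
A small Python sketch of this breadth-first construction. neighbors(node) (the node's Markov-blanket neighbors in the ground network) is an assumed helper; a seen set is added so atoms are not queued twice, which the slide pseudocode leaves implicit.

from collections import deque

def construct_network(query_nodes, evidence, neighbors):
    # Returns the subset of ground atoms needed to answer the query
    network = set()
    queue = deque(query_nodes)
    seen = set(query_nodes)
    while queue:
        node = queue.popleft()
        network.add(node)
        if node not in evidence:              # evidence atoms cut off the expansion
            for n in neighbors(node):
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
    return network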

Example Grounding
P(Cancer(B) | Smokes(A), Friends(A,B), Friends(B,A))
[Figure, built up incrementally over several slides: the ground network over Cancer(A), Cancer(B), Smokes(A), Smokes(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B); the construction adds only the atoms needed to answer the query.]

Lazy Inference
- Most domains are extremely sparse: most ground atoms are false, so most clauses are trivially satisfied
- We can exploit this by having a default state for atoms and clauses, and grounding only the atoms and clauses with non-default states
- Typically reduces memory (and time) by many orders of magnitude

Example: Scientific Research
Author(person,paper)
Author(person1,paper) ∧ Author(person2,paper) ⇒ Coauthor(person1,person2)
With 1,000 papers and 100 authors:
- 100,000 possible Author groundings (100 × 1,000) … but only a few thousand are true
- 10 million possible clause groundings (100 × 100 × 1,000) … but only tens of thousands are unsatisfied

Lazy Inference
Here: LazySAT (a lazy version of WalkSAT)
The method is applicable to many other algorithms (including MC-SAT)

Naïve Approach
Create the groundings and keep in memory:
- True atoms
- Unsatisfied clauses
Memory cost is O(# unsatisfied clauses)
Problem: we need to go to the KB for each flip, which is too slow!
Solution idea: keep more things in memory
- A list of active atoms
- Potentially unsatisfied clauses (active clauses)

LazySAT: Definitions
An atom is an active atom if:
- It is in the initial set of active atoms, or
- It was flipped at some point during the search
A clause is an active clause if it can be made unsatisfied by flipping zero or more active atoms in it

LazySAT: The Basics
- Activate all the atoms appearing in clauses unsatisfied by the evidence DB, and create the corresponding clauses
- Randomly assign truth values to all active atoms
- Activate an atom when it is flipped (if not already active), potentially activating additional clauses
- No need to go to the KB to calculate the change in cost for flipping an already-active atom

LazySAT
for i ← 1 to max-tries do
    active_atoms ← atoms in clauses unsatisfied by DB
    active_clauses ← clauses activated by active_atoms
    soln ← random truth assignment to active_atoms
    for j ← 1 to max-flips do
        if ∑ weights(sat. clauses) ≥ threshold then return soln
        c ← random unsatisfied clause
        with probability p
            v_f ← a randomly chosen variable from c
        else
            for each variable v in c do
                compute DeltaGain(v), using weighted_KB if v ∉ active_atoms
            v_f ← v with highest DeltaGain(v)
        if v_f ∉ active_atoms then
            activate v_f and add clauses activated by v_f
        soln ← soln with v_f flipped
return failure, best soln found
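
An illustrative Python sketch of the lazy bookkeeping around a flip. clauses_containing(atom) (a lookup into the weighted KB) is an assumed helper, and the surrounding WalkSAT-style search loop is omitted; this sketches the idea, not the Alchemy implementation.

def activate(atom, active_atoms, active_clauses, clauses_containing):
    # Activate an atom and the clauses that can now become unsatisfied
    if atom in active_atoms:
        return
    active_atoms.add(atom)
    for clause in clauses_containing(atom):    # fetched lazily from the KB
        active_clauses.add(clause)

def flip(atom, soln, active_atoms, active_clauses, clauses_containing):
    # Flip an atom, activating it (and its clauses) the first time it is touched
    activate(atom, active_atoms, active_clauses, clauses_containing)
    soln[atom] = not soln.get(atom, False)     # inactive atoms default to False
    return soln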

Example
Active atoms: Coa(A,B) = False, Coa(A,A) = True (after a flip), Coa(A,C) = False, …
Active clauses:
- Coa(C,A) ∧ Coa(A,A) ⇒ Coa(C,A)
- Coa(A,B) ∧ Coa(B,C) ⇒ Coa(A,C)
- Coa(C,B) ∧ Coa(B,B) ⇒ Coa(C,B)
- Coa(C,A) ∧ Coa(A,B) ⇒ Coa(C,B)
- Coa(C,B) ∧ Coa(B,A) ⇒ Coa(C,A)

LazySAT Performance
- Solution quality: performs the same sequence of flips, so same result as WalkSAT
- Memory cost: O(# potentially unsatisfied clauses)
- Time cost: much lower initialization cost; the cost of creating active clauses is amortized over many flips

Lifted Inference
- We can do inference in first-order logic without grounding the KB (e.g., resolution)
- Let's do the same for inference in MLNs: group atoms and clauses into "indistinguishable" sets and do inference over those
- First approach: lifted variable elimination (not practical)
- Here: lifted belief propagation

Belief Propagation
[Figure: factor graph with variable nodes (x) and feature nodes (f); BP passes messages along the edges between them.]
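
For reference, the standard factor-graph BP updates (textbook form, not quoted from the slide):
μ_{x→f}(a) = ∏_{h ∈ nb(x) \ {f}} μ_{h→x}(a)
μ_{f→x}(a) = ∑ over assignments to nb(f) with x = a of φ_f(assignment) ∏_{y ∈ nb(f) \ {x}} μ_{y→f}(y)
Lifted BP runs the same updates on supernodes and superfeatures, with messages raised to powers given by the edge counts (how many ground edges each lifted edge stands for).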

Lifted Belief Propagation
[Figure: the same factor graph with ground atoms merged into supernodes and ground clauses merged into superfeatures; the quantities attached to the lifted edges are functions of edge counts.]

Lifted Belief Propagation
- Form a lifted network composed of supernodes and superfeatures
  - Supernode: set of ground atoms that all send and receive the same messages throughout BP
  - Superfeature: set of ground clauses that all send and receive the same messages throughout BP
- Run belief propagation on the lifted network
- Guaranteed to produce the same results as ground BP
- Time and memory savings can be huge

Forming the Lifted Network
1. Form initial supernodes: one per predicate and truth value (true, false, unknown)
2. Form superfeatures by doing joins of their supernodes
3. Form supernodes by projecting superfeatures down to their predicates
   (Supernode = groundings of a predicate with the same number of projections from each superfeature)
4. Repeat until convergence
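
A rough Python sketch of this refinement loop. The data structures and helper names (atoms_of, truth_value) are my own, atoms are assumed to be strings like "Smokes(Ana)", and the join step is simplified to grouping clause groundings by the supernode labels of their atoms.

from collections import defaultdict

def form_lifted_network(atoms, clause_groundings, atoms_of, truth_value, max_iters=100):
    # Step 1: initial supernodes, one per (predicate, truth value)
    label = {a: (a.split('(')[0], truth_value(a)) for a in atoms}
    for _ in range(max_iters):
        # Step 2: superfeatures = clause groundings grouped by their atoms' supernode labels
        feat_label = {c: tuple(label[a] for a in atoms_of(c)) for c in clause_groundings}
        # Step 3: project superfeatures onto atoms: count how many groundings of each
        # superfeature every atom appears in
        counts = defaultdict(lambda: defaultdict(int))
        for c in clause_groundings:
            for a in atoms_of(c):
                counts[a][feat_label[c]] += 1
        new_label = {a: (label[a], tuple(sorted(counts[a].items()))) for a in atoms}
        if len(set(new_label.values())) == len(set(label.values())):
            break                              # partition stopped refining: converged
        label = new_label
    return label                               # atoms with equal labels form a supernode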

Example
Evidence: Smokes(Ana), Friends(Bob,Charles), Friends(Charles,Bob)
N people in the domain

Example
Evidence: Smokes(Ana), Friends(Bob,Charles), Friends(Charles,Bob)
Intuitive grouping of the Smokes atoms: Smokes(Ana) (evidence) vs. Smokes(Bob), Smokes(Charles), Smokes(James), Smokes(Harry), …

Initialization
Supernodes:
- Smokes(Ana)
- Smokes(X), X ≠ Ana
- Friends(Bob,Charles), Friends(Charles,Bob)
- Friends(Ana,X); Friends(X,Ana); Friends(Bob,X), X ≠ Charles; …
Superfeatures: none yet

Joining the Supernodes
Superfeatures, formed by joining the supernodes on each clause (built up one join at a time in the original slides):
- Smokes(Ana) ∧ Friends(Ana,X) ⇒ Smokes(X), X ≠ Ana
- Smokes(X) ∧ Friends(X,Ana) ⇒ Smokes(Ana), X ≠ Ana
- Smokes(Bob) ∧ Friends(Bob,Charles) ⇒ Smokes(Charles), …
- Smokes(Bob) ∧ Friends(Bob,X) ⇒ Smokes(X), X ≠ Charles, …
Supernodes: unchanged from initialization

Projecting the Superfeatures
Each supernode is split according to how many projections it receives from each superfeature (the counts in the original table are values such as N-1, N-3, 0, 1, N-2, …). By symmetry, Smokes(Bob) and Smokes(Charles) receive the same counts, so the Smokes supernodes become:
- Smokes(Ana)
- Smokes(Bob), Smokes(Charles)
- Smokes(X), X ∉ {Ana, Bob, Charles}
Superfeatures (as before):
- Smokes(Ana) ∧ Friends(Ana,X) ⇒ Smokes(X), X ≠ Ana
- Smokes(X) ∧ Friends(X,Ana) ⇒ Smokes(Ana), X ≠ Ana
- Smokes(Bob) ∧ Friends(Bob,Charles) ⇒ Smokes(Charles), …
- Smokes(Bob) ∧ Friends(Bob,X) ⇒ Smokes(X), X ≠ Charles, …

Theorem
- There exists a unique minimal lifted network
- The lifted network construction algorithm finds it
- BP on the lifted network gives the same results as on the ground network

Representing Supernodes and Superfeatures
- List of tuples: simple but inefficient
- Resolution-like: use equality and inequality constraints
- Form clusters (work in progress)

Open Questions
- Can we do approximate KBMC / lazy inference / lifting?
- Can KBMC, lazy, and lifted inference be combined?
- Can we have lifted inference over both probabilistic and deterministic dependencies? (Lifted MC-SAT?)
- Can we unify resolution and lifted BP?
- Can other inference algorithms be lifted?