Lifted First-Order Probabilistic Inference [de Salvo Braz, Amir, and Roth, 2005] Daniel Lowd 5/11/2005.


Key Ideas
- Do exact inference at the first-order level, rather than grounding out the network.
- When we have no evidence concerning many objects, we can treat them identically.
- Allows answering queries whose cost depends primarily on the size of the domain rather than on the full set of groundings.

Background: variable elimination
Key idea: compute an exact marginal probability by iteratively summing out variables.
Example: we want to compute P(A, C) in a Markov network over the variables A, B, C, D (network diagram omitted).

Background: variable elimination (cont.)
1. Distribute factors across the sums.
2. Sum out D.
3. Sum out B.
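The equations for these steps did not survive the transcript. As a sketch, assuming the network factorizes into pairwise potentials along the chain A-B-D-C (the talk's actual factorization may differ), the derivation reads:

```latex
\begin{align*}
P(A,C) &\propto \sum_{B}\sum_{D} \phi_{AB}(A,B)\,\phi_{BD}(B,D)\,\phi_{DC}(D,C) \\
  &= \sum_{B} \phi_{AB}(A,B) \sum_{D} \phi_{BD}(B,D)\,\phi_{DC}(D,C)
     && \text{(1. distribute across sums)} \\
  &= \sum_{B} \phi_{AB}(A,B)\,\psi_{1}(B,C)
     && \text{(2. sum out } D\text{)} \\
  &= \psi_{2}(A,C)
     && \text{(3. sum out } B\text{)}
\end{align*}
```

Each elimination replaces a sum over one variable by a new intermediate factor (here ψ1, ψ2); this is the pattern that the lifted algorithm generalizes.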

First-order variable elimination
Instead of ordinary factors over ground random variables, use parameterized factors, or parfactors (φ, A, C):
- φ: potential function
- A: set of atoms (may be parameterized by logical variables)
- C: set of constraints on groundings of the atoms in A

Example parfactor
Given the MLN clause {w, Friends(A,B) => Friends(B,A)}, one parfactor might be:
- φ: 0.7 if Friends(A,B) => Friends(B,A) is true, 0.3 otherwise
- A: Friends(A,B), Friends(B,A)
- C: A != bob
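A minimal sketch of how such a parfactor could be represented in code; the Parfactor class and its fields are hypothetical illustrations, not the paper's implementation.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Tuple

@dataclass(frozen=True)
class Parfactor:
    atoms: Tuple[str, ...]         # e.g. ("Friends(A,B)", "Friends(B,A)"), parameterized by logical variables
    logical_vars: Tuple[str, ...]  # e.g. ("A", "B")
    constraint: Callable           # substitution dict -> bool, e.g. A != bob
    potential: Callable            # truth values of the atoms -> non-negative weight

    def groundings(self, domain):
        """Yield substitutions of domain constants for the logical variables that satisfy the constraint."""
        for values in product(domain, repeat=len(self.logical_vars)):
            sub = dict(zip(self.logical_vars, values))
            if self.constraint(sub):
                yield sub

# The parfactor from the slide: potential 0.7 when the clause holds, 0.3 otherwise, constrained to A != bob.
g = Parfactor(
    atoms=("Friends(A,B)", "Friends(B,A)"),
    logical_vars=("A", "B"),
    constraint=lambda s: s["A"] != "bob",
    potential=lambda f_ab, f_ba: 0.7 if (not f_ab or f_ba) else 0.3,
)

for sub in g.groundings(["anna", "bob", "carl"]):
    print(sub)  # every substitution with A != bob
```

The joint distribution below is then a product of potential values, one per substitution yielded by groundings().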

Definitions
- Logical variable: a predicate parameter (e.g., in Friends(A, B), A and B are logical variables). Notation: LV(S) denotes the logical variables in S.
- Random variable: the value of a ground functor application (e.g., Friends(anna, bob) is a random variable in the Friends/Smokes domain). Notation: RV(S) denotes the random variables in S.

Joint distribution
We can represent any MLN as a set of parfactors G. The joint probability of a world (an assignment to all random variables, i.e., all predicate truth values) is the normalized product, over all parfactors and over each parfactor's groundings, of the potential applied to the resulting ground atoms.
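Written out, with Θ_g denoting the substitutions for g's logical variables that satisfy its constraints (this matches the labels on the slide: a product over parfactors, a product over groundings, and the potential of the ground atoms):

```latex
P\big(\mathrm{RV}(G)\big) \;=\; \frac{1}{Z}\;
\underbrace{\prod_{g \in G}}_{\text{parfactors}}\;
\underbrace{\prod_{\theta \in \Theta_g}}_{\text{groundings}}\;
\underbrace{\phi_g\!\big(A_g\theta\big)}_{\text{potential of ground atoms}}
```

where Z normalizes over all worlds.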

Query probability (slide marked "TODO – fix, clarify" by the author)
To answer a query, the non-query random variables are summed out a set at a time:
1. Split apart the summation, isolating the random variables being eliminated.
2. Push the factors that do not mention those variables out in front of the summation.
3. Substitute the remaining inner sum with a new parfactor g'.
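A sketch of what that step looks like in the notation above (the precise conditions under which the substitution is valid are in the paper). Let E be the set of atoms being eliminated and G_E the parfactors that mention them:

```latex
\sum_{\mathrm{RV}(E)} \prod_{g \in G} \prod_{\theta \in \Theta_g} \phi_g(A_g\theta)
\;=\;
\Bigg(\prod_{g \in G \setminus G_E} \prod_{\theta \in \Theta_g} \phi_g(A_g\theta)\Bigg)
\cdot
\underbrace{\sum_{\mathrm{RV}(E)} \prod_{g \in G_E} \prod_{\theta \in \Theta_g} \phi_g(A_g\theta)}_{\text{replaced by the new parfactor } g'}
```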

How to find g’? Inversion elimination Counting elimination Complexity is independent of number of groundings Not always applicable Counting elimination Always applicable Requires computing multinomial distributions (potentially large factorials)

Shattering
Using unification, split parfactors as necessary to ensure two conditions.
First condition: for every atom pair (p, q) in G, RV(p) and RV(q) are either identical or disjoint.
- Good: Friends(anna, B) and Friends(bob, B)
- Bad: Friends(anna, B) and Friends(A, bob)
Incomplete overlap prevents us from reordering terms for inversion elimination.

Shattering (cont.)
Second condition (used by counting elimination): for every atom pair (p, q) in every parfactor g in G, p and q are never instantiated to the same random variable.
- Good: Friends(A, B) and Friends(B, A), with A != B
- Bad: Friends(A, B) and Friends(B, A), with no constraint
Without the constraint, Friends(A, B) and Friends(B, A) may be instantiated to the same random variable, e.g., Friends(anna, anna), and thus are not independent.
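A small illustration of why the constraint matters; the helper function below is hypothetical, not from the paper's implementation. It enumerates the substitutions under which Friends(A,B) and Friends(B,A) collapse to the same ground atom, with and without A != B.

```python
from itertools import product

domain = ["anna", "bob"]

def clashing_substitutions(constraint=lambda sub: True):
    """Substitutions under which Friends(A,B) and Friends(B,A) become the SAME ground atom."""
    clashes = []
    for a, b in product(domain, repeat=2):
        sub = {"A": a, "B": b}
        # Friends(a, b) and Friends(b, a) name the same random variable exactly when a == b.
        if constraint(sub) and a == b:
            clashes.append(sub)
    return clashes

print(clashing_substitutions())
# -> [{'A': 'anna', 'B': 'anna'}, {'A': 'bob', 'B': 'bob'}]

print(clashing_substitutions(lambda sub: sub["A"] != sub["B"]))
# -> []  (the constraint A != B removes the clashes, so the second condition holds)
```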

Inversion elimination
Requirements:
- E = {e}, a single atom (possibly parameterized).
- LV(e) = LV(g): all logical variables that appear in e's parfactor g also appear as parameters of e.
Example: suppose A_g = {Friends(A,B), Smokes(A), Smokes(B)}.
- Good: E = {Friends(A,B)}
- Bad: E = {Smokes(B)}, because there are fewer instantiations of Smokes(B) than of the parfactor g, since g ranges over more logical variables.

Inversion elimination (cont.)
Because the parfactors were shattered, every term in the product is independent of the others, allowing us to invert the sum and the product (see the paper for full details).
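In symbols, this is the swap of sum and product that gives the method its name; a sketch for a single parameterized atom e with LV(e) = LV(g):

```latex
\sum_{\mathrm{RV}(e)} \;\prod_{\theta \in \Theta_g} \phi_g\!\big(A_g\theta\big)
\;=\;
\prod_{\theta \in \Theta_g} \;\sum_{e\theta} \phi_g\!\big(A_g\theta\big)
```

The sum over all joint assignments to the groundings of e factorizes into one small sum per grounding, over the values of the single ground atom eθ; by symmetry these inner sums are all equal, so the cost does not grow with the number of groundings.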

Counting elimination
Inversion elimination cannot be applied to, e.g., A_g = {Professor(A), IsQualsCourse(B)}, since no single atom contains all of g's logical variables. In counting elimination, the set E consists of multiple atoms, chosen so that the remaining atoms in g are ground, and each atom may take on one of several values (e.g., True or False for predicates).
Key idea: sum out the atoms by counting the number of groundings for each configuration (an independent assignment of atoms to values). (See the paper for further details.)
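An illustrative special case of the counting idea, under the simplifying assumption of a single binary atom with n interchangeable groundings: rather than summing over all 2^n assignments, sum over how many groundings are true and weight each count by the number of assignments that realize it.

```latex
\sum_{v \in \{0,1\}^{n}} \;\prod_{i=1}^{n} \phi(v_i)
\;=\;
\sum_{k=0}^{n} \binom{n}{k}\, \phi(\mathrm{true})^{k}\, \phi(\mathrm{false})^{\,n-k}
```

With several atoms or more than two values, the binomial coefficient becomes a multinomial coefficient, which is the source of the potentially large factorials mentioned above.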