Speeding Up Inference in Markov Logic Networks by Preprocessing to Reduce the Size of the Resulting Grounded Network
Jude Shavlik and Sriraam Natarajan
Computer Sciences Department, University of Wisconsin, Madison, USA

Markov Logic Networks (Richardson & Domingos, MLj 2006)
A probabilistic, first-order logic.
Key idea: compactly represent large graphical models using weighted formulas such as  w : ∀x, y, z f(x, y, z)
Standard approach:
1) assume a finite number of constants
2) create all possible groundings
3) perform statistical inference (often via sampling)
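
To make step 2 concrete, here is a minimal sketch of grounding one weighted formula over a finite set of constants; the formula, names, and constants are illustrative, not taken from the talk:

```python
from itertools import product

constants = ["Anna", "Bob"]
weight = 1.5  # w : forall x, y  Friends(x, y) -> Smokes(x)

# Step 2 of the standard approach: one ground clause per tuple of constants.
groundings = [(weight, f"Friends({x},{y}) -> Smokes({x})")
              for x, y in product(constants, repeat=2)]
for w, clause in groundings:
    print(w, clause)  # 4 ground clauses for 2 constants and 2 variables
```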

The Challenge We Address
Creating all possible groundings can be daunting. A story …
Given: an MLN and data
Do: quickly find an equivalent, reduced MLN

Computing Probabilities in MLNs

Probability(World S) = (1/Z) · exp( Σ_{i ∈ formulae} weight_i · numberTimesTrue(f_i, S) )
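
As a sanity check of the formula above, here is a small sketch that computes world probabilities by explicit normalization over a toy set of worlds (the weights and counts are made up for illustration):

```python
import math

def world_probs(weights, counts_per_world):
    """P(S) = exp(sum_i w_i * n_i(S)) / Z, with Z summed over the listed worlds."""
    scores = [math.exp(sum(w * n for w, n in zip(weights, counts)))
              for counts in counts_per_world]
    z = sum(scores)  # the partition function over this explicit world set
    return [s / z for s in scores]

# Two formulas with weights 1.5 and 0.8; three worlds with their
# numberTimesTrue counts for each formula.
print(world_probs([1.5, 0.8], [[3, 1], [2, 2], [0, 0]]))
```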

Counting Satisfied Groundings
There is typically lots of redundancy in FOL sentences:
∀x, y, z p(x) ⋀ q(x, y, z) ⋀ r(z) → w(x, y, z)
If p(John) = false, then the formula is true for all Y and Z values.
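
The redundancy is easy to see by enumeration; in this toy sketch (domain and evidence invented for illustration), a single false p(x) settles nine of the 27 groundings without ever consulting q, r, or w:

```python
from itertools import product

domain = ["John", "Mary", "Ann"]
p = {"John": False, "Mary": True, "Ann": True}  # evidence for p(x)

settled = 0
for x, y, z in product(domain, repeat=3):
    if not p[x]:
        # Antecedent is false, so the implication holds for every y and z;
        # the grounding is satisfied without checking q(x,y,z), r(z), or w(x,y,z).
        settled += 1
print(settled, "of", len(domain) ** 3, "groundings settled by p alone")  # 9 of 27
```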

Some Terminology
Three kinds of literals ('predicates'):
Evidence: truth value known
Query: want to know the probabilities of these
Hidden: all others

Factoring Out the Evidence
Let A = the weighted sum of the formula groundings satisfied by the evidence.
Let B_i = the weighted sum of the formula groundings in world i not satisfied by the evidence.

Prob(world_i) = e^(A + B_i) / (e^(A + B_1) + … + e^(A + B_n)) = e^(B_i) / (e^(B_1) + … + e^(B_n))

The common factor e^A cancels, so the evidence's contribution never needs to be recomputed per world.
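
A quick numerical check that the evidence term A cancels (the numbers below are arbitrary):

```python
import math

def probs(bonus, Bs):
    """Normalize e^(bonus + B_i) over worlds; 'bonus' plays the role of A."""
    scores = [math.exp(bonus + b) for b in Bs]
    z = sum(scores)
    return [s / z for s in scores]

with_A = probs(50.0, [1.0, 2.5, 0.0])    # e^(A+B_i) / sum_j e^(A+B_j)
without_A = probs(0.0, [1.0, 2.5, 0.0])  # e^(B_i) / sum_j e^(B_j)
print(all(abs(a - b) < 1e-12 for a, b in zip(with_A, without_A)))  # True
```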

Key Idea of Our FROG Algorithm
Efficiently factor out those formula groundings that the evidence satisfies.
Can produce Markov networks that are many orders of magnitude smaller.
Can eliminate the need for approximate inference, if the resulting Markov net is small or disconnected enough.
The resulting Markov net is compatible with other speed-up methods, such as lifted inference, lazy inference, and knowledge-based model construction.

Worked Example
∀x, y, z GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x, z) ⋀ SameGroup(y, z) → AdvisedBy(x, y)

The evidence:
10,000 people at some school
2,000 graduate students
1,000 professors
1,000 TAs
500 pairs of professors in the same group

Total number of groundings = |x| · |y| · |z| = 10^4 · 10^4 · 10^4 = 10^12

GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x,z) ⋀ SameGroup(y,z) → AdvisedBy(x,y)
Reducing on GradStudent(x): of the 10,000 people, 2,000 are grad students and 8,000 are others. Every X with GradStudent(x) false satisfies the clause regardless of Y and Z, so FROG keeps only the grad-student X values. Instead of 10^4 values for X, we have 2 × 10^3, leaving 2 × 10^3 × 10^4 × 10^4 = 2 × 10^11 groundings.

GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x,z) ⋀ SameGroup(y,z) → AdvisedBy(x,y)
Reducing on Prof(y): 1,000 of the 10,000 people are professors; for the 9,000 others the clause is satisfied, so only the professor Y values are kept, leaving 2 × 10^3 × 10^3 × 10^4 = 2 × 10^10 groundings.

GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x,z) ⋀ SameGroup(y,z) → AdvisedBy(x,y)
Reducing on Prof(z) in the same way leaves 2 × 10^9 groundings, already far fewer than the original 10^12.

GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x,z) ⋀ SameGroup(y,z) → AdvisedBy(x,y)
Reducing on SameGroup(y, z): of the 10^6 Y:Z combinations, only 1,000 SameGroup atoms are true; the clause is satisfied for the other 10^6 − 1,000. With 2,000 values of X and 1,000 surviving Y:Z combinations, 2 × 10^6 groundings remain.

GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x,z) ⋀ SameGroup(y,z) → AdvisedBy(x,y)
Reducing on TA(x, z): of the 2 × 10^6 remaining combinations, only 1,000 TA atoms are true, leaving ≤ 1,000 values of X and ≤ 1,000 Y:Z combinations, so ≤ 10^6 groundings.

GradStudent(x) ⋀ Prof(y) ⋀ Prof(z) ⋀ TA(x,z) ⋀ SameGroup(y,z) → AdvisedBy(x,y)
Original number of groundings = 10^12; final number of groundings ≤ 10^6.
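
The whole reduction can be replayed with the counts from the slides; this sketch is just that arithmetic, not the authors' implementation:

```python
people, grads, profs = 10_000, 2_000, 1_000
true_same_group = 1_000  # 500 professor pairs, counted in both orders
true_ta = 1_000

full = people ** 3                          # 10**12 groundings of (x, y, z)
after_grad_x = grads * people * people      # keep grad-student x: 2 * 10**11
after_prof_y = grads * profs * people       # keep professor y:    2 * 10**10
after_prof_z = grads * profs * profs        # keep professor z:    2 * 10**9
after_same_group = grads * true_same_group  # 1000 true y:z combos: 2 * 10**6
after_ta = true_ta * true_same_group        # <= 1000 x, <= 1000 y:z: <= 10**6

print(f"{full:.0e} -> {after_ta:.0e}")      # 1e+12 -> 1e+06
```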

Some Algorithmic Details
FROG initially stores groundings in O(10^4) space.
Storage needs grow because literals cause variables to 'interact': P(x, y, z) might require O(10^12) space.
The order in which literals are 'reduced' impacts storage needs; a simple heuristic (see paper) chooses the next literal to process, or all permutations can be tried.
Inference rules can be merged after reduction: after reduction, the example rule above involves only AdvisedBy(x, y).
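
As a rough illustration of these details, here is a hedged sketch of a FROG-style reduction loop that processes one evidence literal at a time and keeps only the variable bindings the evidence has not already satisfied. It is a simplification with invented data structures, not the authors' code:

```python
from itertools import product

def frog_reduce(clause, evidence, domains):
    """A minimal sketch of FROG-style reduction (not the authors' code).
    `clause` is a disjunction of literals (pred, vars, positive), where
    positive=False means the literal appears negated. `evidence` maps
    ground atoms (pred, args) -> True/False; unknown atoms are absent.
    Returns the bindings whose groundings the evidence does NOT already
    satisfy (only these need to reach the grounded Markov net)."""
    kept = [dict()]  # surviving partial variable bindings
    for pred, vars_, positive in clause:
        survivors = []
        for binding in kept:
            free = [v for v in vars_ if v not in binding]
            for values in product(*(domains[v] for v in free)):
                b = {**binding, **dict(zip(free, values))}
                truth = evidence.get((pred, tuple(b[v] for v in vars_)))
                if truth is not None and truth == positive:
                    continue  # literal true here: clause satisfied, drop it
                survivors.append(b)
        kept = survivors
    return kept

# Usage: for ¬P(x) ∨ Q(x, y), the evidence P(a)=False satisfies the
# clause whenever x=a, so only the x=b bindings survive.
domains = {"x": ["a", "b"], "y": ["a", "b"]}
clause = [("P", ("x",), False), ("Q", ("x", "y"), True)]
evidence = {("P", ("a",)): False}
print(frog_reduce(clause, evidence, domains))  # [{'x': 'b', 'y': 'a'}, {'x': 'b', 'y': 'b'}]
```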

Empirical Results: CiteSeer
[Chart comparing the size of the fully grounded network with the size of FROG's reduced network on CiteSeer.]

Empirical Results: UWash-CSE
[Chart comparing the fully grounded network, FROG's reduced network, and FROG's reduced network without one challenging rule: advisedBy(x,y) ⋀ advisedBy(x,z) → samePerson(y,z).]

Runtimes
On the full UWash-CSE task (27 rules), FROG takes 4.2 sec.
On CORA (2K rules) and CiteSeer (8K rules), FROG takes less than 700 msec per rule.
On CORA, Alchemy's lazy inference takes 94 mins to create its initial network; FROG takes 30 mins and produces a network small enough (10^6 nodes) that lazy inference is not needed.

Related Work
Lazy MLN inference: Singla & Domingos (2006), Poon et al. (2008). FROG precomputes rather than lazily calculating.
Lifted inference: de Salvo Braz et al. (2005), Singla & Domingos (2008), Milch et al. (2008), Riedel (2008), Kisynski & Poole (2009), Kersting et al. (2009).
Knowledge-based model construction: Wellman et al. (1992). FROG also exploits KBMC.

Future Work
Efficiently handle small changes to the truth values of evidence.
Combine FROG with lifted inference.
Exploit commonality across rules.
Integrate with weight and rule learning.

Conclusion
MLNs count the satisfied groundings of FOL formulas.
There are many ways a formula can be satisfied, e.g. P(x) ∨ Q(x, y) ∨ R(x, y, z) ∨ ¬S(y) ∨ ¬T(x, y).
Our FROG algorithm efficiently counts the groundings satisfied by the evidence.
FROG can reduce the number of groundings by several orders of magnitude.
The reduced network is compatible with lifted and lazy inference, etc.