Recursive Random Fields Daniel Lowd University of Washington (Joint work with Pedro Domingos)

One-Slide Summary
Question: How do we represent uncertainty in relational domains?
State of the art: Markov logic [Richardson & Domingos, 2004]. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula: P(World) = 1/Z exp(w1 n1 + w2 n2 + ...), where ni is the number of true groundings of formula i.
Problem: Only the top-level conjunction and the universal quantifiers are probabilistic.
Solution: Recursive random fields (RRFs). An RRF is an MLN whose features are themselves MLNs.
Inference: Gibbs sampling, iterated conditional modes. Learning: back-propagation.

Overview
Example: Friends and Smokers
Recursive random fields: representation, inference, learning
Experiments: databases with probabilistic integrity constraints
Future work and conclusion

Example: Friends and Smokers [Richardson and Domingos, 2004]
Predicates: Smokes(x); Cancer(x); Friends(x,y)
We wish to represent beliefs such as:
Smoking causes cancer.
Friends of friends are friends (transitivity).
Everyone has a friend who smokes.

First-Order Logic  Sm(x)   Ca(x)  Fr(x,y)  Fr(y,z) Fr(x,z)   x x  x,y,z  x x Fr(x,y) Sm(y)   y y Logical

Markov Logic
[Diagram: the same formula tree, now weighted; a "Probabilistic" top level 1/Z exp(Σ ...) sits above the "Logical" formulas]
w1: ∀x. ¬Sm(x) ∨ Ca(x)
w2: ∀x,y,z. ¬Fr(x,y) ∨ ¬Fr(y,z) ∨ Fr(x,z)
w3: ∀x. ∃y. Fr(x,y) ∧ Sm(y)

Markov Logic
Grounding the existential formula ∀x. ∃y. Fr(x,y) ∧ Sm(y) over n objects turns each ∃y into a disjunction of n conjunctions; converted to CNF, each grounding explodes into 2^n clauses!

Markov Logic
[Diagram: the same weighted formula tree, with the top-level 1/Z exp(Σ ...) written as a feature f0]
w1: ∀x. ¬Sm(x) ∨ Ca(x)
w2: ∀x,y,z. ¬Fr(x,y) ∨ ¬Fr(y,z) ∨ Fr(x,z)
w3: ∀x. ∃y. Fr(x,y) ∧ Sm(y)
Where: f_i(x) = 1/Z_i exp(Σ ...)
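To make the weighted-formula idea concrete, here is a minimal sketch (not from the talk) of how an MLN scores a possible world: each formula contributes its weight once per true grounding, and the world's probability is proportional to the exponential of the weighted sum. The tiny domain, the ground atoms, and the weight value are illustrative only.

```python
import math

# A toy possible world over two people: which ground atoms are true.
world = {
    ("Smokes", ("Anna",)): True,
    ("Smokes", ("Bob",)): False,
    ("Cancer", ("Anna",)): True,
    ("Cancer", ("Bob",)): False,
    ("Friends", ("Anna", "Bob")): True,
    ("Friends", ("Bob", "Anna")): True,
}
people = ["Anna", "Bob"]

def holds(pred, *args):
    return world.get((pred, args), False)

# Weighted formula "smoking causes cancer": Sm(x) => Ca(x), with an illustrative weight.
w1 = 1.5

def smoking_causes_cancer(x):
    return (not holds("Smokes", x)) or holds("Cancer", x)

# An MLN's unnormalized log-probability is the weighted count of true groundings.
n1 = sum(smoking_causes_cancer(x) for x in people)   # true groundings of formula 1
log_p = w1 * n1                                      # plus w2*n2 + w3*n3 for the other formulas
print("unnormalized P(world) = exp(%.1f) = %.3f" % (log_p, math.exp(log_p)))
```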

Recursive Random Fields
[Diagram: the same tree, now probabilistic at every level. The top feature f0 weights ∀x f1(x), ∀x,y,z f2(x,y,z), and ∀x f3(x) with w1, w2, w3; f1(x) is built from Sm(x) and Ca(x); f2(x,y,z) from Fr(x,y), Fr(y,z), and Fr(x,z); f3(x) from ∃y f4(x,y); and f4(x,y) from Fr(x,y) and Sm(y), with weights w4 through w11 on these lower edges.]
Where: f_i(x) = 1/Z_i exp(Σ ...)

The RRF Model
RRF features are parameterized and are grounded using objects in the domain.
Leaves = predicates: the leaf features are simply the truth values of ground atoms.
Recursive features are built up from other RRF features: f_i(x) = 1/Z_i exp(Σ_j w_ij f_j(x)).
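As a rough illustration of the recursive definition, a short Python sketch under the assumption that each feature is the exponentiated weighted sum of its children, matching the f_i(x) = 1/Z_i exp(...) form above; the function names, weights, and example atoms are mine, not from the talk.

```python
import math

def leaf(truth_value):
    """Leaf feature: just the truth value (0 or 1) of a ground predicate."""
    return float(truth_value)

def rrf_feature(weights, children):
    """Recursive feature: exponentiated weighted sum of its child feature values.
    The per-feature normalizer Z_i is dropped here; it only rescales the value."""
    return math.exp(sum(w * c for w, c in zip(weights, children)))

# f1(Anna): a soft version of Sm(Anna) => Ca(Anna), built from two leaves.
sm_anna, ca_anna = leaf(True), leaf(False)
f1_anna = rrf_feature([-1.0, 1.0], [sm_anna, ca_anna])   # illustrative weights
# The parent feature then treats f1 like any other child feature.
f0 = rrf_feature([2.0], [f1_anna])
print(f1_anna, f0)
```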

Representing Logic: AND, OR, FORALL, EXIST
AND: (x1 ∧ ... ∧ xn) ≈ 1/Z exp(w1 x1 + ... + wn xn)
OR: (x1 ∨ ... ∨ xn) ⇔ ¬(¬x1 ∧ ... ∧ ¬xn) ≈ −1/Z exp(−w1 x1 − ... − wn xn)   [De Morgan: (x ∨ y) ⇔ ¬(¬x ∧ ¬y)]
FORALL: ∀a: f(a) ≈ 1/Z exp(w x1 + w x2 + ...), with a single weight w shared across groundings
EXIST: ∃a: f(a) ⇔ ¬(∀a: ¬f(a)) ≈ −1/Z exp(−w x1 − w x2 − ...)
[Plot: P(World) as a function of the number of true literals (0 ... n)]
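A compact sketch of the four soft connectives written directly from the equations above; the shared weight, the dropped normalizers 1/Z, and the example data are placeholders, and the sign convention for OR/EXIST follows the slide (a negated feature contributes negatively to its parent).

```python
import math

def soft_and(xs, w=2.0):
    # (x1 AND ... AND xn) ~ 1/Z exp(w1 x1 + ... + wn xn): largest when every xi = 1.
    return math.exp(sum(w * x for x in xs))

def soft_or(xs, w=2.0):
    # De Morgan: OR is the negation of an AND of negations, so negate the weights
    # and let the feature enter its parent with a negated contribution.
    return -math.exp(sum(-w * x for x in xs))

def soft_forall(f, domain, w=2.0):
    # FORALL a: f(a) is a soft conjunction over all groundings, sharing one weight w.
    return soft_and([f(a) for a in domain], w)

def soft_exists(f, domain, w=2.0):
    # EXISTS a: f(a) = NOT(FORALL a: NOT f(a)), a soft disjunction over groundings.
    return soft_or([f(a) for a in domain], w)

smokes = {"Anna": 1, "Bob": 0}
print(soft_forall(lambda a: smokes[a], smokes))  # exp(2): below exp(4), since not everyone smokes
print(soft_exists(lambda a: smokes[a], smokes))  # -exp(-2): above -exp(0), since someone smokes
```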

Distributions MLNs and RRFs can compactly represent
Distribution                       MLNs   RRFs
Propositional MRF                  Yes    Yes
Deterministic KB                   Yes    Yes
Soft conjunction                   Yes    Yes
Soft universal quantification      Yes    Yes
Soft disjunction                   No     Yes
Soft existential quantification    No     Yes
Soft nested formulas               No     Yes

Inference and Learning
Inference:
MAP: iterated conditional modes (ICM)
Conditional probabilities: Gibbs sampling
Learning:
Back-propagation
Pseudo-likelihood
RRF weight learning is more powerful than MLN structure learning (cf. KBANN)
More flexible theory revision
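For intuition, a minimal sketch of the Gibbs sampler over ground atoms; it assumes only that some model (MLN or RRF) exposes an unnormalized log-score for a complete world, and the names here are illustrative rather than the talk's actual implementation.

```python
import math
import random

def gibbs_marginals(init_world, log_score, n_sweeps=1000, seed=0):
    """Estimate P(atom = true) for every ground atom, given a function
    log_score(world) returning the model's unnormalized log-probability."""
    rng = random.Random(seed)
    world = dict(init_world)
    counts = {atom: 0 for atom in world}
    for _ in range(n_sweeps):
        for atom in world:
            # Score both settings of this atom with everything else held fixed.
            world[atom] = True
            log_p_true = log_score(world)
            world[atom] = False
            log_p_false = log_score(world)
            p_true = 1.0 / (1.0 + math.exp(log_p_false - log_p_true))
            world[atom] = rng.random() < p_true
            counts[atom] += world[atom]
    return {atom: counts[atom] / n_sweeps for atom in world}
```

Iterated conditional modes is the same sweep with the sampling line replaced by `world[atom] = log_p_true > log_p_false`, so each atom is set to its locally most probable value.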

Experiments: Databases with Probabilistic Integrity Constraints
Integrity constraints are first-order logic:
Inclusion: "If x is in table R, it must also be in table S."
Functional dependency: "In table R, each x determines a unique y."
We need to make them probabilistic: a perfect application for MLNs/RRFs.

Experiment 1: Inclusion Constraints
Task: clean a corrupt database.
Relations:
ProjectLead(x,y): x is in charge of project y
ManagerOf(x,z): x manages employee z
Corrupt versions: ProjectLead'(x,y); ManagerOf'(x,z)
Constraints:
Every project leader manages at least one employee, i.e., ∀x. (∃y. ProjectLead(x,y)) ⇒ (∃z. ManagerOf(x,z))
The corrupt database is related to the original database, i.e., ProjectLead(x,y) ⇔ ProjectLead'(x,y)

Experiment 1: Inclusion Constraints
Data:
100 people, 100 projects
25% of people are managers of ~10 projects each, and manage ~5 employees per project
Extra ManagerOf(x,y) relations were added
Predicate truth values were flipped with probability p
Models:
Converted the FOL constraints to an MLN and an RRF
Maximized pseudo-likelihood
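A sketch of how corrupted evidence of this shape could be generated; the sizes and the flip probability follow the slide, but the exact generator used in the experiments is not shown, so treat the details as assumptions.

```python
import random

rng = random.Random(0)
people, projects = list(range(100)), list(range(100))
p_flip = 0.1  # placeholder: the experiments vary this corruption level

# Ground truth: roughly a quarter of the people lead ~10 projects each.
project_lead = {(x, y): False for x in people for y in projects}
for x in people:
    if rng.random() < 0.25:
        for y in rng.sample(projects, 10):
            project_lead[(x, y)] = True
# (ManagerOf(x, z) would be generated analogously, ~5 employees per project.)

# Corrupt copy: every truth value is flipped independently with probability p_flip.
project_lead_corrupt = {
    atom: (not value) if rng.random() < p_flip else value
    for atom, value in project_lead.items()
}
```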

Experiment 1: Results

Experiment 2: Functional Dependencies
Task: determine which company names are pseudonyms.
Relation:
Supplier(TaxID, CompanyName, PartType): describes a company that supplies parts
Constraint:
Company names with the same TaxID are equivalent, i.e., ∀x,y1,y2. (∃z1,z2. Supplier(x,y1,z1) ∧ Supplier(x,y2,z2)) ⇒ y1 = y2
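To make the constraint concrete, a small sketch that flags violations of the hard functional dependency on a toy Supplier relation; in the experiments the constraint is softened with a weight rather than enforced, and the example rows are invented.

```python
from collections import defaultdict

# Supplier(TaxID, CompanyName, PartType) as a set of rows (toy data).
supplier = {
    ("T1", "Acme Corp", "bolts"),
    ("T1", "Acme Corporation", "nuts"),   # same tax ID, different name: likely pseudonyms
    ("T2", "Bolts R Us", "bolts"),
}

names_by_taxid = defaultdict(set)
for tax_id, name, part in supplier:
    names_by_taxid[tax_id].add(name)

# The constraint says any two names sharing a tax ID must be equal,
# so a tax ID with more than one name is a (soft) violation.
for tax_id, names in sorted(names_by_taxid.items()):
    if len(names) > 1:
        print(tax_id, "has multiple names (candidate pseudonyms):", sorted(names))
```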

Experiment 2: Functional Dependencies
Data:
30 tax IDs, 30 company names, 30 part types
Each company supplies 25% of all part types
Each company has k names
Company names are changed with probability p
Models:
Converted the FOL constraints to an MLN and an RRF
Maximized pseudo-likelihood

Experiment 2: Results

Future Work
Scaling up: pruning, caching; alternatives to Gibbs sampling, ICM, and gradient descent
Experiments with real-world databases: probabilistic integrity constraints; information extraction, etc.
Extract information a la TREPAN (Craven and Shavlik, 1995)

Conclusion
Recursive random fields:
− Less intuitive than Markov logic
− More computationally costly
+ Compactly represent many distributions MLNs cannot
+ Make conjunctions, existentials, and nested formulas probabilistic
+ Offer new methods for structure learning and theory revision
Questions?