Markov Logic and Other SRL Approaches

Overview
- Statistical relational learning
- Markov logic
- Basic inference
- Basic learning

Statistical Relational Learning
Goals:
- Combine (subsets of) logic and probability into a single language
- Develop efficient inference algorithms
- Develop efficient learning algorithms
- Apply to real-world problems

Reference: L. Getoor & B. Taskar (eds.), Introduction to Statistical Relational Learning, MIT Press, 2007.

Plethora of Approaches
- Knowledge-based model construction [Wellman et al., 1992]
- Stochastic logic programs [Muggleton, 1996]
- Probabilistic relational models [Friedman et al., 1999]
- Relational Markov networks [Taskar et al., 2002]
- Bayesian logic [Milch et al., 2005]
- Markov logic [Richardson & Domingos, 2006]
- And many others!

Key Dimensions
- Logical language: first-order logic, Horn clauses, frame systems
- Probabilistic language: Bayes nets, Markov nets, PCFGs
- Type of learning: generative / discriminative; structure / parameters; knowledge-rich / knowledge-poor
- Type of inference: MAP / marginal; full grounding / partial grounding / lifted

Knowledge-Based Model Construction
- Logical language: Horn clauses
- Probabilistic language: Bayes nets
  - Ground atom → node
  - Head of clause → child node
  - Body of clause → parent nodes
  - More than one clause with the same head → combining function
- Learning: ILP + EM
- Inference: partial grounding + belief propagation

Stochastic Logic Programs
- Logical language: Horn clauses
- Probabilistic language: probabilistic context-free grammars
  - Attach probabilities to clauses
  - Probabilities of clauses with the same head sum to 1
- Learning: ILP + "failure-adjusted" EM
- Inference: enumerate all proofs, add up their probabilities

Probabilistic Relational Models
- Logical language: frame systems
- Probabilistic language: Bayes nets
  - Bayes net template for each class of objects
  - An object's attributes can depend on the attributes of related objects
  - Only binary relations
  - No dependencies of relations on relations
- Learning:
  - Parameters: closed form (EM if data is missing)
  - Structure: "tiered" Bayes net structure search
- Inference: full grounding + belief propagation

Relational Markov Networks
- Logical language: SQL queries
- Probabilistic language: Markov nets
  - SQL queries define cliques
  - Potential function for each query
  - No uncertainty over relations
- Learning: discriminative weight learning; no structure learning
- Inference: full grounding + belief propagation

Bayesian Logic
- Logical language: first-order semantics
- Probabilistic language: Bayes nets
  - A BLOG program specifies how to generate a relational world
  - Parameters defined separately in Java functions
  - Allows unknown objects
  - May create Bayes nets with directed cycles
- Learning: none to date
- Inference: MCMC with user-supplied proposal distribution; partial grounding

Markov Logic
- Logical language: first-order logic
- Probabilistic language: Markov networks
  - Syntax: first-order formulas with weights
  - Semantics: templates for Markov net features
- Learning:
  - Parameters: generative or discriminative
  - Structure: ILP with arbitrary clauses and MAP score
- Inference:
  - MAP: weighted satisfiability
  - Marginal: MCMC with moves proposed by a SAT solver
  - Partial grounding + lazy inference / lifted inference

Markov Logic
- Most developed approach to date
- Many other approaches can be viewed as special cases
- Main focus of the rest of this class

Markov Logic: Intuition
- A logical KB is a set of hard constraints on the set of possible worlds
- Let's make them soft constraints: when a world violates a formula, it becomes less probable, not impossible
- Give each formula a weight (higher weight ⇒ stronger constraint)

Markov Logic: Definition
- A Markov logic network (MLN) is a set of pairs (F, w), where:
  - F is a formula in first-order logic
  - w is a real number
- Together with a set of constants, it defines a Markov network with:
  - One node for each grounding of each predicate in the MLN
  - One feature for each grounding of each formula F in the MLN, with the corresponding weight w

Example: Friends & Smokers
Smoking causes cancer:
∀x Smokes(x) ⇒ Cancer(x)    (weight 1.5)
Friends have similar smoking habits:
∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))    (weight 1.1)

Example: Friends & Smokers
Two constants: Anna (A) and Bob (B)
Ground atoms: Smokes(A), Smokes(B), Cancer(A), Cancer(B), Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B)
Each grounding of each formula becomes a feature linking the ground atoms it mentions.
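
To make the construction concrete, here is a minimal Python sketch (toy code, not from any MLN library; all names are hypothetical) that enumerates the ground atoms and weighted ground formulas this MLN yields for the two constants:

```python
from itertools import product

constants = ["A", "B"]  # Anna, Bob

# One node per grounding of each predicate.
ground_atoms = (
    [("Smokes", c) for c in constants]
    + [("Cancer", c) for c in constants]
    + [("Friends", c1, c2) for c1, c2 in product(constants, repeat=2)]
)
print(len(ground_atoms), "ground atoms")  # 8

# One feature per grounding of each formula, with the formula's weight.
ground_formulas = (
    [(1.5, f"Smokes({c}) => Cancer({c})") for c in constants]
    + [(1.1, f"Friends({c1},{c2}) => (Smokes({c1}) <=> Smokes({c2}))")
       for c1, c2 in product(constants, repeat=2)]
)
for w, f in ground_formulas:
    print(w, f)  # 2 + 4 = 6 ground formulas
```

Note that with n constants, Friends alone contributes n² ground atoms, which is why keeping the ground network small matters (see the next slide).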

Markov Logic Networks
- An MLN is a template for ground Markov nets
- Probability of a world x:

  $P(x) = \frac{1}{Z} \exp\Big(\sum_i w_i\, n_i(x)\Big)$

  where $w_i$ is the weight of formula i and $n_i(x)$ is the number of true groundings of formula i in x
- Typed variables and constants greatly reduce the size of the ground Markov net
- Extensions: functions, existential quantifiers, etc.; infinite and continuous domains
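
As an illustration of this definition, here is a brute-force sketch (hypothetical names, feasible only for tiny domains) that computes P(x) for the two-constant Friends & Smokers network by enumerating all 2^8 worlds, using the 1.5 and 1.1 weights from the example above:

```python
import math
from itertools import product

CONSTS = ["A", "B"]
ATOMS = ([("Smokes", c) for c in CONSTS]
         + [("Cancer", c) for c in CONSTS]
         + [("Friends", c1, c2) for c1 in CONSTS for c2 in CONSTS])
WEIGHTS = [1.5, 1.1]  # weights of the two formulas

def n(world):
    """n_i(world): number of true groundings of each formula."""
    # Formula 0: Smokes(x) => Cancer(x)
    n0 = sum((not world[("Smokes", c)]) or world[("Cancer", c)]
             for c in CONSTS)
    # Formula 1: Friends(x,y) => (Smokes(x) <=> Smokes(y))
    n1 = sum((not world[("Friends", c1, c2)])
             or (world[("Smokes", c1)] == world[("Smokes", c2)])
             for c1 in CONSTS for c2 in CONSTS)
    return [n0, n1]

def unnormalized(world):
    return math.exp(sum(w * k for w, k in zip(WEIGHTS, n(world))))

worlds = [dict(zip(ATOMS, vals))
          for vals in product([False, True], repeat=len(ATOMS))]
Z = sum(unnormalized(w) for w in worlds)  # partition function

x = {a: True for a in ATOMS}  # world where every ground atom is true
print("P(x) =", unnormalized(x) / Z)
```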

Relation to Statistical Models
Special cases, obtained by making all predicates zero-arity:
- Markov networks / Markov random fields
- Bayesian networks
- Log-linear models / exponential models / max-entropy models
- Gibbs distributions / Boltzmann machines
- Logistic regression
- Hidden Markov models / conditional random fields
Markov logic allows objects to be interdependent (non-i.i.d.)

Relation to First-Order Logic
- Infinite weights ⇒ first-order logic
- Satisfiable KB + positive weights ⇒ satisfying assignments = modes of the distribution
- Markov logic allows contradictions between formulas

MAP Inference
Problem: find the most likely state of the world given evidence, i.e., the query atoms y given the evidence atoms x:

$\arg\max_y P(y \mid x)$

$= \arg\max_y \frac{1}{Z_x} \exp\Big(\sum_i w_i\, n_i(x, y)\Big)$

$= \arg\max_y \sum_i w_i\, n_i(x, y)$

This is just the weighted MaxSAT problem. Use a weighted SAT solver (e.g., MaxWalkSAT [Kautz et al., 1997]).

The MaxWalkSAT Algorithm

for i ← 1 to max-tries do
    solution ← random truth assignment
    for j ← 1 to max-flips do
        if Σ weights(sat. clauses) > threshold then
            return solution
        c ← random unsatisfied clause
        with probability p
            flip a random variable in c
        else
            flip the variable in c that maximizes Σ weights(sat. clauses)
return failure, best solution found
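
A minimal Python rendering of this pseudocode, assuming clauses are given in weighted CNF as (weight, literal-list) pairs; the data layout and names are my own, not those of any particular solver:

```python
import random

def maxwalksat(clauses, n_vars, threshold, max_tries=10, max_flips=1000, p=0.5):
    """clauses: list of (weight, literals); a literal is +v or -v for
    variable v in 1..n_vars. Returns a truth assignment (dict v -> bool)."""

    def sat_weight(a):
        # Total weight of clauses satisfied under assignment a.
        return sum(w for w, lits in clauses
                   if any((lit > 0) == a[abs(lit)] for lit in lits))

    def weight_if_flipped(a, v):
        a[v] = not a[v]
        w = sat_weight(a)
        a[v] = not a[v]          # undo the trial flip
        return w

    best, best_w = None, float("-inf")
    for _ in range(max_tries):
        a = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            w = sat_weight(a)
            if w > best_w:
                best, best_w = dict(a), w
            if w > threshold:
                return a          # good enough: return this solution
            unsat = [lits for wt, lits in clauses
                     if not any((lit > 0) == a[abs(lit)] for lit in lits)]
            if not unsat:
                return a          # everything satisfied
            c = random.choice(unsat)             # random unsatisfied clause
            if random.random() < p:
                v = abs(random.choice(c))        # flip a random variable in c
            else:                                # flip the best variable in c
                v = max((abs(lit) for lit in c),
                        key=lambda u: weight_if_flipped(a, u))
            a[v] = not a[v]
    return best  # failure: best assignment found

# Hypothetical usage: clauses (x1 or not x2, w=1.5) and (x2, w=1.1)
# maxwalksat([(1.5, [1, -2]), (1.1, [2])], n_vars=2, threshold=2.5)
```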

Computing Probabilities
- P(Formula | MLN, C) = ?
  - Brute force: sum the probabilities of the worlds where the formula holds
  - MCMC: sample worlds, check whether the formula holds
- P(Formula1 | Formula2, MLN, C) = ?
  - Discard worlds where Formula2 does not hold
- In practice: more efficient alternatives
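
For intuition, here is a brute-force sketch of both queries (hypothetical helper names; practical only for a handful of ground atoms, which is exactly why MCMC and the more efficient alternatives matter):

```python
import math
from itertools import product

def world_probs(atoms, weights, counts):
    """Enumerate all truth assignments over `atoms`; `counts(world)`
    returns the vector n_i(world). Yields (world, unnormalized prob)."""
    for vals in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, vals))
        score = sum(w * k for w, k in zip(weights, counts(world)))
        yield world, math.exp(score)

def conditional_prob(query, evidence, atoms, weights, counts):
    """P(Formula1 | Formula2) by brute force: discard worlds where the
    conditioning formula does not hold, then sum probabilities."""
    num = den = 0.0
    for world, p in world_probs(atoms, weights, counts):
        if evidence(world):          # keep only worlds satisfying Formula2
            den += p
            if query(world):         # of those, add up the Formula1 worlds
                num += p
    return num / den
```

With evidence set to a predicate that is true in every world, this reduces to the unconditional P(Formula | MLN, C).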

Learning
- Data is a relational database
- For now: closed-world assumption (if not: EM)
- Learning parameters (weights): similar to learning weights for Markov networks
- Learning structure (formulas): a form of inductive logic programming; also related to learning features for Markov nets

Weight Learning
- Parameter tying: groundings of the same clause share one weight
- Gradient of the log-likelihood:

  $\frac{\partial}{\partial w_i} \log P_w(x) = n_i(x) - E_w[n_i(x)]$

  where $n_i(x)$ is the number of times clause i is true in the data and $E_w[n_i(x)]$ is the expected number of times clause i is true according to the MLN
- Generative learning: pseudo-likelihood
- Discriminative learning: conditional likelihood, using MaxWalkSAT for inference
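
A minimal sketch of this gradient, computed exactly by enumerating all worlds (only feasible for tiny domains; the function and argument names are hypothetical — real learners approximate the expectation, e.g., with pseudo-likelihood or sampling):

```python
import math
from itertools import product

def log_likelihood_gradient(atoms, weights, counts, data_world):
    """d/dw_i log P_w(data) = n_i(data) - E_w[n_i(x)], by exhaustive
    enumeration of all truth assignments over `atoms`."""
    worlds = [dict(zip(atoms, vals))
              for vals in product([False, True], repeat=len(atoms))]
    unnorm = [math.exp(sum(w * k for w, k in zip(weights, counts(x))))
              for x in worlds]
    Z = sum(unnorm)
    expected = [0.0] * len(weights)
    for x, u in zip(worlds, unnorm):
        for i, k in enumerate(counts(x)):
            expected[i] += k * u / Z     # E_w[n_i(x)]
    observed = counts(data_world)        # n_i(data)
    return [o - e for o, e in zip(observed, expected)]
```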

Alchemy
Open-source software including:
- Full first-order logic syntax
- Inference (MAP and conditional probabilities)
- Weight learning (generative and discriminative)
- Structure learning
- Programming language features
alchemy.cs.washington.edu

                  Alchemy                     Prolog             BUGS
Representation    F.O. logic + Markov nets    Horn clauses       Bayes nets
Inference         Satisfiability, MCMC, BP    Theorem proving    Gibbs sampling
Learning          Parameters & structure      No                 Parameters
Uncertainty       Yes                         No                 Yes
Relational        Yes                         Yes                No