1
Markov Logic Networks: A Step Towards a Unified Theory of Learning and Cognition
Pedro Domingos, Dept. of Computer Science & Eng., University of Washington. Joint work with Jesse Davis, Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, and Marc Sumner.
2
One Algorithm
Observation: The cortex has the same architecture throughout.
Hypothesis: A single learning/inference algorithm underlies all cognition.
Let’s discover it!
3
The Neuroscience Approach
Map the brain and figure out how it works.
Problem: We don’t know nearly enough.
4
The Engineering Approach
Pick a task (e.g., object recognition).
Figure out how to solve it.
Generalize to other tasks.
Problem: Any one task is too impoverished.
5
The Foundational Approach
Consider all tasks the brain does.
Figure out what they have in common.
Formalize, test, and repeat.
Advantage: Plenty of clues.
Where to start?
"The grand aim of science is to cover the greatest number of experimental facts by logical deduction from the smallest number of hypotheses or axioms." (Albert Einstein)
6
Recurring Themes
Noise, uncertainty, incomplete information → Probability / Graphical models
Complexity, many objects & relations → First-order logic
7
Examples
Field                        | Logical approach             | Statistical approach
Knowledge representation     | First-order logic            | Graphical models
Automated reasoning          | Satisfiability testing       | Markov chain Monte Carlo
Machine learning             | Inductive logic programming  | Neural networks
Planning                     | Classical planning           | Markov decision processes
Natural language processing  | Definite clause grammars     | Prob. context-free grammars
8
Research Plan
Unify graphical models and first-order logic.
Develop learning and inference algorithms.
Apply to a wide variety of problems.
This talk: Markov logic networks.
9
Markov Logic Networks
MLN = Set of first-order formulas with weights.
Formula = Feature template (variables → objects).
E.g., Ising model: Up(x) ^ Neighbor(x,y) => Up(y)
P(x) = (1/Z) exp( Σ_i w_i n_i(x) ), where w_i is the weight of formula i and n_i(x) is the number of true instances (groundings) of formula i in x.
Most graphical models are special cases.
First-order logic is the infinite-weight limit.
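To make the counting concrete, here is a minimal sketch of scoring a world under a one-formula MLN, using the Up/Neighbor formula above; the two-object domain, the weight value, and the brute-force normalization are illustrative assumptions, not how Alchemy computes probabilities.

```python
import itertools
import math

# One weighted formula: Up(x) ^ Neighbor(x,y) => Up(y), with weight W.
# A "world" assigns a truth value to every ground atom.
DOMAIN = ["A", "B"]                      # assumed two-object domain
W = 1.5                                  # assumed weight for the formula

def n_true_groundings(world):
    """Count true groundings of Up(x) ^ Neighbor(x,y) => Up(y)."""
    count = 0
    for x, y in itertools.product(DOMAIN, repeat=2):
        body = world[("Up", x)] and world[("Neighbor", x, y)]
        head = world[("Up", y)]
        if (not body) or head:           # material implication
            count += 1
    return count

def all_worlds():
    atoms = [("Up", o) for o in DOMAIN] + \
            [("Neighbor", x, y) for x, y in itertools.product(DOMAIN, repeat=2)]
    for values in itertools.product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

# P(world) = exp(W * n(world)) / Z, with Z summed over all possible worlds.
Z = sum(math.exp(W * n_true_groundings(w)) for w in all_worlds())
some_world = next(all_worlds())
print(math.exp(W * n_true_groundings(some_world)) / Z)
```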
10
MLN Algorithms: The First Three Generations
Problem            | First generation        | Second generation | Third generation
MAP inference      | Weighted satisfiability | Lazy inference    | Cutting planes
Marginal inference | Gibbs sampling          | MC-SAT            | Lifted inference
Weight learning    | Pseudo-likelihood       | Voted perceptron  | Scaled conj. gradient
Structure learning | Inductive logic progr.  | ILP + PL (etc.)   | Clustering + pathfinding
11
Weighted Satisfiability
SAT: Find a truth assignment that makes all formulas (clauses) true.
Huge amount of research on this problem; state of the art handles millions of variables/clauses in minutes.
MaxSAT: Make as many clauses true as possible.
Weighted MaxSAT: Clauses have weights; maximize the total weight of satisfied clauses.
MAP inference in MLNs is just weighted MaxSAT.
Best current solver: MaxWalkSAT.
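A minimal local-search sketch in the spirit of MaxWalkSAT (random walks mixed with greedy flips over weighted clauses); the clause encoding, parameters, and restart-free loop are assumptions for illustration rather than the actual solver.

```python
import random

# Clause = (weight, [(var, is_positive), ...]); assignment = dict var -> bool.
# A minimal MaxWalkSAT-style local search, not the Alchemy implementation.

def satisfied(clause, a):
    return any(a[v] == pos for v, pos in clause[1])

def total_weight(clauses, a):
    return sum(w for w, lits in clauses if satisfied((w, lits), a))

def maxwalksat(clauses, variables, max_flips=10000, p_random=0.5, seed=0):
    rng = random.Random(seed)
    a = {v: rng.random() < 0.5 for v in variables}
    best, best_w = dict(a), total_weight(clauses, a)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c, a)]
        if not unsat:
            return a                       # all clauses satisfied: MAP found
        _, lits = rng.choice(unsat)        # pick a random unsatisfied clause
        if rng.random() < p_random:
            v = rng.choice(lits)[0]        # random-walk step
        else:                              # greedy step: best flip within the clause
            def gain(var):
                a[var] = not a[var]
                w = total_weight(clauses, a)
                a[var] = not a[var]
                return w
            v = max((var for var, _ in lits), key=gain)
        a[v] = not a[v]
        w = total_weight(clauses, a)
        if w > best_w:
            best, best_w = dict(a), w
    return best
```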
12
MC-SAT
Deterministic dependences break MCMC; in practice, even strong probabilistic ones do.
Swendsen-Wang: Introduce auxiliary variables u to represent constraints among x; alternately sample u | x and x | u.
But Swendsen-Wang only works for Ising models.
MC-SAT: Generalize S-W to arbitrary clauses.
Uses a SAT solver to sample x | u.
Orders of magnitude faster than Gibbs sampling, etc.
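A toy sketch of the MC-SAT loop, assuming positively weighted clauses and a problem small enough to enumerate: each clause satisfied by the current state is kept as a hard constraint with probability 1 - exp(-w), and the next state is drawn uniformly from the assignments satisfying the kept clauses. Real implementations use SampleSAT for that uniform-sampling step; brute-force enumeration here is only for illustration.

```python
import itertools
import math
import random

# Clause = (weight, [(var, is_positive), ...]).

def satisfied(clause, a):
    return any(a[v] == pos for v, pos in clause[1])

def uniform_satisfying(hard_clauses, variables, rng):
    """Uniform sample over assignments satisfying hard_clauses (brute force)."""
    worlds = []
    for values in itertools.product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(satisfied(c, a) for c in hard_clauses):
            worlds.append(a)
    return rng.choice(worlds)

def mc_sat(clauses, variables, n_samples=1000, seed=0):
    rng = random.Random(seed)
    x = uniform_satisfying([], variables, rng)     # arbitrary initial state
    samples = []
    for _ in range(n_samples):
        # u: keep each currently satisfied clause with prob 1 - exp(-w)
        kept = [c for c in clauses
                if satisfied(c, x) and rng.random() < 1 - math.exp(-c[0])]
        # x | u: uniform over assignments satisfying the kept clauses
        x = uniform_satisfying(kept, variables, rng)
        samples.append(dict(x))
    return samples
```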
13
Lifted Inference
Consider belief propagation (BP).
Often in large problems, many nodes are interchangeable: they send and receive the same messages throughout BP.
Basic idea: Group them into supernodes, forming a lifted network (a sketch of the grouping step follows).
Smaller network → Faster inference.
Akin to resolution in first-order logic.
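A minimal sketch of the supernode construction via color passing: nodes (and features) that would exchange identical BP messages receive the same color, and each color class becomes a supernode or superfeature. The factor encoding and evidence handling here are assumptions for illustration, not Alchemy's data structures.

```python
from collections import Counter

def canon(colors):
    """Relabel colors as small integers so partitions can be compared."""
    mapping = {}
    return {k: mapping.setdefault(c, len(mapping)) for k, c in colors.items()}

def lift(variables, factors, evidence, max_rounds=20):
    # Factor = (potential_id, tuple_of_variables).
    # Initial colors: evidence value for variables, potential id for factors.
    var_color = canon({v: evidence.get(v, None) for v in variables})
    fac_color = canon({f: f[0] for f in factors})
    for _ in range(max_rounds):
        # A variable's new color: its old color plus the multiset of
        # (factor color, argument position) pairs it participates in.
        new_var = canon({
            v: (var_color[v],
                frozenset(Counter(
                    (fac_color[f], pos)
                    for f in factors
                    for pos, arg in enumerate(f[1]) if arg == v).items()))
            for v in variables})
        # A factor's new color: its potential id plus its argument colors.
        new_fac = canon({f: (f[0], tuple(new_var[a] for a in f[1]))
                         for f in factors})
        if new_var == var_color and new_fac == fac_color:
            break                      # partition stable: lifted network found
        var_color, fac_color = new_var, new_fac
    return var_color, fac_color        # equal colors -> same supernode/superfeature
```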
14
Belief Propagation
[Figure: factor graph with feature nodes (f) and variable nodes (x) exchanging BP messages.]
15
Lifted Belief Propagation
[Figure: the same factor graph with interchangeable features (f) and nodes (x) grouped into superfeatures and supernodes.]
16
Lifted Belief Propagation
[Figure: lifted factor graph; the BP messages between superfeatures (f) and supernodes (x) are raised to powers given by functions of the edge counts.]
17
Weight Learning
Pseudo-likelihood + L-BFGS is fast and robust, but can give poor inference results.
Voted perceptron: Gradient descent + MAP inference.
Problem: Multiple modes.
Not alleviated by contrastive divergence.
Alleviated by MC-SAT.
Start each MC-SAT run at the previous end state.
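A minimal sketch of the voted-perceptron weight update, where the gradient for formula i is the observed count of true groundings minus the count in the MAP state under the current weights; map_inference and count_true_groundings are assumed helper functions (e.g., MaxWalkSAT or MC-SAT for the inference step), not code from the talk.

```python
# Voted perceptron for MLN weight learning (sketch).
# Gradient for formula i: n_i(observed data) - E[n_i], with the expectation
# approximated by the counts in the MAP state under the current weights.

def voted_perceptron(formulas, data_world, map_inference,
                     count_true_groundings, n_iters=100, lr=0.01):
    weights = [0.0] * len(formulas)
    averaged = [0.0] * len(formulas)
    observed = [count_true_groundings(f, data_world) for f in formulas]
    for _ in range(n_iters):
        map_world = map_inference(formulas, weights)     # e.g., MaxWalkSAT
        predicted = [count_true_groundings(f, map_world) for f in formulas]
        for i in range(len(formulas)):
            weights[i] += lr * (observed[i] - predicted[i])
            averaged[i] += weights[i]
    # Averaging ("voting") the per-iteration weights reduces variance.
    return [w / n_iters for w in averaged]
```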
18
Weight Learning (contd.)
Problem: Extreme ill-conditioning.
Solvable by quasi-Newton, conjugate gradient, etc., but line searches require exact inference.
Stochastic gradient is not applicable because the data are not i.i.d.
Solution: Scaled conjugate gradient.
Use the Hessian to choose the step size.
Compute the quadratic form inside MC-SAT.
Use the inverse diagonal Hessian as a preconditioner.
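A hedged sketch of one diagonally preconditioned conjugate-gradient step with a Hessian-based step size, in the spirit of the scaled conjugate gradient described above; grad, hvp, and diag_h are assumed callables (in MLN learning these quantities would be estimated with MC-SAT samples), and the damping needed for negative curvature is omitted for brevity.

```python
import numpy as np

def scg_step(w, grad, hvp, diag_h, state=None):
    """One preconditioned CG step; hvp(w, d) returns the Hessian-vector product H d."""
    g = grad(w)
    z = g / np.maximum(diag_h(w), 1e-8)          # inverse diagonal Hessian preconditioner
    if state is None:
        d = -z
    else:
        d_prev, g_prev, z_prev = state
        beta = max(0.0, z @ (g - g_prev) / (z_prev @ g_prev))  # preconditioned PR+
        d = -z + beta * d_prev
    hd = hvp(w, d)                               # quadratic form d^T H d via H*d
    alpha = -(g @ d) / max(d @ hd, 1e-8)         # Hessian-based step size (no damping)
    return w + alpha * d, (d, g, z)
```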
19
Structure Learning
Standard inductive logic programming optimizes the wrong thing, but can be used to overgenerate clauses for L1 pruning.
Our approach: ILP + pseudo-likelihood + structure priors.
For each candidate structure change: start from the current weights and relax convergence; use subsampling to compute sufficient statistics.
Search methods: Beam search, shortest-first search, etc.
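A beam-search skeleton of the kind of structure search mentioned above; refine (clause revision) and score (pseudo-likelihood plus a structure prior) are hypothetical placeholders, and real systems interleave weight relearning and subsampling inside score.

```python
# Beam search over candidate MLN structures (sketch).
# refine(mln) yields candidate structures (e.g., current clauses plus one
# new or extended clause); score(mln) is assumed to return a real number.

def beam_search(initial_mln, refine, score, beam_width=5, max_steps=20):
    beam = [(score(initial_mln), initial_mln)]
    best_score, best = beam[0]
    for _ in range(max_steps):
        candidates = []
        for _, mln in beam:
            for cand in refine(mln):
                candidates.append((score(cand), cand))
        if not candidates:
            break
        candidates.sort(key=lambda sc: sc[0], reverse=True)
        beam = candidates[:beam_width]           # keep the top-scoring structures
        if beam[0][0] <= best_score:
            break                                # no candidate improves the score
        best_score, best = beam[0]
    return best
```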
20
Applications to Date
Natural language processing
Information extraction
Entity resolution
Link prediction
Collective classification
Social network analysis
Robot mapping
Activity recognition
Scene analysis
Computational biology
Probabilistic Cyc
Personal assistants
Etc.
21
Unsupervised Semantic Parsing
Goal: Microsoft buys Powerset. → BUYS(MICROSOFT,POWERSET)
Challenge: The same fact can be expressed in many ways:
  Microsoft buys Powerset
  Microsoft acquires semantic search engine Powerset
  Powerset is acquired by Microsoft Corporation
  The Redmond software giant buys Powerset
  Microsoft’s purchase of Powerset, …
USP: Recursively cluster expressions composed of similar subexpressions.
Evaluation: Extract knowledge from biomedical abstracts and answer questions.
Substantially outperforms the state of the art: three times as many correct answers, with 88% accuracy.
22
Research Directions
Compact representations:
  Deep architectures
  Boolean decision diagrams
  Arithmetic circuits
Unified inference procedure
Learning MLNs with many latent variables
Tighter integration of learning and inference
End-to-end NLP system
Complete agent
23
Resources
Open-source software / web site: Alchemy (alchemy.cs.washington.edu)
  Learning and inference algorithms
  Tutorials, manuals, etc.
  MLNs, datasets, etc.
  Publications
Book: Domingos & Lowd, Markov Logic, Morgan & Claypool, 2009.