An Introduction to Markov Logic Networks in Knowledge Bases


An Introduction to Markov Logic Networks in Knowledge Bases
Sean Goldberg, sean@cise.ufl.edu
9/30/2014

From Big Data to Big Wisdom Data is the constant deluge of sensory information: websites, status updates, e-mails, etc. Knowledge is a set of descriptions of, and relations between, data: an understanding of how things are connected. Wisdom is being able to identify patterns in knowledge, use them to infer new knowledge, and understand causality in the world. Data -> Facts -> Rules

What is a Knowledge Base? So, with that said, what is a knowledge base? In light of the previous definition, it's…

What is a Knowledge Base? People….

What is a Knowledge Base? Places…

What is a Knowledge Base? Organizations…

What is a Knowledge Base? Things….

What is a Knowledge Base? And the relations between them all. For clarity and brevity, I've left out the names of entities and relations, hoping you can follow along by context. The relation arrows can be of specific types such as "has ally", "is member of", or "has father".

Graph Representation Viewing knowledge this way is known as the graph representation of a knowledge base. How it's stored more practically…

Triple Representation

fatherOf(Darth Vader, Luke Skywalker)
memberOf(Luke Skywalker, Rebel Alliance)
hasWeapon(Luke Skywalker, lightsaber)
isAlliedWith(Luke Skywalker, Han Solo)
isAlliedWith(Luke Skywalker, Chewbacca)
livesOn(Luke Skywalker, Tatooine)
hasAdviser(Luke Skywalker, Yoda)
…

…is with the triple representation. The relation is a binary function that takes two entities and evaluates to true if the relation is shared by those entities in the knowledge base. Referring to the data, one can identify a fact by a relation and its arguments, or use the subject-predicate-object formalism, where the first argument is the subject, the relation is the predicate, and the second argument is the object.
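As a concrete illustration (not from the slides), triples are easy to hold as subject-predicate-object tuples in Python; the predicate names below mirror the examples above:

    # A toy KB as a set of (subject, predicate, object) triples.
    kb = {
        ("Darth Vader", "fatherOf", "Luke Skywalker"),
        ("Luke Skywalker", "memberOf", "Rebel Alliance"),
        ("Luke Skywalker", "hasWeapon", "lightsaber"),
        ("Luke Skywalker", "isAlliedWith", "Han Solo"),
    }

    def holds(subj, pred, obj):
        # True iff the relation is shared by the two entities in the KB.
        return (subj, pred, obj) in kb

    print(holds("Luke Skywalker", "memberOf", "Rebel Alliance"))  # True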

Large Scale KBs

- KnowitAll (Univ. of Washington): 500M facts extracted from 100M webpages (ClueWeb)
- DBPedia (Univ. of Leipzig): 3.5M entities, 700M facts extracted from Wikipedia
- YAGO (Max-Planck Institute): 2M entities, 20M facts extracted from Wikipedia and WordNet
- Freebase (Google): 20M entities, 300M facts from integrated data sources and humans
- NELL (Carnegie Mellon): 850M facts extracted from 500M webpages

A knowledge base is any collection of such facts, but what we're really concerned with is the domain of large-scale KBs. These are created either through automated scraping of data or through large crowdsourcing efforts.

Explicit vs. Implicit Information All of the previous databases are really good at identifying explicit information. But what about common sense implicit information? Consider the following small KB with three entities and two facts.

Does Obama live in the US? If we ran this query, the KB would check for the relation liveIn(Obama, US) and not find it, so it would return a null result, i.e. "no".

Knowledge Expansion But we know this isn't entirely right. Obama living in the White House is subsumed by Obama living in the US: since Obama lives in the White House and the White House is located in the US, one can infer by transitivity of location that Obama must live in the US. Adding knowledge to the KB through inference rules like this is known as "knowledge expansion" and puts us on the road to Wisdom.
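A minimal sketch of this expansion step, assuming the two facts and the location-transitivity rule described above:

    # Toy KB: liveIn(Obama, White House), locatedIn(White House, US).
    live_in = {("Obama", "White House")}
    located_in = {("White House", "US")}

    # Rule: liveIn(X, Y) ^ locatedIn(Y, Z) -> liveIn(X, Z)
    inferred = {(x, z)
                for (x, y1) in live_in
                for (y2, z) in located_in
                if y1 == y2}
    live_in |= inferred

    print(("Obama", "US") in live_in)  # True: the query now succeeds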

Inferring New Facts

Markov Network:
- Undirected graphical model
- Markov property: the probability of each node depends only on its neighbors
- The probability of an assignment of the random variables is defined by potentials over cliques
- HARD to specify all the clique potentials by hand

First-Order Logic:
- Data described by predicates with arguments
- Logical formulas denote inference rules over predicates
- Deterministic, i.e. no probabilities
- EASY to specify logical formulas

Markov Logic = Markov Networks + First-Order Logic: the statistical power of Markov networks plus the specification power of FOL.

Markov Networks (Figure: an undirected graph over the variables Smoking, Cancer, Asthma, Cough. Each clique carries a potential function, e.g. a table Ф(S, C) over Smoking and Cancer; the original figure lists entries 4.5 for False and 2.7 for True.)

Markov Networks Equivalently, as a log-linear model over binary features f_i:

P(x) = (1/Z) exp( Σ_i w_i f_i(x) )

where w_i is the weight of feature i and Z is the normalizing constant.

First Order Logic

Fact A, Fact B
Fact A ^ Fact B -> Fact C

First Order Logic

isA(Luke, Jedi)
memberOf(Luke, Rebel Alliance)

First Order Logic

isAlliedWith(Han, Luke) ^ memberOf(Luke, Rebel Alliance) -> memberOf(Han, Rebel Alliance)

First Order Logic

isAlliedWith(X, Y) ^ memberOf(Y, Z) -> memberOf(X, Z)

Soft Rules vs. Hard Rules

bornIn(X, Y) ^ locatedIn(Y, Z) -> bornIn(X, Z)      HARD
friendOf(X, Y) ^ friendOf(Y, Z) -> friendOf(X, Z)   SOFT

MLN Basic Idea: give every formula a weight (higher weight, stronger constraint)

Markov Logic Networks: Logic + Probability

A Markov Logic Network is a collection of pairs (F, w), each a first-order formula F with a real-valued weight w:

memberOf(X, Z) ^ memberOf(Y, Z) -> alliedWith(X, Y)    8.5
alliedWith(X, Y) -> alliedWith(Y, X)                   100
hasLightsaber(X) -> memberOf(X, Rebels)                0.5

Building a Markov Network

- Nodes = all possible instantiations (groundings) of the predicates
- Edge between two nodes iff they appear together in some grounding of a formula
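A sketch of enumerating the groundings, under the toy assumption that each variable ranges over a small fixed set of constants; the rule is the first weighted formula of the MLN above:

    from itertools import product

    people = ["Han", "Luke"]
    groups = ["Rebels", "Empire"]

    # memberOf(X, Z) ^ memberOf(Y, Z) -> alliedWith(X, Y)   (weight 8.5)
    for x, y in product(people, repeat=2):
        if x == y:
            continue
        for z in groups:
            body = f"memberOf({x}, {z}) ^ memberOf({y}, {z})"
            print(f"{body} -> alliedWith({x}, {y})   [w = 8.5]")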

Constants: Han (H), Luke (L), Rebels (R), Empire (E)

Grounding memberOf(X, Z) ^ memberOf(Y, Z) -> alliedWith(X, Y) (weight 8.5) introduces the ground atoms memberOf(H, E), memberOf(L, E), memberOf(H, R), memberOf(L, R), alliedWith(H, L), and alliedWith(L, H), with an edge between any two atoms that appear together in a grounded formula.

Grounding alliedWith(X, Y) -> alliedWith(Y, X) (weight 100) adds edges between the symmetric atoms alliedWith(H, L) and alliedWith(L, H).

Grounding hasLightsaber(X) -> memberOf(X, Rebels) (weight 0.5) introduces the ground atoms hasLightSaber(L) and hasLightSaber(H) and connects each to the corresponding memberOf(·, R) atom.
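Continuing the earlier sketch, the ground atoms and edges can be collected into an explicit graph structure (plain adjacency sets here, to stay dependency-free):

    from itertools import combinations, product

    people, groups = ["H", "L"], ["R", "E"]
    edges = set()

    # memberOf(X, Z) ^ memberOf(Y, Z) -> alliedWith(X, Y)
    for x, y in product(people, repeat=2):
        if x == y:
            continue
        for z in groups:
            atoms = [f"memberOf({x},{z})", f"memberOf({y},{z})",
                     f"alliedWith({x},{y})"]
            # Atoms that co-occur in a grounded formula are pairwise connected.
            edges |= {frozenset(pair) for pair in combinations(atoms, 2)}

    nodes = {atom for edge in edges for atom in edge}
    print(len(nodes), "ground atoms,", len(edges), "edges")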

Constants: Han (H), Luke (L), Rebels (R), Empire (E)

Formulas:
memberOf(X, Z) ^ memberOf(Y, Z) -> alliedWith(X, Y)    8.5
alliedWith(X, Y) -> alliedWith(Y, X)                   100
hasLightsaber(X) -> memberOf(X, Rebels)                0.5

Knowledge Base (evidence): memberOf(Luke, Rebels), memberOf(Han, Rebels), hasLightSaber(Luke), ~memberOf(Luke, Empire)

Ground atoms: memberOf(H, E), memberOf(L, E), memberOf(L, R), memberOf(H, R), alliedWith(H, L), alliedWith(L, H), hasLightSaber(L), hasLightSaber(H)

Probability of a Possible World

P(X = x) = (1/Z) exp( Σ_i w_i n_i(x) )

where w_i is the weight of formula i, n_i(x) is the number of true groundings of formula i in the world x, and Z is the normalizing partition function.
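A small sketch of scoring a world with this formula, assuming the grounding counts n_i(x) have already been computed for each weighted formula (the counts below are hypothetical):

    import math

    weights = [8.5, 100.0, 0.5]   # w_i for the three formulas above
    counts  = [4, 2, 1]           # hypothetical n_i(x) for some world x

    # Unnormalized probability: exp of the weighted sum of true groundings.
    score = math.exp(sum(w * n for w, n in zip(weights, counts)))
    # Z would require summing this score over all 2^N possible worlds.
    print(score)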


Inference in MLNs

MAP inference: find the most likely possible world given some evidence. Solved with MaxWalkSAT.
Marginal inference: find the marginal distribution of a predicate (node) given all the others.
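In symbols (a standard restatement, not taken from the slides), MAP inference reduces to maximizing the weighted count of satisfied groundings, since Z and exp are monotone:

    \hat{x} = \arg\max_x \tfrac{1}{Z} \exp\big( \textstyle\sum_i w_i\, n_i(x) \big) = \arg\max_x \textstyle\sum_i w_i\, n_i(x)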

MaxWalkSAT

for i ← 1 to max-tries do
    solution = random truth assignment
    for j ← 1 to max-flips do
        if ∑ weights(sat. clauses) > threshold then
            return solution
        c ← random unsatisfied clause
        with probability p
            flip a random variable in c
        else
            flip variable in c that maximizes ∑ weights(sat. clauses)
return failure, best solution found
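A minimal runnable Python sketch of this procedure, assuming clauses are given as (weight, literals) pairs, where a literal (variable, sign) is satisfied iff the assignment maps the variable to that sign. The pseudocode's satisfied-weight threshold is expressed here as its complement, a target on the total weight of unsatisfied clauses:

    import random

    def maxwalksat(clauses, variables, max_tries=10, max_flips=1000,
                   target=0.0, p=0.5, seed=0):
        rng = random.Random(seed)

        def cost(a):  # total weight of UNsatisfied clauses (minimized)
            return sum(w for w, lits in clauses
                       if not any(a[v] == s for v, s in lits))

        best, best_cost = None, float("inf")
        for _ in range(max_tries):
            a = {v: rng.random() < 0.5 for v in variables}
            for _ in range(max_flips):
                c = cost(a)
                if c < best_cost:
                    best, best_cost = dict(a), c
                if c <= target:
                    return a              # threshold reached
                unsat = [lits for w, lits in clauses
                         if not any(a[v] == s for v, s in lits)]
                lits = rng.choice(unsat)  # random unsatisfied clause
                if rng.random() < p:      # random walk step
                    v = rng.choice(lits)[0]
                else:                     # greedy step: best single flip
                    def after_flip(var):
                        a[var] = not a[var]
                        c2 = cost(a)
                        a[var] = not a[var]
                        return c2
                    v = min((var for var, _ in lits), key=after_flip)
                a[v] = not a[v]
        return best                       # failure: best solution found

    # Hypothetical toy instance: (x1 v ~x2) with weight 2, (x2) with weight 1.
    clauses = [(2.0, [("x1", True), ("x2", False)]), (1.0, [("x2", True)])]
    print(maxwalksat(clauses, ["x1", "x2"]))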


Fill in Observed Data: fix the evidence atoms to their observed values: memberOf(L, R) = 1, memberOf(H, R) = 1, hasLightSaber(L) = 1, and memberOf(L, E) = 0.

Find Missing Data: the remaining atoms are unknown (marked "?"): memberOf(H, E), alliedWith(H, L), alliedWith(L, H), and hasLightSaber(H).

Initialization Assignment: start MaxWalkSAT from a random truth assignment for the unknown atoms; the evidence stays fixed.

Flip One At Random: repeatedly pick an unsatisfied clause and flip one of its variables (at random with probability p, greedily otherwise), keeping the assignment that maximizes the total weight of satisfied clauses.

Learning in MLNs

Data is a relational database or knowledge base, e.g.:
memberOf(Luke, Rebels), hasLightSaber(Luke), memberOf(Han, Rebels), ~memberOf(Luke, Empire)

- Learning parameters (weights): similar to learning weights for Markov networks
- Learning structure (formulas): inductive logic programming

Learning Weights via Max-Likelihood

∂/∂w_i log P_w(x) = n_i(x) − E_w[n_i(x)]

where n_i(x) is the number of times clause i is true in the data and E_w[n_i(x)] is the expected number of times clause i is true according to the current MLN.
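A sketch of one gradient-ascent step on the weights under this formula, assuming the empirical counts come from the KB and the expected counts are estimated by some external sampler (both supplied as plain lists here; the numbers are hypothetical):

    # n_data[i]:  times clause i is true in the KB.
    # n_model[i]: expected times clause i is true under the current MLN,
    #             e.g. averaged over worlds drawn by an MCMC sampler.
    def update_weights(weights, n_data, n_model, lr=0.01):
        return [w + lr * (nd - nm)
                for w, nd, nm in zip(weights, n_data, n_model)]

    weights = [8.5, 100.0, 0.5]
    weights = update_weights(weights, n_data=[4, 2, 1], n_model=[3.2, 2.0, 1.4])
    print(weights)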

Alchemy (Open Source Software)

- Full first-order logic syntax
- Inference (MAP and marginal probabilities)
- Lifted inference
- Weight learning (generative and discriminative)
- Structure learning
- Large set of tutorials: information extraction, social network modeling, entity resolution

alchemy.cs.washington.edu
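For flavor, here is the running example written in the weighted-formula style Alchemy accepts: weight first, then the formula, with lowercase variables and declared predicates. This is a sketch from memory; consult the Alchemy tutorials for the exact declaration syntax:

    // Predicate declarations
    memberOf(person, group)
    alliedWith(person, person)
    hasLightsaber(person)

    // Weighted formulas (weight precedes the formula)
    8.5  memberOf(x, z) ^ memberOf(y, z) => alliedWith(x, y)
    100  alliedWith(x, y) => alliedWith(y, x)
    0.5  hasLightsaber(x) => memberOf(x, Rebels)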

Current Problem: Scale to Large KBs!

- KnowitAll (Univ. of Washington): 500M facts extracted from 100M webpages (ClueWeb)
- DBPedia (Univ. of Leipzig): 3.5M entities, 700M facts extracted from Wikipedia
- YAGO (Max-Planck Institute): 2M entities, 20M facts extracted from Wikipedia and WordNet
- Freebase (Google): 20M entities, 300M facts from integrated data sources and humans
- NELL (Carnegie Mellon): 850M facts extracted from 500M webpages