URDF: Query-Time Reasoning in Uncertain RDF Knowledge Bases. Ndapandula Nakashole, Mauro Sozio, Fabian Suchanek, Martin Theobald.



Information Extraction (YAGO/DBpedia et al.)

Existing knowledge-base facts:
  bornOn(Jeff, 09/22/42), gradFrom(Jeff, Columbia), hasAdvisor(Jeff, Arthur), hasAdvisor(Surajit, Jeff), knownFor(Jeff, Theory)

New fact candidates with confidences:
  type(Jeff, Author) [0.9], author(Jeff, Drag_Book) [0.8], author(Jeff, Cind_Book) [0.6], worksAt(Jeff, Bell_Labs) [0.7], type(Jeff, CEO) [0.4]

YAGO2 already contains more than 120 M facts (mostly from Wikipedia infoboxes); hundreds of millions of additional facts can be extracted from Wikipedia text.

Outline
- Motivation & Problem Setting
  - URDF running example: people graduating from universities
- Efficient MAP Inference
  - MaxSAT solving with soft & hard constraints
- Grounding
  - Deductive grounding of soft rules (SLD resolution)
  - Iterative grounding of hard rules (closure)
- MaxSAT Algorithm
  - MaxSAT algorithm in 3 steps
- Experiments & Future Work

URDF: Uncertain RDF Data Model
- Extensional Layer (information extraction & integration)
  - High-confidence facts: existing knowledge base ("ground truth")
  - New fact candidates: extracted facts with confidence values
  - Integration of different knowledge sources: ontology merging or explicit Linked Data (owl:sameAs, owl:equivalentProperty)
  - Result: a large "uncertain database" of RDF facts (see the sketch below)
- Intensional Layer (query-time inference)
  - Soft rules: deductive grounding & lineage (Datalog/SLD resolution)
  - Hard rules: consistency constraints (more general FOL rules)
  - Propositional & probabilistic consistency reasoning
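Concretely, the extensional layer is just a set of triples tagged with confidences. A minimal illustrative encoding (the Fact type is an assumption for this sketch, not URDF's actual API; example facts taken from the slides):

from typing import NamedTuple

class Fact(NamedTuple):
    """An uncertain RDF fact: (predicate, subject, object) plus a confidence;
    high-confidence KB facts ("ground truth") carry confidence 1.0."""
    pred: str
    subj: str
    obj: str
    conf: float

base_facts = [
    Fact("worksAt", "Jeff", "Stanford", 0.9),      # extracted, uncertain
    Fact("hasAdvisor", "Surajit", "Jeff", 0.8),
    Fact("type", "Stanford", "University", 1.0),   # existing KB fact
]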

Soft Rules vs. Hard Rules
(Soft) deduction rules vs. (hard) consistency constraints

- People may live in more than one place (soft deduction rules):
  livesIn(x,y) ← marriedTo(x,z) ∧ livesIn(z,y)   [0.8]
  livesIn(x,y) ← hasChild(x,z) ∧ livesIn(z,y)    [0.5]
- People are not born in different places/on different dates (hard constraint):
  bornIn(x,y) ∧ bornIn(x,z) → y = z
- People are not married to more than one person (at the same time, in most countries):
  marriedTo(x,y,t1) ∧ marriedTo(x,z,t2) ∧ y ≠ z → disjoint(t1,t2)

Soft rules correspond to rule-based (deductive) reasoning: Datalog, RDF/S, OWL2-RL, etc. Hard rules correspond to FOL constraints (in particular mutual exclusion): Datalog with constraints, X-tuples in probabilistic databases, owl:FunctionalProperty, etc. (A minimal encoding sketch follows below.)
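A hedged sketch of how the two rule kinds might be written down in code (the class names and fields are illustrative assumptions, not URDF's interface; atoms are (predicate, arg, ...) tuples with short lowercase strings as variables):

from dataclasses import dataclass
from typing import List, Tuple

Atom = Tuple[str, ...]   # e.g. ("livesIn", "x", "y")

@dataclass
class SoftRule:
    """Weighted deduction rule: head <- conjunction of body atoms."""
    head: Atom
    body: List[Atom]
    weight: float

@dataclass
class HardRule:
    """Mutex-style consistency constraint: for a fixed subject, at most one
    object may hold for `pred` (a functional-property constraint)."""
    pred: str

soft = [
    SoftRule(("livesIn", "x", "y"), [("marriedTo", "x", "z"), ("livesIn", "z", "y")], 0.8),
    SoftRule(("livesIn", "x", "y"), [("hasChild", "x", "z"), ("livesIn", "z", "y")], 0.5),
]
hard = [HardRule("bornIn")]   # bornIn(x,y) AND bornIn(x,z) -> y = z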

URDF Running Example

KB: RDF base facts (with confidences):
  type(Jeff, Computer_Scientist) [1.0], type(Surajit, Computer_Scientist) [1.0], type(David, Computer_Scientist) [1.0]
  worksAt(Jeff, Stanford_University) [0.9]
  graduatedFrom(Surajit, Princeton) [0.7], graduatedFrom(Surajit, Stanford) [0.6], graduatedFrom(David, Princeton) [0.9]
  hasAdvisor(Surajit, Jeff) [0.8], hasAdvisor(David, Jeff) [0.7]

First-order rules:
  hasAdvisor(x,y) ∧ worksAt(y,z) → graduatedFrom(x,z)   [0.4]
  graduatedFrom(x,y) ∧ graduatedFrom(x,z) → y = z

Derived facts:
  gradFrom(Surajit, Stanford), gradFrom(David, Stanford)

(The query edge graduatedFrom(David, Stanford) [?] is the one to be inferred.)

Basic Types of Inference
- Maximum-a-posteriori (MAP) inference
  - Find the most likely assignment to the query variables y under given evidence x.
  - Compute: arg max_y P(y | x)  (NP-hard for propositional formulas, e.g., MaxSAT over CNFs)
- Marginal/success probabilities
  - Probability that the query y is true in a random world under given evidence x.
  - Compute: ∑_y P(y | x)  (#P-hard for propositional formulas)
(A toy example contrasting the two tasks follows below.)
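To make the two inference tasks concrete, here is a brute-force sketch over a toy two-variable distribution (the distribution itself is made up for illustration; real KBs are far too large to enumerate like this):

# Toy joint distribution P(y1, y2 | x) over two Boolean query variables,
# e.g. y1 = gradFrom(Surajit, Stanford), y2 = gradFrom(Surajit, Princeton).
# The probability mass values are illustrative only.
joint = {
    (True, False): 0.5,
    (False, True): 0.3,
    (False, False): 0.15,
    (True, True): 0.05,  # world violating the mutex constraint: low mass
}

# MAP inference: the single most likely joint assignment (NP-hard in general).
map_world = max(joint, key=joint.get)
print("MAP world:", map_world)          # (True, False)

# Marginal inference: P(y1 = true) sums over all worlds (#P-hard in general).
p_y1 = sum(p for world, p in joint.items() if world[0])
print("P(y1) =", p_y1)                  # 0.55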

General Route: Grounding & MaxSAT Solving

Query: graduatedFrom(x, y)

1) Grounding: consider only the facts (and rules) that are relevant for answering the query.
2) Build a propositional formula in CNF, consisting of the grounded hard & soft rules and the uncertain base facts:

  (¬graduatedFrom(Surajit, Stanford) ∨ ¬graduatedFrom(Surajit, Princeton))
  ∧ (¬graduatedFrom(David, Stanford) ∨ ¬graduatedFrom(David, Princeton))
  ∧ (¬hasAdvisor(Surajit, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ graduatedFrom(Surajit, Stanford))
  ∧ (¬hasAdvisor(David, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ graduatedFrom(David, Stanford))
  ∧ worksAt(Jeff, Stanford) ∧ hasAdvisor(Surajit, Jeff) ∧ hasAdvisor(David, Jeff)
  ∧ graduatedFrom(Surajit, Princeton) ∧ graduatedFrom(Surajit, Stanford) ∧ graduatedFrom(David, Princeton)

3) Propositional reasoning: find a truth assignment to the facts such that the total weight of the satisfied clauses is maximized. MAP inference then computes the "most likely" possible world.

Why are high weights for hard rules not enough?

Consider the following weighted CNF (for A, B > 0, A >> B):

  (¬graduatedFrom(Surajit, Stanford) ∨ ¬graduatedFrom(Surajit, Princeton))   weight A
  ∧ graduatedFrom(Surajit, Princeton)                                        weight 0
  ∧ graduatedFrom(Surajit, Stanford)                                         weight B

- The optimal solution has weight A + B (set Stanford to true and Princeton to false).
- The next-best solution has weight A + 0 (set Princeton to true and Stanford to false).
- Hence the ratio of the optimal over the approximate solution is (A + B) / A.
- In general, any (1 + ε)-approximation algorithm, with ε > 0, may set graduatedFrom(Surajit, Princeton) to true, since (A + B) / A → 1 as A → ∞.
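A quick numeric instance of the argument (the concrete numbers are illustrative only): with A = 100 and B = 1, the optimal assignment scores A + B = 101, while the next-best scores A = 100, so

  (A + B) / A = 1 + B / A = 1 + 1/100 = 1.01,

and any approximation algorithm whose guarantee is looser than a factor of 1.01 is free to return the next-best assignment, i.e., to graduate Surajit from Princeton. No finite weight A can prevent this.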

URDF: MaxSAT Solving with Soft & Hard Rules

Find: arg max_y P(y | x). This resolves to a variant of MaxSAT for propositional formulas. Special case: Horn clauses as soft rules & mutex constraints as hard rules.

C: weighted Horn clauses (CNF):
  (¬hasAdvisor(Surajit, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ graduatedFrom(Surajit, Stanford))
  ∧ (¬hasAdvisor(David, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ graduatedFrom(David, Stanford))
  ∧ worksAt(Jeff, Stanford) ∧ hasAdvisor(Surajit, Jeff) ∧ hasAdvisor(David, Jeff)
  ∧ graduatedFrom(Surajit, Princeton) ∧ graduatedFrom(Surajit, Stanford) ∧ graduatedFrom(David, Princeton)

S: mutex constraints:
  { graduatedFrom(Surajit, Stanford), graduatedFrom(Surajit, Princeton) }
  { graduatedFrom(David, Stanford), graduatedFrom(David, Princeton) }

MaxSAT algorithm:
  Compute W_0 = ∑_{clauses C} w(C) · P(C is satisfied);
  For each hard constraint S_t {
    For each fact f in S_t {
      Compute W_f+^t = ∑_{clauses C} w(C) · P(C is sat. | f = true);
    }
    Compute W_S-^t = ∑_{clauses C} w(C) · P(C is sat. | S_t = false);
    Choose the truth assignment to the facts in S_t that maximizes W_f+^t, W_S-^t;
    Remove satisfied clauses C;
    t++;
  }

Runtime: O(|S|·|C|); approximation guarantee of 1/2. (A simplified runnable sketch follows below.)
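A hedged, runnable sketch of this greedy scheme, under simplifying assumptions: clauses are lists of signed literals, facts are treated as independent with their confidence as truth probability, and each trial conditions on the whole mutex partition at once rather than on a single fact. All helper names and the abbreviated fact identifiers are mine, not URDF's:

def p_clause_sat(lits, prob):
    """P(clause satisfied) = 1 - prod_i P(literal_i false), assuming the
    facts are independent with prob[f] as their truth probability."""
    p_all_false = 1.0
    for fact, positive in lits:
        p_true = prob[fact] if positive else 1.0 - prob[fact]
        p_all_false *= 1.0 - p_true
    return 1.0 - p_all_false

def potential(clauses, prob):
    """W = sum over clauses of w(C) * P(C is satisfied)."""
    return sum(w * p_clause_sat(lits, prob) for w, lits in clauses)

def greedy_maxsat(clauses, mutexes, prob):
    """Per mutex set S_t, fix the single fact (or none) whose assignment
    maximizes the potential W_t, then drop clauses that are certainly
    satisfied. Facts outside every mutex set keep their priors here."""
    prob, assignment = dict(prob), {}
    for mutex in mutexes:
        trials = []
        for f in mutex + [None]:             # None = W_S-: all of S_t false
            trial = dict(prob)
            for g in mutex:
                trial[g] = 1.0 if g == f else 0.0
            trials.append((potential(clauses, trial), trial, f))
        _, prob, best = max(trials, key=lambda t: t[0])
        assignment.update({g: g == best for g in mutex})
        clauses = [(w, lits) for w, lits in clauses
                   if p_clause_sat(lits, prob) < 1.0]   # remove satisfied
    return assignment

# Running example (fact names abbreviated; gradDS has no base confidence,
# it is only derivable, so it starts at probability 0):
clauses = [
    (0.4, [("advS", False), ("workJ", False), ("gradSS", True)]),
    (0.4, [("advD", False), ("workJ", False), ("gradDS", True)]),
    (0.9, [("workJ", True)]), (0.8, [("advS", True)]), (0.7, [("advD", True)]),
    (0.6, [("gradSP", True)]), (0.7, [("gradSS", True)]), (0.9, [("gradDP", True)]),
]
mutexes = [["gradSS", "gradSP"], ["gradDS", "gradDP"]]
prob = {"advS": 0.8, "advD": 0.7, "workJ": 0.9,
        "gradSP": 0.6, "gradSS": 0.7, "gradDP": 0.9, "gradDS": 0.0}
print(greedy_maxsat(clauses, mutexes, prob))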

Deductive Grounding Algorithm (SLD Resolution/Datalog)

Query: graduatedFrom(Surajit, y)

First-order rules:
  hasAdvisor(x,y) ∧ worksAt(y,z) → graduatedFrom(x,z)   [0.4]
  graduatedFrom(x,y) ∧ graduatedFrom(x,z) → y = z

Base facts:
  graduatedFrom(Surajit, Princeton) [0.7], graduatedFrom(Surajit, Stanford) [0.6], graduatedFrom(David, Princeton) [0.9]
  hasAdvisor(Surajit, Jeff) [0.8], hasAdvisor(David, Jeff) [0.7], worksAt(Jeff, Stanford) [0.9]
  type(Princeton, University) [1.0], type(Stanford, University) [1.0]
  type(Jeff, Computer_Scientist) [1.0], type(Surajit, Computer_Scientist) [1.0], type(David, Computer_Scientist) [1.0]

SLD resolution expands the query into an AND/OR tree over the matching base facts (graduatedFrom(Surajit, Princeton), graduatedFrom(Surajit, Stanford)) and rule bodies (hasAdvisor(Surajit, Jeff) ∧ worksAt(Jeff, Stanford)), yielding the grounded rules:
  hasAdvisor(Surajit, Jeff) ∧ worksAt(Jeff, Stanford) → gradFrom(Surajit, Stanford)
  ¬gradFrom(Surajit, Stanford) ∨ ¬gradFrom(Surajit, Princeton)

Dependency Graph of a Query
- SLD grounding always starts from a query literal and first proceeds along the soft deduction rules.
- Grounding is then also iterated over the hard rules in a top-down fashion, using the literals in each hard rule as new subqueries.
- Cycles (due to recursive rules) are detected and resolved via a form of tabling known from Datalog.
- Grounding terminates when a closure is reached, i.e., when no new facts can be grounded from the rules and all subgoals are either resolved or form the root of a cycle. (A sketch of this procedure follows below.)
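A hedged sketch of such top-down grounding with tabling, assuming range-restricted Horn rules, atoms as (predicate, arg, ...) tuples, and short lowercase strings as variables. It is illustrative pseudo-SLD, not URDF's implementation, and it assumes query variables are named apart from rule variables (real SLD resolution standardizes apart):

from collections import namedtuple

SoftRule = namedtuple("SoftRule", "head body weight")  # head <- AND(body)

def is_var(t):
    # crude convention for this sketch: short lowercase strings are variables
    return isinstance(t, str) and len(t) <= 2 and t.islower()

def walk(t, theta):
    while t in theta:      # follow substitution chains like z -> w -> Stanford
        t = theta[t]
    return t

def subst(atom, theta):
    return tuple(walk(t, theta) for t in atom)

def unify(pattern, atom, theta):
    """Extend substitution theta so that pattern matches atom, or return None."""
    if len(pattern) != len(atom):
        return None
    theta = dict(theta)
    for p, a in zip(pattern, atom):
        p, a = walk(p, theta), walk(a, theta)
        if is_var(p):
            if p != a:
                theta[p] = a
        elif p != a:
            return None
    return theta

def ground(goal, facts, rules, table=None):
    """Collect the base facts and grounded rule instances relevant to `goal`.
    `table` memoizes expanded subgoals (tabling), which breaks cycles from
    recursive rules: a repeated subgoal simply contributes nothing new."""
    table = set() if table is None else table
    if goal in table:
        return set(), []
    table.add(goal)
    rel_facts = {f for f in facts if unify(goal, f, {}) is not None}
    grounded = []
    for rule in rules:
        theta0 = unify(rule.head, goal, {})
        if theta0 is None:
            continue
        thetas = [theta0]                      # naive join of body against facts
        for b in rule.body:
            thetas = [t2 for t in thetas for f in facts
                      if (t2 := unify(subst(b, t), f, t)) is not None]
        for t in thetas:
            body = [subst(b, t) for b in rule.body]
            grounded.append((subst(rule.head, t), body, rule.weight))
            for b in body:                     # body literals become subqueries
                more_f, more_r = ground(b, facts, rules, table)
                rel_facts |= more_f
                grounded += more_r
    return rel_facts, grounded

# Grounding graduatedFrom(Surajit, w) over the running example:
facts = {("graduatedFrom", "Surajit", "Princeton"),
         ("graduatedFrom", "Surajit", "Stanford"),
         ("hasAdvisor", "Surajit", "Jeff"),
         ("worksAt", "Jeff", "Stanford")}
rules = [SoftRule(("graduatedFrom", "x", "z"),
                  [("hasAdvisor", "x", "y"), ("worksAt", "y", "z")], 0.4)]
print(ground(("graduatedFrom", "Surajit", "w"), facts, rules))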

Weighted MaxSAT Algorithm

General idea: compute a potential function W_t that iterates over all hard rules S_t, and set the fact f ∈ S_t that maximizes W_t (or none of them) to true; set all other facts in S_t to false.

- At iteration 0, we have W_0 = ∑_{clauses C} w(C) · P(C is satisfied).
- At any intermediate iteration t, we compare W_f+^t = ∑_{clauses C} w(C) · P(C is sat. | f = true) for each f ∈ S_t against W_S-^t = ∑_{clauses C} w(C) · P(C is sat. | S_t = false).
- At the final iteration t_max, all facts are assigned either true or false, and W_t_max equals the total weight of all satisfied clauses.

Step 1: attach weights w(f_i) and probabilities p_i to the clauses.

C: weighted Horn clauses (CNF):
  (¬hasAdvisor(Surajit, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ gradFrom(Surajit, Stanford))   [0.4]
  ∧ (¬hasAdvisor(David, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ gradFrom(David, Stanford))     [0.4]
  ∧ worksAt(Jeff, Stanford) [0.9] ∧ hasAdvisor(Surajit, Jeff) [0.8] ∧ hasAdvisor(David, Jeff) [0.7]
  ∧ gradFrom(Surajit, Princeton) [0.6] ∧ gradFrom(Surajit, Stanford) [0.7] ∧ gradFrom(David, Princeton) [0.9]

S: mutex constraints:
  { gradFrom(Surajit, Stanford), gradFrom(Surajit, Princeton) }
  { gradFrom(David, Stanford), gradFrom(David, Princeton) }

Step 2: compute W_0, the expected total weight of the satisfied clauses, from the weights w(f_i) and probabilities p_i. Within a single mutex partition, a negated literal contributes 1 − p_i and a positive literal contributes p_i. For example, for
  C_1 = ¬hasAdvisor(Surajit, Jeff) ∨ ¬worksAt(Jeff, Stanford) ∨ gradFrom(Surajit, Stanford):

  P(C_1 is satisfied) = 1 − (1 − (1 − 1)) · (1 − (1 − 1)) · (1 − 1) = 1
  P(C_2 is satisfied) = 1 − (1 − (1 − 1)) · (1 − (1 − 1)) · (1 − 0) = 0
  ...
  W_0 = 5.0

Step 3: iterate over the mutex constraints S and compare the resulting potentials, e.g. W_1 = 4.8 and W_2 = 4.4 for the alternative assignments to S_1 = { gradFrom(Surajit, Stanford), gradFrom(Surajit, Princeton) }; the assignment with the larger potential is kept, the satisfied clauses are removed, and the algorithm moves on to the next constraint.

Experiments – Setup
- YAGO knowledge base: 2 million entities, 20 million facts
- Soft rules: 16 hand-crafted deduction rules with weights
- Hard rules: 5 predicates with functional properties (bornIn, diedIn, bornOnDate, diedOnDate, marriedTo)
- Queries: 10 conjunctive SPARQL queries
- Competitor: Markov Logic (based on MCMC)
  - MAP inference: Alchemy employs a form of MaxWalkSAT
  - MC-SAT: iterative MaxSAT & Gibbs sampling

YAGO Knowledge Base: URDF vs. Markov Logic
- URDF: SLD grounding & MaxSAT solving (|C| = # ground literals in soft rules; |S| = # ground literals in hard rules)
- URDF vs. Markov Logic (MAP inference & MC-SAT): first run, i.e., each query is grounded against the rules (SLD grounding + MaxSAT solving) and the sum of the runtimes is reported
- Asymptotic runtime checks: synthetic soft-rule expansions
(Runtime charts not reproduced in this transcript.)

Recursive Rules & LUBM Benchmark
- 42 inductively learned (partly recursive) rules over 20 million facts in YAGO; URDF grounding with different maximum SLD levels
- URDF (SLD grounding + MaxSAT) vs. Jena (grounding only) over the LUBM benchmark:
  - SF-1: 103,397 triples
  - SF-5: 646,128 triples
  - SF-10: 1,316,993 triples

Current & Future Topics...
- Temporal consistency reasoning
  - Soft/hard rules with temporal predicates
  - Soft deduction rules: deduce the confidence distribution of derived facts
- Learning soft rules & consistency constraints
  - Explore how Inductive Logic Programming can be applied to large, uncertain & incomplete knowledge bases
- More solving/sampling
  - Linear-time constrained & weighted MaxSAT solver
  - Improved Gibbs sampling with soft & hard rules
- Scale-out
  - Distributed grounding via message passing
  - Updates/versioning for (linked) RDF data
  - Non-monotonic answers for rules with negation

Online Demo: urdf.mpi-inf.mpg.de