Lecture 16: Probabilistic Databases
Slides by Gerome Miklau
Based on a tutorial by Dan Suciu
Today’s Agenda
1. Motivation
2. Probabilistic Data Semantics
3. Representation Systems
4. Complexity
1. Motivation
Motivating Applications
  Text extraction & record linkage
  Inconsistent data
  Ranking query answers
Text extraction
Record Linkage
Inconsistent Data
  Goal: consistent query answers from inconsistent databases
  Applications:
    Integration of autonomous data sources
    Un-enforced integrity constraints
    Temporary inconsistencies
Repair semantics
Alternative probabilistic approach
Ranking query answers
  Database is deterministic
  Query answers are uncertain:
    Query terms loosened due to user’s lack of understanding of the data or schema
    The query returns a ranked list of tuples; user interested in top-k
Summary: motivating applications
2. Probabilistic Data Semantics
Possible worlds semantics
The definition
Example
Tuples as Events
Tuple correlation
Example
Query semantics
Example: Query Semantics
Query semantics
3. Representation Systems
Representation systems
Tuple-independent probabilistic database
Tuple Prob. -> Possible Worlds
Tuple Prob. -> Query evaluation
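As a rough illustration of the two slide titles above, here is a minimal Python sketch. It assumes the standard reading of a tuple-independent database: each tuple is present independently with its own probability, so a possible world's probability is the product of p for the tuples it contains and (1 - p) for the tuples it omits, and a Boolean query's probability is the sum over the worlds where it holds. The tuple names, the probability values, and the helper names `possible_worlds` and `query_probability` are hypothetical and do not come from the slides.

```python
from itertools import product

# Hypothetical tuple-independent relation: tuple name -> marginal probability.
# The values are illustrative only.
tuples = {"t1": 0.5, "t2": 0.8, "t3": 0.1}

def possible_worlds(tuple_probs):
    """Yield (world, probability) for every subset of the tuples."""
    names = list(tuple_probs)
    for present in product([True, False], repeat=len(names)):
        world = frozenset(n for n, keep in zip(names, present) if keep)
        prob = 1.0
        for n, keep in zip(names, present):
            # Independence: multiply p if the tuple is in, (1 - p) if it is out.
            prob *= tuple_probs[n] if keep else 1.0 - tuple_probs[n]
        yield world, prob

def query_probability(tuple_probs, query):
    """Sum the probabilities of the worlds in which the Boolean query holds."""
    return sum(p for w, p in possible_worlds(tuple_probs) if query(w))

worlds = list(possible_worlds(tuples))
total = sum(p for _, p in worlds)

# Boolean query: "is t1 or t2 present?"
p_q = query_probability(tuples, lambda w: "t1" in w or "t2" in w)
```

Enumerating worlds takes time exponential in the number of tuples, so this is only useful for checking small examples by hand.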
Tuple-independent distributions
Intensional database
Intensional DB => Possible Worlds
Possible Worlds => Intensional DB
Closure under operators
Summary of Intensional Databases
4. Complexity
Probability of Boolean expressions
Example
Complexity of Boolean Expression Probability
Query complexity
Intensional query evaluation
Extensional query evaluation
Query complexity
Summary on query complexity