Published by Katherine Reynolds. Modified over 9 years ago.
1
ULDBs: Databases with Uncertainty and Lineage
O. Benjelloun, A. Das Sarma, A. Halevy, J. Widom
2
Running Example: Crime-Solving
Saw(witness, car) // may be uncertain
Drives(person, car) // may be uncertain
Suspects(person) = π_person(Saw ⋈ Drives)
3
Model for Uncertainty
4
1. X-Tuples
– more expressive than or-attributes
2. ‘?’ (Maybe) Annotations
5
Our Model for Uncertainty
1. X-Tuples: uncertainty about value
2. ‘?’ (Maybe) Annotations

Saw(witness, car)
(Amy, Honda) ∥ (Amy, Toyota) ∥ (Amy, Mazda)

Equivalent representation with an or-attribute:
witness  car
Amy      { Honda, Toyota, Mazda }

Three possible instances
6
Our Model for Uncertainty
1. X-Tuples: uncertainty about value
2. ‘?’ (Maybe) Annotations

Saw(witness, car)
(Amy, Honda) ∥ (Sally, Toyota) ∥ (Amy, Mazda)

Three possible instances
Not expressible using or-attributes
7
Our Model for Uncertainty
1. X-Tuples
2. ‘?’ (Maybe): uncertainty about presence

Saw(witness, car)
(Amy, Honda) ∥ (Amy, Toyota) ∥ (Amy, Mazda)
(Betty, Acura) ?

Six possible instances
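The count of six can be checked by enumeration. The following is a minimal sketch, assuming a relation is stored as a list of (alternatives, is_maybe) pairs; that layout and the function name are illustrative choices, not part of the slides.

```python
from itertools import product

# Each x-tuple is (alternatives, is_maybe); this layout is an assumption.
saw = [
    ([("Amy", "Honda"), ("Amy", "Toyota"), ("Amy", "Mazda")], False),
    ([("Betty", "Acura")], True),  # '?' marks a maybe x-tuple
]

def possible_instances(relation):
    """Yield every possible instance of a relation that has no lineage."""
    # A maybe x-tuple gets one extra choice, None, meaning "absent".
    choices = [alts + ([None] if maybe else []) for alts, maybe in relation]
    for pick in product(*choices):
        yield [t for t in pick if t is not None]

instances = list(possible_instances(saw))
print(len(instances))  # 3 alternatives x (present or absent) = 6
```

Each x-tuple contributes its choices independently here, which is only valid because no lineage ties the x-tuples together.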
8
Our Model is Not Closed

Saw(witness, car)
(Cathy, Honda) ∥ (Cathy, Mazda)

Drives(person, car)
(Jimmy, Toyota) ∥ (Jimmy, Mazda)
(Billy, Honda) ∥ (Frank, Honda)
(Hank, Honda)

Suspects = π_person(Saw ⋈ Drives)

Suspects
Jimmy ?
Billy ∥ Frank ?
Hank ?

X-tuples and ‘?’ annotations alone CANNOT correctly capture the possible instances in the result
9
Lineage
10
Lineage to the Rescue
Lineage
– Captures “where data came from”
– In Trio: a function λ from alternatives to other alternatives (or external sources)
Model, with lineage, is complete
– proof omitted
11
Example with Lineage

ID  Saw(witness, car)
11  (Cathy, Honda) ∥ (Cathy, Mazda)

ID  Drives(person, car)
21  (Jimmy, Toyota) ∥ (Jimmy, Mazda)
22  (Billy, Honda) ∥ (Frank, Honda)
23  (Hank, Honda)

Suspects = π_person(Saw ⋈ Drives)

ID  Suspects
31  Jimmy ?
32  Billy ∥ Frank ?
33  Hank ?

λ(31) = (11,2), (21,2)
λ(32,1) = (11,1), (22,1); λ(32,2) = (11,1), (22,2)
λ(33) = (11,1), 23

Correctly captures possible instances in the result
12
Example: What is the result of joining these tables?

ID  Saw(Witness, Car)
21  (Amy, Mazda) ∥ (Amy, Toyota)
23  (Betty, Honda) ?

ID  Drives(Person, Car)
31  (Jimmy, Mazda)
32  (Jimmy, Toyota)
33  (Billy, Mazda)
34  (Billy, Honda)
13
What is a legal instance of a ULDB?
Each tuple t in a ULDB is associated by λ with a set of pairs (i,j) such that the j-th alternative of the i-th tuple was used to derive t

ID  Suspects
31  Jimmy ?
32  Billy ∥ Frank ?
33  Hank ?

λ(31) = (11,2), (21,2)
λ(32,1) = (11,1), (22,1); λ(32,2) = (11,1), (22,2)
λ(33) = (11,1), 23
14
What is a legal instance of a ULDB?
Let S be the set of all symbols (i.e., pairs (i,j)) in the database
An instance of D is derived by picking a set S’ ⊆ S such that
– if (i,j) ∈ S’ then for every j’ ≠ j, (i,j’) ∉ S’
– ∀ (i,j) ∈ S’, λ(i,j) ⊆ S’
– if, for some x-tuple t_i, there does not exist an (i,j) ∈ S’, then t_i is a maybe-tuple and for all (i,j’) ∈ t_i, either λ(i,j’) = ∅ or λ(i,j’) ⊈ S’
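The three conditions can be checked by brute force on a small database. The sketch below assumes x-tuples are stored as id -> (number of alternatives, has '?') and lineage as a dictionary from symbols to sets of symbols; these structures and the function name are illustrative. The data is the Saw/Drives/Suspects example with lineage, with the shorthand λ(31), λ(33) and 23 written out as (31,1), (33,1) and (23,1).

```python
from itertools import product

# x-tuple id -> (number of alternatives, has '?')
xtuples = {11: (2, False), 21: (2, False), 22: (2, False), 23: (1, False),
           31: (1, True), 32: (2, True), 33: (1, True)}
# Lineage: symbol (i, j) -> set of symbols; a missing key means empty lineage.
lineage = {
    (31, 1): {(11, 2), (21, 2)},
    (32, 1): {(11, 1), (22, 1)},
    (32, 2): {(11, 1), (22, 2)},
    (33, 1): {(11, 1), (23, 1)},
}

def legal_instances(xtuples, lineage):
    ids = sorted(xtuples)
    # Picking at most one alternative per x-tuple enforces condition 1.
    options = [[None] + list(range(1, xtuples[i][0] + 1)) for i in ids]
    for pick in product(*options):
        chosen = {(i, j) for i, j in zip(ids, pick) if j is not None}
        ok = True
        for i, j in zip(ids, pick):
            n, maybe = xtuples[i]
            if j is not None:
                # Condition 2: the lineage of a chosen symbol is also chosen.
                ok = lineage.get((i, j), set()) <= chosen
            else:
                # Condition 3: only a maybe x-tuple may be absent, and each of
                # its alternatives must have empty or unsatisfied lineage.
                ok = maybe and all(
                    not lineage.get((i, k), set())
                    or not lineage[(i, k)] <= chosen
                    for k in range(1, n + 1))
            if not ok:
                break
        if ok:
            yield chosen

instances = list(legal_instances(xtuples, lineage))
print(len(instances))  # 8: one per combination of base alternatives
```

In this example every Suspects x-tuple is fully determined by the choices made in Saw and Drives, which is why the count equals the number of base combinations (2 x 2 x 2 x 1). The enumeration is exponential and is meant only to make the definition concrete.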
15
Example: What are all legal instances of the following ULDB?

ID  Saw(Witness, Car)
21  (Amy, Mazda) ∥ (Amy, Toyota)
23  (Betty, Honda) ?

ID  Drives(Person, Car)
31  (Jimmy, Mazda)
32  (Jimmy, Toyota)
33  (Billy, Mazda)
34  (Billy, Honda)

ID  Accuses(Witness, Person)
41  (Amy, Jimmy)
42  (Amy, Jimmy)
43  (Amy, Billy)
44  (Betty, Billy)

λ(41,1) = {(21,1), (31,1)}
λ(42,1) = {(21,2), (32,1)}
λ(43,1) = {(21,1), (33,1)}
λ(44,1) = {(23,1), (34,1)}
16
Well-Behaved Lineage
In principle, λ may be any function
– λ* is the transitive closure of λ
However, it is useful to restrict λ to be well behaved:
– Acyclic: ∀ (i,j), (i,j) ∉ λ*(i,j)
– Deterministic: ∀ (i,j), (i,j’), if j ≠ j’ then either λ(i,j) ≠ λ(i,j’) or λ(i,j) = ∅
– Uniform: ∀ (i,j), (i,j’), B(i,j) = B(i,j’) where B(i,j) = {k | ∃ l, (k,l) ∈ λ(i,j)}
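The three properties translate directly into set operations. The following is a minimal sketch, assuming lineage is a dictionary from symbols (i, j) to sets of symbols and `alternatives` maps each x-tuple id to its number of alternatives; the function names and both small data sets are hypothetical and are not taken from the slides.

```python
def closure(lineage, s):
    """λ*(s): every symbol reachable from s by following lineage."""
    seen, stack = set(), list(lineage.get(s, ()))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(lineage.get(cur, ()))
    return seen

def check_well_behaved(alternatives, lineage):
    def lam(s):
        return lineage.get(s, set())

    def tuples_of(s):
        # B(i, j): ids of the x-tuples that appear in λ(i, j)
        return {k for k, _ in lam(s)}

    symbols = [(i, j) for i, n in alternatives.items()
               for j in range(1, n + 1)]
    acyclic = all(s not in closure(lineage, s) for s in symbols)
    deterministic = all(
        lam((i, j)) != lam((i, k)) or not lam((i, j))
        for i, n in alternatives.items()
        for j in range(1, n + 1) for k in range(1, n + 1) if j != k)
    uniform = all(
        tuples_of((i, 1)) == tuples_of((i, j))
        for i, n in alternatives.items() for j in range(1, n + 1))
    return {"acyclic": acyclic, "deterministic": deterministic,
            "uniform": uniform}

# Hypothetical data: x-tuples 1 and 2 derive each other, so lineage is cyclic.
cyclic = check_well_behaved({1: 1, 2: 1},
                            {(1, 1): {(2, 1)}, (2, 1): {(1, 1)}})
# Hypothetical data: x-tuple 2 copies x-tuple 1 alternative by alternative.
copied = check_well_behaved({1: 2, 2: 2},
                            {(2, 1): {(1, 1)}, (2, 2): {(1, 2)}})
print(cyclic["acyclic"], all(copied.values()))
```

Returning the three flags separately, rather than a single boolean, makes it easier to see which restriction a given lineage function violates.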
17
Example: Is this ULDB Well-Behaved?

ID  A
11  apple
12  pear

ID  B
21  red
22  green

λ(11,1) = {(21,1)}
λ(21,1) = {(11,1)}
18
Example: Is this ULDB Well-Behaved?

ID  A
11  apple
12  pear

ID  B
21  red ∥ green
22  green

λ(21,1) = {(11,1)}
λ(21,2) = {(11,1)}
19
Example: Is this ULDB Well-Behaved?

ID  A
11  apple ∥ peach
12  pear ∥ grape

ID  B
21  red ∥ pink
22  green ∥ purple

λ(21,1) = {(11,1)}
λ(21,2) = {(11,2)}
λ(22,1) = {(12,1)}
λ(21,2) = {(11,2)}
20
Querying
21
Querying
How do we query a ULDB?
What tuples are in the answer?
How is the lineage of the answer defined?
– for join?
– projection?
– minus?
Only consider projection, multiset selection, join, multiset union
– why?
22
Query Evaluation Algorithm
Given a ULDB D and a query Q
Step 1: Create D’, an ordinary database derived by taking all alternatives of all tuples

ID  Saw(Witness, Car)
21  (Amy, Mazda) ∥ (Amy, Toyota)
23  (Betty, Honda)

becomes

ID     Saw(Witness, Car)
21, 1  (Amy, Mazda)
21, 2  (Amy, Toyota)
23, 1  (Betty, Honda)
23
Query Evaluation Algorithm
Step 2: Evaluate the query normally

ID     Saw(Witness, Car)
21, 1  (Amy, Mazda)
21, 2  (Amy, Toyota)
23, 1  (Betty, Honda)

Result of the join (⋈):

ID  Accuses(Witness, Person)
41  (Amy, Jimmy)
42  (Amy, Jimmy)
43  (Amy, Billy)
44  (Betty, Billy)
24
Query Evaluation Algorithm
Step 3: Group the tuples in the result by the tuple identifiers (the i values) in the lineage produced by the evaluation
Step 4: For each group of tuple identifiers
– create a maybe-tuple t_l with all tuples in the group as alternatives
– set the lineage as derived by the evaluation
Note: all tuples created are maybe-tuples!
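The four steps can be sketched end to end on the Saw and Drives tables. This is a minimal illustration, assuming relations are stored as dictionaries from x-tuple id to a list of alternatives; the function names, the hard-coded join on car, and the choice to number result x-tuples from 41 in sorted group order are all assumptions made for the example.

```python
from collections import defaultdict

# x-tuple id -> list of alternatives (data from the Saw/Drives example)
saw = {21: [("Amy", "Mazda"), ("Amy", "Toyota")], 23: [("Betty", "Honda")]}
drives = {31: [("Jimmy", "Mazda")], 32: [("Jimmy", "Toyota")],
          33: [("Billy", "Mazda")], 34: [("Billy", "Honda")]}

def flatten(relation):
    """Step 1: one ordinary row per alternative, keyed by symbol (i, j)."""
    return [((i, j), alt) for i, alts in relation.items()
            for j, alt in enumerate(alts, start=1)]

def join_accuses(saw_rows, drives_rows):
    """Step 2: ordinary join on car, recording each row's lineage."""
    return [((witness, person), (s_sym, d_sym))
            for s_sym, (witness, car) in saw_rows
            for d_sym, (person, d_car) in drives_rows if car == d_car]

def regroup(rows, first_id):
    """Steps 3-4: group by the x-tuple ids in the lineage; each group
    becomes one maybe x-tuple whose alternatives keep their lineage."""
    groups = defaultdict(list)
    for value, lin in rows:
        groups[tuple(i for i, _ in lin)].append((value, set(lin)))
    result, lineage = {}, {}
    for new_id, key in enumerate(sorted(groups), start=first_id):
        result[new_id] = [value for value, _ in groups[key]]
        for j, (_, lin) in enumerate(groups[key], start=1):
            lineage[(new_id, j)] = lin
    return result, lineage

accuses, lam = regroup(join_accuses(flatten(saw), flatten(drives)), 41)
for i in sorted(accuses):
    print(i, accuses[i], "?")  # every result x-tuple is a maybe-tuple
```

Grouping on the pair of source x-tuple ids is what keeps the two (Amy, Jimmy) rows in separate x-tuples: they come from different Drives tuples (31 and 32), so they are not alternatives of one another.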
25
Examples
Complete the example from the previous slides
Compute the result of the query:
– (R(A,B) ⋈ S(B,C)) ∪ T(D,E)

ID  R(A,B)
11  (1,2) ∥ (1,3)
12  (4,1) ∥ (5,1)

ID  S(B,C)
11  (2,4) ∥ (2,5)
12  (1,3) ∥ (2,3)

ID  T(D,E)
11  (7,8)
12  (9,10) ∥ (9,11)
26
Minimality
27
Minimality
ULDBs may contain superfluous information
Two types of minimality:
– data minimality: a ? may be unneeded, or an entire tuple may be unneeded
– lineage minimality
28
Data Minimality: Example 1

ID  Saw(Witness, Car)
21  (Amy, Mazda) ∥ (Amy, Toyota)
23  (Betty, Honda) ?

ID  Drives(Person, Car)
31  (Jimmy, Mazda)
32  (Jimmy, Toyota)
33  (Billy, Mazda)
34  (Billy, Honda)

ID  Suspects
31  Jimmy ?
32  Billy ∥ Frank ?
33  Hank ?

λ(31) = (11,2), (21,2)
λ(32,1) = (11,1), (22,1); λ(32,2) = (11,1), (22,2)
λ(33) = (11,1), 23

Which ? is not needed?
29
Data Minimality: Example 2
What is unneeded in the result of the following query:
– (Saw ⋈ Car1) ⋈_witness (Saw ⋈ Car2)

ID  Saw(Witness, Car)
1   (Amy, Mazda) ∥ (Amy, Toyota)

ID  Car1(Car)
2   Mazda

ID  Car2(Car)
3   Toyota
30
Data-Minimality: Formally
An alternative (i,j) is extraneous if removing it from the relation does not change the set of possible instances
A ? on a tuple is extraneous if removing it does not change the set of possible instances
31
Checking for Data-Minimality
Theorem: Let D be a well-behaved ULDB. An alternative (k,l) is extraneous if and only if there exist (i,j), (i,j’) ∈ λ*(k,l) with j ≠ j’
– Proof?
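The test in the theorem amounts to looking for two different alternatives of the same x-tuple in the transitive lineage of (k,l). The sketch below assumes the membership is read with the transitive closure λ*, and it encodes the query (Saw ⋈ Car1) ⋈ (Saw ⋈ Car2) over Saw 1, Car1 2 and Car2 3; the identifiers 4, 5 and 6 for the intermediate and final results are hypothetical, as are the function names.

```python
def closure(lineage, s):
    """λ*(s): every symbol reachable from s by following lineage."""
    seen, stack = set(), list(lineage.get(s, ()))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(lineage.get(cur, ()))
    return seen

def is_extraneous(lineage, s):
    """True if λ*(s) holds two different alternatives of one x-tuple."""
    first_alt = {}
    for i, j in closure(lineage, s):
        # setdefault returns the alternative already recorded for x-tuple i.
        if first_alt.setdefault(i, j) != j:
            return True
    return False

lineage = {
    (4, 1): {(1, 1), (2, 1)},  # Saw join Car1: (Amy, Mazda)
    (5, 1): {(1, 2), (3, 1)},  # Saw join Car2: (Amy, Toyota)
    (6, 1): {(4, 1), (5, 1)},  # final join on witness
}
print(is_extraneous(lineage, (6, 1)), is_extraneous(lineage, (4, 1)))
```

The final alternative depends on both (1,1) and (1,2), which can never hold in the same instance, so it is flagged; each intermediate alternative depends on a single alternative of Saw and is not.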
32
Checking for Data-Minimality
Let h(t) be the set of base tuples of t
– tuples with empty lineage that are used to derive an alternative in t
Let m(t) be the number of alternatives of t that are not extraneous
Theorem: Let D be a well-behaved ULDB. A ? on an x-tuple t ∈ D is extraneous if and only if:
– none of the tuples in h(t) have a ?
– m(t) = ∏_{t’ ∈ h(t)} m(t’)
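The two conditions can be evaluated mechanically once h(t) and m(t) are computed from the lineage. The sketch below assumes the second condition is a product over the base tuples in h(t), that extraneous alternatives are those whose transitive lineage contains two alternatives of one x-tuple, and that t is a derived x-tuple. The data reuses x-tuples 11, 22 and 32 from the lineage example; x-tuple 40, a plain copy of 11, is hypothetical, as are the function names.

```python
from math import prod

def closure(lineage, s):
    """λ*(s): every symbol reachable from s by following lineage."""
    seen, stack = set(), list(lineage.get(s, ()))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(lineage.get(cur, ()))
    return seen

def maybe_is_extraneous(t, alternatives, maybe, lineage):
    def m(i):
        """Number of alternatives of x-tuple i that are not extraneous."""
        count = 0
        for j in range(1, alternatives[i] + 1):
            first = {}
            conflict = any(first.setdefault(a, b) != b
                           for a, b in closure(lineage, (i, j)))
            count += not conflict
        return count

    # h(t): x-tuples with empty lineage reachable from some alternative of t
    base = {a for j in range(1, alternatives[t] + 1)
            for a, b in closure(lineage, (t, j)) if not lineage.get((a, b))}
    return (not any(maybe[a] for a in base)
            and m(t) == prod(m(a) for a in base))

alternatives = {11: 2, 22: 2, 32: 2, 40: 2}
maybe = {11: False, 22: False, 32: True, 40: True}
lineage = {
    (32, 1): {(11, 1), (22, 1)}, (32, 2): {(11, 1), (22, 2)},
    (40, 1): {(11, 1)}, (40, 2): {(11, 2)},  # hypothetical copy of 11
}
print(maybe_is_extraneous(32, alternatives, maybe, lineage),
      maybe_is_extraneous(40, alternatives, maybe, lineage))
```

For x-tuple 32 the base tuples admit 2 x 2 = 4 combinations but only 2 of them produce an alternative, so the ? is needed; the copy 40 has one alternative for each alternative of 11, so its ? carries no information.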
33
Test Yourself
Go back to slides 28-29 and prove what is extraneous, using the characterizations from the two theorems
34
Tuple Membership Problems
35
Tuple Membership, Tuple Certainty
Recall that:
– The tuple membership problem is to determine if a tuple is a member in some instance of the ULDB
– The tuple certainty problem is to determine if a tuple is a member in every instance of the ULDB
How would you answer tuple membership? Tuple certainty?
What is the complexity of these problems?