Presentation is loading. Please wait.

Presentation is loading. Please wait.

Dealing with Uniqueness Constraint in Query Optimization

Similar presentations


Presentation on theme: "Dealing with Uniqueness Constraint in Query Optimization"— Presentation transcript:

1 Dealing with Uniqueness Constraint in Query Optimization
Class Presentation Shouzheng Yang

2 Introduction Where does uniqueness constraint exist: DISTINCT
Set operations such as What are the solutions? Sort Under what condition is DISTINCT clause unnecessary? And how do we identify such situations. What can we make use of to determine whether duplicate elimination is unnecessary? 2

3 Exploiting Uniqueness in Query Optimization
1. Class of SQL queries considered With only algebraic operators selection, projection, and extended Cartesian product. And we assume WHERE clause only consists of two tables. WHERE clause can be

4 Exploiting Uniqueness in Query Optimization
2. Constraint (Check, Unique Specification) Create Table SUPPLIER ( SNO …, SNAME…, SCITY…, BUDGET…, STATUS…, PRIMARY KEY(SNO), CHECK (SNO BETWEEN 1 AND 499), CHECK (SCITY IN (‘Chicago’, ‘New York’, ‘Toronto’)), CHECK(BUDGET<>0 OR STATUS=‘Inactive”))

5 Some notations and expressions:
the extended Cartesian product select all rows of R that satisfy condition C projection. d is either Dist or All A  b (corresponding attributes in A and b must either agree in value, or both be NULL.) 5

6 Exploiting Uniqueness in Query Optimization
Table for following examples: SUPPLIER(SNO, SNAME, SCITY,BUDGET,STATUS) PARTS(SNO, PNO, PNAME, OEM-PNO, COLOR) AGENTS(SNO, ANO, ANAME, ACITY). Tuples in PARTS reference the SUPPLIER who supply them. Tuples in AGENTS reference the SUPPLIER they represent. OEM-PNO is a candidate key

7 Exploiting Uniqueness in Query Optimization
Example: Select all S.SNO, SNAME, P.PNO, PNAME for a given supplier-no who supply that part. SELECT ALL S.SNO, SNAME, P.PNO, PNAME FROM SUPPLIER S, PARTS P WHERE P.SNO=:SUPPLIER-NO AND S.SNO=P.SNO By observation: We claim that PNO is a key of the derived table. Guess the theorem!?

8 Exploiting Uniqueness in Query Optimization
Intuitively: SELECT A FROM R, S WHERE The uniqueness condition will be met if: Either all the columns of Key (R x S) are in the project list, or A subset of key columns is present in the projection list, and the values of the other key columns are equated to constants or can be inferred through the selection predicate or table constraints. 8

9 Uniqueness Condition Formally, theorem (Uniqueness Condition):
SELECT A FROM R, S WHERE Two expressions Q and V are equivalent if and only if : 9

10 Uniqueness Condition 10

11 Proof: Sufficiency If theorem 1 holds but Q≠V
There exist (at least) two tuples r0, r0’ ∈ (I(R) ×I(S)) such that r0[A]=r0’[A] (Two tuples are same after projection in Q) Back to base tables, r0 and r0’ are derived from the tuples r0 [α(R)], r0’[α(R)], r0 [α(S)], r0’[α(S)], either r0 [α(R)] ≠ r0’[α(R)] or r0 [α(S)] ≠ r0’[α(S)]. However, the violates theorem where, r0 [Key (R×S)]=r0’[Key (R×S)]

12 Proof: Necessity Q contains no duplicates but theorem 1 does not hold.
There exist two tuples r0, r0’ ∈Domain (R × S) such that following is false but table constraints, key dependencies and query predicates are true. There must exist at least one column β∈ Key (R) o Key (S) where r0[β] ≠ r0’ [β] . Suppose β∈ Key (R). Because r0[A]=r0’[A], V contains a single tuple. Back to base table, r0 [α(R)] ≠ r0’[α(R)] , so the extended Cartesian product with r0 [α(S)] = r0’[α(S)] yields at least two tuples. Then Q ≠ V

13 Example SELECT ALL S.SNO, SNAME, P.PNO, PNAME FROM SUPPLER S, PARTS P
WHERE P.SNO=:SUPPLER-NO AND .SNO=P.SNO

14 Example

15 Example Finally if these can functionally determine this:

16 Example In a simple word: Since :

17 Algorithm

18 Algorithm

19 Example SELECT DISTINCT S.SNO, S.SNAME, P.PNO, P.PNAME
FROM SUPPLIER S, PARTS P WHERE P.SNO=:SUPPLIER-NO AND S.SNO=P.SNO

20 Algorithm Line 8: C  P.SNO=:SUPPLIER-NO ∧ S.SNO=P.SNO ∧ T Line 9-17:
C is unchanged Line 18: C is not simply true; we proceed

21 Algorithm Line 21: E_1  P.SNO=: SUPPLIER-NO ∧ S.SNO=P.SNO Line 24:
V={S.SNO, SNAME, P.PNO. PNAME} Line 26: V={S.SNO, SNAME, P.PNO, PNAME, P.SNO} Line 29: V is unchanged.

22 Corollary πAll [AR] (σ [CR] (R)) contains no duplicate rows. 1 2 3
Q = V1 Q = V2 iff Q = V1 AND πAll [AR] (σ [CR] (R)) contains no duplicate rows.

23 Example for Corollary SELECT ALL S.SNO, S.SNAME FROM SUPPLIER S
WHERE S.SNAME=:SUPPLIER-NAME AND EXISTS ( SELECT * FROM PARTS P WHERE S.SNO=P.SNO AND P.PNO=:PART-NO) SELECT ALL S.SNO, SNAME FROM SUPPLIER S, PARTS P WHERE S.SNAME=:SUPPLIER-NAME AND S.SNO=P.SNO AND P.PNO=:PART-NO SELECT DISTINCT S.SNO, SNAME

24 Reasoning about Duplicate Elimination with Description Logic
Tables: PROF < PERSON, (In: DEPT) COURSE < (Num: INT), (Room: INT), (Time: INT), (Inst: PROF), (In: DEPT), COURSE (Num, In Id), COURSE (Room, Time Id), COURSE (Inst.In In) PERSON < (Name: STRING) PERSON (Name Id) STUDENT < PERSON TAKES < (S: STUDENT), (C: COURSE), (MARK: INT), TAKES (S,C Id), TAKES (S, C.Time C) DEPT < (Name: STRING), (Head: PROF), DEPT (Name Id) DEPT (Head  Id)

25 Example for objective Find the unique names of students taking some course taught at the same time as some other course numbered: P2 with an instructor in the department named: P1. Select distinct S.SNAME as NAME From STUDENT as S, TAKES as T, COURSE as C Where S=T.S and C.Time=T.C.Time and C.Inst.In.Name =:P1 and C.Num=:P2 Select S.SNAME as NAME

26 Lemma Normal Form Select V From C1 as A1, … Cm as Am, (Elim Select W
Let Q be an arbitrary object relational conjunctive query. Then there is an equivalent query with the form. Select V From C1 as A1, … Cm as Am, (Elim Select W from Cm+1 as Am+1, … , Cn as An, R) Where R is a list of equalities of the form Ai.Pf1=Aj.Pf2 , basically, to represent the join part. V and W are sets of attribute names.

27 Theorem (Duplicate Elimination Reduction)
Q : select V from C1 as A1,…, Cm as Am, (elim select W from Cm+1 as Am+1 ,…, Cn as An , R) is semantically equivalent to Q’: select V from C1 as A1 ,…, Cm+1 as Am+1 , (elim select W ∪ {Am+1} from Cm+2 as Am+2,…, Cn as An , R) if and only if A ∪ W  Am+1 where A = {A1 ,…, Am}.

28 Proof Necessity: If Q=Q’, but A∪ W  Am+1 does not hold. Sufficiency:
There must exist v, w that Tuple Ai (v)=Tuple Ai (w) for Ai∈ { A ∪ W }, but FAm+1(v)≠ FAm+1(w). Then a single tuple FAi(v) for Q but At least the two tuples (FAi(v), Am+1 ): FAm+1(v) (FAi(w), Am+1 ): FAm+1(w) for Q’ This fact is preserved by the final duplicate-preserving projection and thus we have Q≠Q’ Sufficiency: Similar idea find two tuples differ on Am+1 to yield contradiction that A∪ W  Am+1 is false.

29 Example First, the original query is transformed to this lemma normal
select :P1, :P2, Name from elim ( from STUDENT as S, TAKES as T, COURSE as C, S = T.S, C.Time = T.C.Time, C.Inst.In.Name = :P1, C.Num = :P2, Name = S.Name ) Note that parameters :P1 and :P2 are converted to “result” attributes on the normalized query.

30 Example Now we use our rewriting rule from the Theorem to move
the reference to COURSE out of the scope of the elim operator. The necessary condition is (A ∪ W  Am+1). That is (Name, :P1, :P2  C ). Select :P1, :P2, Name From COURSE as C, elim ( select C, :P1, :P2, Name from STUDENT as S, TAKES as T S = T.S, C.Time = T.C.Time, C.Inst.In.Name = :P1, C.Num = :P2, Name = S.Name )

31 Example A sequence of two additional applications of the duplicate
elimination rewrite (then check for (Name, :P1, :P2,CT) to move the TAKES as T and (Name, :P1, :P2, C, T  S) for STUDENT as S) results in the following query: select :P1, :P2, Name from COURSE as C, TAKES as T, STUDENT as S, elim ( select C, T, S, :P1, :P2, Name from S = T.S, C.Time = T.C.Time, C.Inst.In.Name = :P1, C.Num = :P2, Name = S.Name )

32 Conclusion Could be reduced to the idea from last paper.
If we manage to move all class references out of the scope of the elim operator (as in Example), we can completely remove the duplicate elimination operation from the query. (Why?) Could be reduced to the idea from last paper. Note they are class references which will be implemented as tables.

33 Example Objective Achieved
Find the unique names of students taking some course taught at the same time as some other course numbered: P2 with an instructor in the department named: P1. Select distinct S.SNAME as NAME From STUDENT as S, TAKES as T, COURSE as C Where S=T.S and C.Time=T.C.Time and C.Inst.In.Name =:P1 and C.Num=:P2 Select S.SNAME as NAME

34 Laws Involving Duplicate Elimination
We can also move the δ to either or both of the arguments of an intersection: But not across the operators 34

35 Laws Involving Duplicate Elimination
δ(R) = R if R has no duplicates. Important cases of such a relation R include: A stored relation with a declared primary key. b) A relation that is the result of a grouping operation, since grouping creates a relation with no duplicates. 35

36 Thank you!


Download ppt "Dealing with Uniqueness Constraint in Query Optimization"

Similar presentations


Ads by Google