Presentation is loading. Please wait.

Presentation is loading. Please wait.

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1 Outline Introduction Background Distributed Database Design Database Integration Semantic Data Control.

Similar presentations


Presentation on theme: "Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1 Outline Introduction Background Distributed Database Design Database Integration Semantic Data Control."— Presentation transcript:

1 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1 Outline Introduction Background Distributed Database Design Database Integration Semantic Data Control Distributed Query Processing ➡ Overview ➡ Query decomposition and localization ➡ Distributed query optimization Multidatabase query processing Distributed Transaction Management Data Replication Parallel Database Systems Distributed Object DBMS Peer-to-Peer Data Management Web Data Management Current Issues

2 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/2 Step 1 – Query Decomposition Input : Calculus query on global relations Normalization ➡ manipulate query quantifiers and qualification Analysis ➡ detect and reject “incorrect” queries ➡ possible for only a subset of relational calculus Simplification ➡ eliminate redundant predicates Restructuring ➡ calculus query   algebraic query ➡ more than one translation is possible ➡ use transformation rules

3 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/3 Normalization Lexical and syntactic analysis ➡ check validity (similar to compilers) ➡ check for attributes and relations ➡ type checking on the qualification Put into normal form ➡ Conjunctive normal form ( p 11  p 12  …  p 1 n )  …  ( p m 1  p m 2  …  p mn ) ➡ Disjunctive normal form ( p 11  p 12  …  p 1 n )  …  ( p m 1  p m 2  …  p mn ) ➡ OR's mapped into union ➡ AND's mapped into join or selection

4 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/4 Analysis Refute incorrect queries Type incorrect ➡ If any of its attribute or relation names are not defined in the global schema ➡ If operations are applied to attributes of the wrong type Semantically incorrect ➡ Components do not contribute in any way to the generation of the result ➡ Only a subset of relational calculus queries can be tested for correctness ➡ Those that do not contain disjunction and negation ➡ To detect ✦ connection graph (query graph) ✦ join graph

5 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/5 Analysis – Example SELECTENAME,RESP FROMEMP, ASG, PROJ WHEREEMP.ENO = ASG.ENO AND ASG.PNO = PROJ.PNO ANDPNAME = "CAD/CAM" ANDDUR ≥ 36 ANDTITLE = "Programmer" Query graph Join graph DUR≥36 PNAME=“CAD/CAM” ENAME EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO RESULT TITLE = “Programmer” RESP ASG.PNO=PROJ.PNO EMP.ENO=ASG.ENO ASGPROJEMP PROJASG

6 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/6 Analysis If the query graph is not connected, the query may be wrong or use Cartesian product SELECTENAME,RESP FROMEMP, ASG, PROJ WHEREEMP.ENO = ASG.ENO ANDPNAME = "CAD/CAM" ANDDUR > 36 ANDTITLE = "Programmer" PNAME=“CAD/CAM” ENAME RESULT RESP ASGPROJEMP

7 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/7 Simplification Why simplify? ➡ Remember the example How? Use transformation rules ➡ Elimination of redundancy ✦ idempotency rules p 1  ¬( p 1 )  false p 1  ( p 1  p 2 )  p 1 p 1  false  p 1 … ➡ Application of transitivity ➡ Use of integrity rules

8 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/8 Simplification – Example SELECTTITLE FROMEMP WHEREEMP.ENAME = "J. Doe" OR(NOT(EMP.TITLE = "Programmer") AND(EMP.TITLE = "Programmer" OREMP.TITLE = "Elect. Eng.") ANDNOT(EMP.TITLE = "Elect. Eng."))  SELECTTITLE FROMEMP WHEREEMP.ENAME = "J. Doe"

9 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/9 Restructuring Convert relational calculus to relational algebra Make use of query trees Example Find the names of employees other than J. Doe who worked on the CAD/CAM project for either 1 or 2 years. SELECTENAME FROMEMP, ASG, PROJ WHEREEMP.ENO = ASG.ENO ANDASG.PNO = PROJ.PNO ANDENAME≠ "J. Doe" ANDPNAME = "CAD/CAM" AND(DUR = 12 OR DUR = 24)  ENAME σ DUR=12 OR DUR=24 σ PNAME=“CAD/CAM” σ ENAME≠“J. DOE” PROJASGEMP Project Select Join ⋈ PNO ⋈ ENO

10 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/10 Restructuring –Transformation Rules Commutativity of binary operations ➡ R ×  S  S  × R ➡ R ⋈ S  S ⋈ R ➡ R  S  S  R Associativity of binary operations ➡ ( R  × S )  × T  R  × ( S  × T ) ➡ ( R ⋈ S ) ⋈ T  R ⋈ ( S ⋈ T ) Idempotence of unary operations ➡  A ’ (  A ’ ( R ))  A ’ ( R ) ➡  p 1 ( A 1 ) (  p 2 ( A 2 ) ( R ))  p 1 ( A 1 )  p 2 ( A 2 ) ( R ) where R [ A ] and A'  A, A"  A and A'  A" Commuting selection with projection

11 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/11 Restructuring – Transformation Rules Commuting selection with binary operations ➡  p ( A ) ( R  × S )  (  p ( A ) ( R ))  × S ➡  p ( A i ) ( R ⋈ ( A j,B k ) S )  (  p ( A i ) ( R )) ⋈ ( A j,B k ) S ➡  p ( A i ) ( R  T )  p ( A i ) ( R )  p ( A i ) ( T ) where A i belongs to R and T Commuting projection with binary operations ➡  C ( R  × S )  A ’ ( R ) ×  B ’ ( S ) ➡  C ( R ⋈ ( A j,B k ) S )  A ’ ( R ) ⋈ ( A j,B k )  B ’ ( S ) ➡  C ( R  S )  C ( R )  C ( S ) where R [ A ] and S [ B ]; C = A '  B ' where A'  A, B'  B

12 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/12 Example Recall the previous example: Find the names of employees other than J. Doe who worked on the CAD/CAM project for either one or two years. SELECT ENAME FROMPROJ, ASG, EMP WHEREASG.ENO=EMP.ENO ANDASG.PNO=PROJ.PNO ANDENAME ≠ "J. Doe" ANDPROJ.PNAME="CAD/CAM" AND(DUR=12 OR DUR=24)  ENAME  DUR=12  DUR=24  PNAME=“CAD/CAM”  ENAME≠“J. DOE” PROJASGEMP Project Select Join ⋈ PNO ⋈ ENO

13 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/13 Equivalent Query  ENAME  PNAME=“CAD/CAM”  (DUR=12  DUR=24)  ENAME≠“J. Doe” × PROJ ASG EMP ⋈ PNO,ENO

14 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/14 EMP  ENAME  ENAME ≠ "J. Doe" ASGPROJ  PNO,ENAME  PNAME = "CAD/CAM"  PNO  DUR =12  DUR=24  PNO,ENO  PNO,ENAME Restructuring ⋈ PNO ⋈ ENO

15 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/15 Step 2 – Data Localization Input: Algebraic query on distributed relations Determine which fragments are involved Localization program ➡ substitute for each global query its materialization program ➡ optimize

16 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/16 Example Assume ➡ EMP is fragmented into EMP 1, EMP 2, EMP 3 as follows: ✦ EMP 1 =  ENO≤“E3” (EMP) ✦ EMP 2 =  “E3”<ENO≤“E6” (EMP) ✦ EMP 3 =  ENO≥“E6 ” (EMP) ➡ ASG fragmented into ASG 1 and ASG 2 as follows: ✦ ASG 1 =  ENO≤“E3” (ASG) ✦ ASG 2 =  ENO>“E3” (ASG) Replace EMP by (EMP 1  EMP 2  EMP 3 ) and ASG by (ASG 1  ASG 2 ) in any query  ENAME  DUR=12  DUR=24  PNAME=“CAD/CAM”  ENAME≠“J. DOE” PROJ   EMP 1 EMP 2 EMP 3 ASG 1 ASG 2 ⋈ PNO ⋈ ENO

17 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/17 Provides Parallellism EMP 3 ASG 1 EMP 2 ASG 2 EMP 1 ASG 1  EMP 3 ASG 2 ⋈ ENO

18 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/18 Eliminates Unnecessary Work EMP 2 ASG 2 EMP 1 ASG 1 EMP 3 ASG 2  ⋈ ENO

19 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/19 Reduction for PHF Reduction with selection ➡ Relation R and F R ={ R 1, R 2, …, R w } where R j =  p j ( R )  p i ( R j )=   if  x in R : ¬( p i ( x )  p j ( x )) ➡ Example SELECT* FROMEMP WHEREENO="E5"  ENO=“E5” EMP 1 EMP 2 EMP 3 EMP 2  ENO=“E5” 

20 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/20 Reduction for PHF Reduction with join ➡ Possible if fragmentation is done on join attribute ➡ Distribute join over union ( R 1  R 2 ) ⋈ S  ( R 1 ⋈ S )  ( R 2 ⋈ S ) ➡ Given R i =  p i ( R ) and R j =  p j ( R ) R i ⋈ R j =  if  x in R i,  y in R j : ¬( p i ( x )  p j ( y ))

21 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/21 Reduction for PHF Assume EMP is fragmented as before and ➡ ASG 1 :  ENO ≤ "E3" (ASG) ➡ ASG 2 :  ENO > "E3" (ASG) Consider the query SELECT* FROMEMP,ASG WHEREEMP.ENO=ASG.ENO Distribute join over unions Apply the reduction rule  EMP 1 EMP 2 EMP 3 ASG 1 ASG 2 ⋈ ENO  EMP 1 ASG 1 EMP 2 ASG 2 EMP 3 ASG 2 ⋈ ENO

22 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/22 Reduction for VF Find useless (not empty) intermediate relations Relation R defined over attributes A = { A 1,..., A n } vertically fragmented as R i =  A ' ( R ) where A '  A :  D,K ( R i ) is useless if the set of projection attributes D is not in A ' Example: EMP 1 =  ENO,ENAME (EMP); EMP 2 =  ENO,TITLE (EMP) SELECTENAME FROMEMP EMP 1 EMP 2  ENAME ⋈ ENO  ENAME

23 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/23 Reduction for DHF Rule : ➡ Distribute joins over unions ➡ Apply the join reduction for horizontal fragmentation Example ASG 1 : ASG ⋉ ENO EMP 1 ASG 2 : ASG ⋉ ENO EMP 2 EMP 1 :  TITLE=“Programmer” (EMP) EMP 2 :  TITLE=“Programmer” (EMP) Query SELECT * FROMEMP, ASG WHEREASG.ENO = EMP.ENO ANDEMP.TITLE = "Mech. Eng."

24 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/24 Generic query Selections first Reduction for DHF   ASG 1  TITLE=“Mech. Eng.” ASG 2 EMP 1 EMP 2  ASG 1 ASG 2 EMP 2  TITLE=“Mech. Eng.” ⋈ ENO

25 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/25 Joins over unions Reduction for DHF Elimination of the empty intermediate relations (left sub-tree)  ASG 1 EMP 2  TITLE=“Mech. Eng.” ASG 2  TITLE=“Mech. Eng.” ASG 2 EMP 2  TITLE=“Mech. Eng.” ⋈ ENO

26 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/26 Reduction for Hybrid Fragmentation Combine the rules already specified: ➡ Remove empty relations generated by contradicting selections on horizontal fragments; ➡ Remove useless relations generated by projections on vertical fragments; ➡ Distribute joins over unions in order to isolate and remove useless joins.

27 Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/27 Reduction for HF Example Consider the following hybrid fragmentation: EMP 1 =  ENO≤"E4" (  ENO,ENAME (EMP)) EMP 2 =  ENO>"E4" (  ENO,ENAME (EMP)) EMP 3 =   ENO,TITLE (EMP) and the query SELECTENAME FROMEMP WHEREENO="E5" EMP 1 EMP 2  EMP 3  ENO=“E5”  ENAME EMP 2  ENO=“E5”  ENAME  ⋈ ENO


Download ppt "Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1 Outline Introduction Background Distributed Database Design Database Integration Semantic Data Control."

Similar presentations


Ads by Google