Download presentation
Presentation is loading. Please wait.
Published byDarrell Carr Modified over 9 years ago
1
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1 Outline Introduction Background Distributed Database Design Database Integration Semantic Data Control Distributed Query Processing ➡ Overview ➡ Query decomposition and localization ➡ Distributed query optimization Multidatabase query processing Distributed Transaction Management Data Replication Parallel Database Systems Distributed Object DBMS Peer-to-Peer Data Management Web Data Management Current Issues
2
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/2 Step 1 – Query Decomposition Input : Calculus query on global relations Normalization ➡ manipulate query quantifiers and qualification Analysis ➡ detect and reject “incorrect” queries ➡ possible for only a subset of relational calculus Simplification ➡ eliminate redundant predicates Restructuring ➡ calculus query algebraic query ➡ more than one translation is possible ➡ use transformation rules
3
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/3 Normalization Lexical and syntactic analysis ➡ check validity (similar to compilers) ➡ check for attributes and relations ➡ type checking on the qualification Put into normal form ➡ Conjunctive normal form ( p 11 p 12 … p 1 n ) … ( p m 1 p m 2 … p mn ) ➡ Disjunctive normal form ( p 11 p 12 … p 1 n ) … ( p m 1 p m 2 … p mn ) ➡ OR's mapped into union ➡ AND's mapped into join or selection
4
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/4 Analysis Refute incorrect queries Type incorrect ➡ If any of its attribute or relation names are not defined in the global schema ➡ If operations are applied to attributes of the wrong type Semantically incorrect ➡ Components do not contribute in any way to the generation of the result ➡ Only a subset of relational calculus queries can be tested for correctness ➡ Those that do not contain disjunction and negation ➡ To detect ✦ connection graph (query graph) ✦ join graph
5
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/5 Analysis – Example SELECTENAME,RESP FROMEMP, ASG, PROJ WHEREEMP.ENO = ASG.ENO AND ASG.PNO = PROJ.PNO ANDPNAME = "CAD/CAM" ANDDUR ≥ 36 ANDTITLE = "Programmer" Query graph Join graph DUR≥36 PNAME=“CAD/CAM” ENAME EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO RESULT TITLE = “Programmer” RESP ASG.PNO=PROJ.PNO EMP.ENO=ASG.ENO ASGPROJEMP PROJASG
6
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/6 Analysis If the query graph is not connected, the query may be wrong or use Cartesian product SELECTENAME,RESP FROMEMP, ASG, PROJ WHEREEMP.ENO = ASG.ENO ANDPNAME = "CAD/CAM" ANDDUR > 36 ANDTITLE = "Programmer" PNAME=“CAD/CAM” ENAME RESULT RESP ASGPROJEMP
7
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/7 Simplification Why simplify? ➡ Remember the example How? Use transformation rules ➡ Elimination of redundancy ✦ idempotency rules p 1 ¬( p 1 ) false p 1 ( p 1 p 2 ) p 1 p 1 false p 1 … ➡ Application of transitivity ➡ Use of integrity rules
8
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/8 Simplification – Example SELECTTITLE FROMEMP WHEREEMP.ENAME = "J. Doe" OR(NOT(EMP.TITLE = "Programmer") AND(EMP.TITLE = "Programmer" OREMP.TITLE = "Elect. Eng.") ANDNOT(EMP.TITLE = "Elect. Eng.")) SELECTTITLE FROMEMP WHEREEMP.ENAME = "J. Doe"
9
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/9 Restructuring Convert relational calculus to relational algebra Make use of query trees Example Find the names of employees other than J. Doe who worked on the CAD/CAM project for either 1 or 2 years. SELECTENAME FROMEMP, ASG, PROJ WHEREEMP.ENO = ASG.ENO ANDASG.PNO = PROJ.PNO ANDENAME≠ "J. Doe" ANDPNAME = "CAD/CAM" AND(DUR = 12 OR DUR = 24) ENAME σ DUR=12 OR DUR=24 σ PNAME=“CAD/CAM” σ ENAME≠“J. DOE” PROJASGEMP Project Select Join ⋈ PNO ⋈ ENO
10
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/10 Restructuring –Transformation Rules Commutativity of binary operations ➡ R × S S × R ➡ R ⋈ S S ⋈ R ➡ R S S R Associativity of binary operations ➡ ( R × S ) × T R × ( S × T ) ➡ ( R ⋈ S ) ⋈ T R ⋈ ( S ⋈ T ) Idempotence of unary operations ➡ A ’ ( A ’ ( R )) A ’ ( R ) ➡ p 1 ( A 1 ) ( p 2 ( A 2 ) ( R )) p 1 ( A 1 ) p 2 ( A 2 ) ( R ) where R [ A ] and A' A, A" A and A' A" Commuting selection with projection
11
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/11 Restructuring – Transformation Rules Commuting selection with binary operations ➡ p ( A ) ( R × S ) ( p ( A ) ( R )) × S ➡ p ( A i ) ( R ⋈ ( A j,B k ) S ) ( p ( A i ) ( R )) ⋈ ( A j,B k ) S ➡ p ( A i ) ( R T ) p ( A i ) ( R ) p ( A i ) ( T ) where A i belongs to R and T Commuting projection with binary operations ➡ C ( R × S ) A ’ ( R ) × B ’ ( S ) ➡ C ( R ⋈ ( A j,B k ) S ) A ’ ( R ) ⋈ ( A j,B k ) B ’ ( S ) ➡ C ( R S ) C ( R ) C ( S ) where R [ A ] and S [ B ]; C = A ' B ' where A' A, B' B
12
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/12 Example Recall the previous example: Find the names of employees other than J. Doe who worked on the CAD/CAM project for either one or two years. SELECT ENAME FROMPROJ, ASG, EMP WHEREASG.ENO=EMP.ENO ANDASG.PNO=PROJ.PNO ANDENAME ≠ "J. Doe" ANDPROJ.PNAME="CAD/CAM" AND(DUR=12 OR DUR=24) ENAME DUR=12 DUR=24 PNAME=“CAD/CAM” ENAME≠“J. DOE” PROJASGEMP Project Select Join ⋈ PNO ⋈ ENO
13
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/13 Equivalent Query ENAME PNAME=“CAD/CAM” (DUR=12 DUR=24) ENAME≠“J. Doe” × PROJ ASG EMP ⋈ PNO,ENO
14
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/14 EMP ENAME ENAME ≠ "J. Doe" ASGPROJ PNO,ENAME PNAME = "CAD/CAM" PNO DUR =12 DUR=24 PNO,ENO PNO,ENAME Restructuring ⋈ PNO ⋈ ENO
15
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/15 Step 2 – Data Localization Input: Algebraic query on distributed relations Determine which fragments are involved Localization program ➡ substitute for each global query its materialization program ➡ optimize
16
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/16 Example Assume ➡ EMP is fragmented into EMP 1, EMP 2, EMP 3 as follows: ✦ EMP 1 = ENO≤“E3” (EMP) ✦ EMP 2 = “E3”<ENO≤“E6” (EMP) ✦ EMP 3 = ENO≥“E6 ” (EMP) ➡ ASG fragmented into ASG 1 and ASG 2 as follows: ✦ ASG 1 = ENO≤“E3” (ASG) ✦ ASG 2 = ENO>“E3” (ASG) Replace EMP by (EMP 1 EMP 2 EMP 3 ) and ASG by (ASG 1 ASG 2 ) in any query ENAME DUR=12 DUR=24 PNAME=“CAD/CAM” ENAME≠“J. DOE” PROJ EMP 1 EMP 2 EMP 3 ASG 1 ASG 2 ⋈ PNO ⋈ ENO
17
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/17 Provides Parallellism EMP 3 ASG 1 EMP 2 ASG 2 EMP 1 ASG 1 EMP 3 ASG 2 ⋈ ENO
18
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/18 Eliminates Unnecessary Work EMP 2 ASG 2 EMP 1 ASG 1 EMP 3 ASG 2 ⋈ ENO
19
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/19 Reduction for PHF Reduction with selection ➡ Relation R and F R ={ R 1, R 2, …, R w } where R j = p j ( R ) p i ( R j )= if x in R : ¬( p i ( x ) p j ( x )) ➡ Example SELECT* FROMEMP WHEREENO="E5" ENO=“E5” EMP 1 EMP 2 EMP 3 EMP 2 ENO=“E5”
20
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/20 Reduction for PHF Reduction with join ➡ Possible if fragmentation is done on join attribute ➡ Distribute join over union ( R 1 R 2 ) ⋈ S ( R 1 ⋈ S ) ( R 2 ⋈ S ) ➡ Given R i = p i ( R ) and R j = p j ( R ) R i ⋈ R j = if x in R i, y in R j : ¬( p i ( x ) p j ( y ))
21
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/21 Reduction for PHF Assume EMP is fragmented as before and ➡ ASG 1 : ENO ≤ "E3" (ASG) ➡ ASG 2 : ENO > "E3" (ASG) Consider the query SELECT* FROMEMP,ASG WHEREEMP.ENO=ASG.ENO Distribute join over unions Apply the reduction rule EMP 1 EMP 2 EMP 3 ASG 1 ASG 2 ⋈ ENO EMP 1 ASG 1 EMP 2 ASG 2 EMP 3 ASG 2 ⋈ ENO
22
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/22 Reduction for VF Find useless (not empty) intermediate relations Relation R defined over attributes A = { A 1,..., A n } vertically fragmented as R i = A ' ( R ) where A ' A : D,K ( R i ) is useless if the set of projection attributes D is not in A ' Example: EMP 1 = ENO,ENAME (EMP); EMP 2 = ENO,TITLE (EMP) SELECTENAME FROMEMP EMP 1 EMP 2 ENAME ⋈ ENO ENAME
23
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/23 Reduction for DHF Rule : ➡ Distribute joins over unions ➡ Apply the join reduction for horizontal fragmentation Example ASG 1 : ASG ⋉ ENO EMP 1 ASG 2 : ASG ⋉ ENO EMP 2 EMP 1 : TITLE=“Programmer” (EMP) EMP 2 : TITLE=“Programmer” (EMP) Query SELECT * FROMEMP, ASG WHEREASG.ENO = EMP.ENO ANDEMP.TITLE = "Mech. Eng."
24
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/24 Generic query Selections first Reduction for DHF ASG 1 TITLE=“Mech. Eng.” ASG 2 EMP 1 EMP 2 ASG 1 ASG 2 EMP 2 TITLE=“Mech. Eng.” ⋈ ENO
25
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/25 Joins over unions Reduction for DHF Elimination of the empty intermediate relations (left sub-tree) ASG 1 EMP 2 TITLE=“Mech. Eng.” ASG 2 TITLE=“Mech. Eng.” ASG 2 EMP 2 TITLE=“Mech. Eng.” ⋈ ENO
26
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/26 Reduction for Hybrid Fragmentation Combine the rules already specified: ➡ Remove empty relations generated by contradicting selections on horizontal fragments; ➡ Remove useless relations generated by projections on vertical fragments; ➡ Distribute joins over unions in order to isolate and remove useless joins.
27
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/27 Reduction for HF Example Consider the following hybrid fragmentation: EMP 1 = ENO≤"E4" ( ENO,ENAME (EMP)) EMP 2 = ENO>"E4" ( ENO,ENAME (EMP)) EMP 3 = ENO,TITLE (EMP) and the query SELECTENAME FROMEMP WHEREENO="E5" EMP 1 EMP 2 EMP 3 ENO=“E5” ENAME EMP 2 ENO=“E5” ENAME ⋈ ENO
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.