Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 ICS 214B: Transaction Processing and Distributed Data Management Lecture 9: Fragmentation and Distributed Query Processing Professor Chen Li.

Similar presentations


Presentation on theme: "1 ICS 214B: Transaction Processing and Distributed Data Management Lecture 9: Fragmentation and Distributed Query Processing Professor Chen Li."— Presentation transcript:

1 1 ICS 214B: Transaction Processing and Distributed Data Management Lecture 9: Fragmentation and Distributed Query Processing Professor Chen Li

2 ICS214BNotes 092 Which simple predicates should we use in Pr? Desired property of Pr: - minimality - uniformity

3 ICS214BNotes 093 Return to example: E(#, NM, LOC, SAL,…) Common queries: Qa: select *Qb: select * from Efrom E where LOC=Sawhere LOC=Sb and …and...

4 ICS214BNotes 094 Three choices: (1) Pr = { } F 1 ={ E } (2) Pr = {LOC=Sa, LOC=Sb} F 2 ={  loc=Sa E,  loc=Sb E } (3) Pr = {LOC=Sa, LOC=Sb, Sal<10} F 3 ={  loc=Sa  sal<10 E,  loc=Sa  sal  10 E,  loc=Sb  sal<10 E,  loc=Sb  sal  10 E }

5 ICS214BNotes 095 In other words: Loc=Sa  sal < 10 Loc=Sa  sal  10 Loc=Sb  sal < 10 Loc=Sb  sal  10 F1F1 F3F3 F2F2 Q a : Select … loc = S a... Q b : Select … loc = S b... F 2 is good… (not F 1, F 3 )

6 ICS214BNotes 096 Informal definition Set of predicates Pr is uniform if: for every F i  F [P r ], every t  F i has equal probability of access by every major application. Note: F [P r ] is fragmentation defined by minterm predicates generated by P r.

7 ICS214BNotes 097 Back to example: Loc=Sa  sal < 10 Loc=Sa  sal  10 Loc=Sb  sal < 10 Loc=Sb  sal  10 F1F1 Q a : Select … loc = S a... Q b : Select … loc = S b... tuples here have higher probability of access tuples here have lower probability of access so F 1 is not “good”...

8 ICS214BNotes 098 Back to example: Loc=Sa  sal < 10 Loc=Sa  sal  10 Loc=Sb  sal < 10 Loc=Sb  sal  10 F2F2 Q a : Select … loc = S a... Q b : Select … loc = S b... tuples here have same probability of access so F 2 is “good”... so is F 3...

9 ICS214BNotes 099 Informal definition Set of predicates P r is minimal if no P r’  P r is uniform

10 ICS214BNotes 0910 Back to example: (1) Pr = { } N (2) Pr = {LOC=Sa, LOC=Sb} Y (3) Pr = {LOC=Sa, LOC=Sb, Sal<10} N uniform? Pr(2) is a subset of Pr(3), so Pr(3) is not minimal...

11 ICS214BNotes 0911 Is P r uniform and minimal a good thing? Not necessarily! But it does simplify allocation problem...

12 ICS214BNotes 0912 Derived horizontal fragmentation E(ENO, NAME, SAL, LOC) J(ENO, DESCRIPTION,…) (Owner) (Member) Common query: Given an employee name, list projects (s)he works in E  F ={ E 1, E 2 } by LOC

13 ICS214BNotes 0913 E1E1 (at S a ) (at S b ) E2E2 J

14 ICS214BNotes 0914 E1E1 (at S a ) (at S b ) E2E2 J1J1 J2J2 J 1 = J E 1 J 2 = J E 2

15 ICS214BNotes 0915 Derived horizontal fragmentation R, F = { F 1, F 2,... F n }  S, D = {D 1, D 2, …D n } where D i =S F i Convention: R is owner S is member F could be primary or derived

16 ICS214BNotes 0916 Checking completeness and disjointness of derived fragmentation  But no #= 33 in E 1 nor in E 2 ! Example: Say J is: This J tuple will not be in J 1 nor J 2  Fragmentation not complete

17 ICS214BNotes 0917 To get completeness: Need to enforce referential integrity constraint: join attr(#) of member relation  join attr(#) of owner relation

18 ICS214BNotes 0918 Example: E1E1 E2E2 J1J1 J J2J2 Fragmentation is not disjoint!

19 ICS214BNotes 0919 To get disjointness: Join attribute(#) should be key of owner relation

20 ICS214BNotes 0920 Summary: horizontal fragmentation Type: primary, derived Properties: completeness, disjointness Predicates: minimal, uniform

21 ICS214BNotes 0921 Vertical fragmentation E1E1 E E2E2 Example:

22 ICS214BNotes 0922 R[T]  R 1 [T 1 ] T i  T R n [Tn] Just like normalization of relations...

23 ICS214BNotes 0923 Properties: R[T]  R i [T i ] (1) Completeness U Ti = T all i

24 ICS214BNotes 0924 (2) Disjointness Ti  Tj =  for all i,j i  j E(#,LOC,SAL) E 1 (#,LOC) E 2 (SAL) Not a desirable property!! (could not reconstruct R!)

25 ICS214BNotes 0925 (3) Reconstruction: Lossless join R i = R all i  One way to achieve lossless join: Repeat key in all fragments, i.e., Key  T i for all i

26 ICS214BNotes 0926 Hybrid Fragmentation R R1 R2 R11 R12R21 R22 Horizontal Vertical

27 ICS214BNotes 0927 Hybrid Fragmentation -- Reconstruction R11 R12R21 R22 Horizontal Vertical U

28 ICS214BNotes 0928 Allocation Example: E(#,NM,LOC,SAL)  F 1 =  loc=Sa E ; F 2 =  loc=Sb E Qa: select … where loc=Sa... Qb: select … where loc=Sb… Site a Site b Where do F 1,F 2 go? ?

29 ICS214BNotes 0929 Issues Where do queries originate? What is communication cost? and size of answers, relations,… What is storage capacity, cost at sites? and size of fragments? What is processing power at sites? What is query processing strategy? –How are joins done? –Where are answers collected?

30 ICS214BNotes 0930 Cost of updating copies? Writes and concurrency control?... Do we replicate fragments?

31 ICS214BNotes 0931 Optimization problem: What is best placement of fragments and/or best number of copies to: –minimize query response time –maximize throughput –minimize “some cost” –... Subject to constraints? –Available storage –Available bandwidth, power,… –Keep 90% of response time below X –... Often, can use common sense –Place fragments where they are most heavily accessed This is an incredibly hard problem

32 ICS214BNotes 0932 Summary Horizontal and vertical fragmentation Designing good fragmentations and allocation Next: Query processing in distributed databases

33 ICS214BNotes 0933 Query Query Plan Algebraic query tree on relations (1) Decomposition Algebraic query tree on relation fragments (2) Localization (3) Optimization

34 ICS214BNotes 0934 Decomposition Same as in centralized system –Normalization –Eliminating redundancy –Algebraic rewriting

35 ICS214BNotes 0935 Normalization Convert from query language to relational algebra

36 ICS214BNotes 0936 Example SELECT R.A, S.D FROM R, S WHERE (R.B=1 and S.C=2) and (R.A = S.A)  R.A,S. D  (R.B=1  S.C=2) R S (R.A = S.A)

37 ICS214BNotes 0937 Eliminate redundancy E.g.: in conditions: (S.A=1)  (S.A>5)  False (S.A<10)  (S.A<5)  S.A<5

38 ICS214BNotes 0938 E.g.: Common sub-expressions UU S  cond  cond T S  cond T RRR

39 ICS214BNotes 0939 Algebraic rewriting E.g.: Push conditions down  cond3  cond  cond1  cond2 RS R S

40 ICS214BNotes 0940 Query Algebraic query tree on relations (1) Decomposition Algebraic query tree on relation fragments (2) Localization

41 ICS214BNotes 0941 Localization steps (1) Start with query tree (2) Replace relations by fragments (3) Push  : up ,  : down (4) Simplify – eliminate unnecessary operations

42 ICS214BNotes 0942 Notation for fragment [R: cond] fragment conditions its tuples satisfy

43 ICS214BNotes 0943 Example A (1)  E=3 R

44 ICS214BNotes 0944 (2)  E=3  [R 1 : E < 10] [R 2 : E  10]

45 ICS214BNotes 0945 (3)   E=3 [R 1 : E < 10] [R 2 : E  10] ØØ

46 ICS214BNotes 0946 (4)  E=3 [R 1 : E < 10]

47 ICS214BNotes 0947 Rule 1  C1 [R: c2]   C1 [R: c1  c2] [R: False]  Ø A B

48 ICS214BNotes 0948 In example A:  E=3 [R 2 : E  10]  [  E=3 R 2 : E=3  E  10]  [  E=3 R 2 : False]  Ø

49 ICS214BNotes 0949 Example B (1)A=common attribute RS A

50 ICS214BNotes 0950 (2)  A [S 1 : A<5] [S 2 : A  5] [R 1 : A 10]

51 ICS214BNotes 0951 (3)  [R 1 :A<5][S 1 :A<5] [R 1 :A<5][S 2 :A  5] [R 2 :5  A  10][S 1 :A<5] [R 2 :5  A  10][S 2 :A  5] [R 3 :A>10][S 1 :A 10][S 2 :A  5] AAAAAA

52 ICS214BNotes 0952 (4)  [R 1 :A<5][S 1 :A<5] [R 2 :5  A  10][S 2 :A  5] AAA [R 3 :A>10][S 2 :A  5]

53 ICS214BNotes 0953 Rule 2 [R: C 1 ] [S: C 2 ]  [R S: C 1  C 2  R.A = S.A] AA

54 ICS214BNotes 0954 In step 4 of Example B: [R 1 : A<5] [S 2 : A  5]  [R 1 S 2 : R 1.A < 5  S 2.A  5  R 1.A = S 2.A ]  [R 1 S 2 : False]  Ø AAA

55 ICS214BNotes 0955 Localization with derived fragmentation Example C (2) [R 1 :A<10][R 2  10] S 1 :K=R.K S 2 :K=R.K  R.A<10  R.A  10 K

56 ICS214BNotes 0956 (3)  [R 1 ][S 1 ] [R 1 ][S 2 ] [R 2 ][S 1 ] [R 2 ][S 2 ] KKKK

57 ICS214BNotes 0957 (4)  [R 1 ][S 1 ] [R 2 ][S 2 ] KK

58 ICS214BNotes 0958 In step 3 of Example C: [R 1 :A<10] [S 2 :K=R.K  R.A  10]  [R 1 S 2: R 1.A<10  S 2.K=R.K  R.A  10  R 1.K= S 2.K]  [R 1 S 2 :False ] (K is key of R, R 1 )  Ø KKK

59 ICS214BNotes 0959 Localization with vertical fragmentation Example D (1)  A R 1 (K, A, B) R R 2 (K, C, D)

60 ICS214BNotes 0960 (2)  A R 1 R 2 (K, A, B) (K, C, D) K

61 ICS214BNotes 0961 (3)  A  K,A R 1 R 2 (K, A, B) (K, C, D) K not really needed

62 ICS214BNotes 0962 (4)  A R 1 (K, A, B)

63 ICS214BNotes 0963 Rule 3 Given vertical fragmentation of R: R i =  Ai (R), Ai  A Then for any B  A:  B (R) =  B [ R i | B  A i  Ø ] i

64 ICS214BNotes 0964 Localization with hybrid fragmentation Example E R 1 =  k<5 [  k,A R] R 2 =  k  5 [  k,A R] R 3 =  k,B R

65 ICS214BNotes 0965 Query:  A  k=3 R

66 ICS214BNotes 0966 Reduced Query:  A  k=3 R 1

67 ICS214BNotes 0967 Distributed Query Processing Decomposition  Localization  Optimization  next


Download ppt "1 ICS 214B: Transaction Processing and Distributed Data Management Lecture 9: Fragmentation and Distributed Query Processing Professor Chen Li."

Similar presentations


Ads by Google